
# Session 16: Podcast Compliance Package

**Session ID**: `phase29-session16-podcast-compliance-package` **Status**: Complete **Estimated Tasks**: \~12-25 **Estimated Duration**: 2-4 hours

***

## Objective

Decide whether podcast transcripts can be collected, cached, and summarized under this project's compliance posture. This session maps to comparison item 2.1 and covers the compliance gate only; it is a compliance/documentation session, not an implementation session.

***

## Scope

### In Scope (MVP)

* Create a podcast/audio source compliance doc (likely `docs/sources/source-compliance-podcasts.md`).
* Define transcript-retention policy, spend label, allowed attribution fields, private cache rules, and browser-payload boundary.
* State explicitly whether raw transcript bodies, speaker names, comments, thumbnails, and media URLs are blocked or allowed.
* Land an explicit approve/defer/reject decision and update source docs and media policy if the boundary is approved.

### Out of Scope

* Any collector, adapter, or transcription implementation (Session 17).
* Approving a boundary by implication; the decision must be explicit.

***

## Implementation Decision

**Implemented**: 2026-06-21 **Completed**: 2026-06-21 **Decision**: Defer **Compliance doc**: `docs/sources/source-compliance-podcasts.md`

Podcast/audio collection, audio/video download, transcription, transcript retention, transcript summarization, podcast theme clustering, browser payloads, source declarations, provider integrations, new env keys, and new dependencies are not approved in Phase 29.

Session 17 is deferred. A future session may reopen the path only with a new source-specific review that names the source/provider and approves terms, media rights, cache, retention, attribution, spend, parser fixtures, and leak tests before implementation.
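The reopening conditions above can be read as a checklist. The following is a minimal sketch of that checklist as a gate function, not part of the project: `SourceReview`, `REQUIRED_APPROVALS`, `blockersForReopening`, and all field names are hypothetical, chosen only to mirror the approval areas listed in the paragraph above.

```typescript
// Hypothetical approval areas, mirroring the list in the deferral text.
const REQUIRED_APPROVALS = [
  "terms",
  "mediaRights",
  "cache",
  "retention",
  "attribution",
  "spend",
  "parserFixtures",
  "leakTests",
] as const;

type ApprovalArea = (typeof REQUIRED_APPROVALS)[number];

// Hypothetical record of a future source-specific review.
interface SourceReview {
  // Named source/provider; an empty string means the review names none.
  provider: string;
  // Only areas explicitly set to true count as approved.
  approved: Partial<Record<ApprovalArea, boolean>>;
}

// Returns every unmet condition; an empty array means nothing blocks reopening.
function blockersForReopening(review: SourceReview): string[] {
  const blockers: string[] = [];
  if (review.provider.trim() === "") {
    blockers.push("provider");
  }
  for (const area of REQUIRED_APPROVALS) {
    if (review.approved[area] !== true) {
      blockers.push(area);
    }
  }
  return blockers;
}
```

Treating a missing or `false` entry as a blocker matches the rule that a boundary must never be approved by implication.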

***

## Prerequisites

* [x] Existing source-compliance and media-policy docs available for reference.

***

## Deliverables

1. Podcast/audio source compliance doc with retention, attribution, cache, and payload-boundary rules.
2. Explicit approve/defer/reject decision recorded.
3. Source-doc and media-policy updates if the boundary is approved.

***

## Success Criteria

* [x] The compliance doc lands with an explicit approve/defer/reject decision.
* [x] Transcript bodies, speaker names, comments, thumbnails, and media URLs are each explicitly classified as blocked or allowed.
* [x] Session 17 is either unblocked or marked deferred/rejected accordingly.

***

## Key Files

* `docs/sources/source-compliance-podcasts.md`
* `docs/extensions/trend-finder-sources.md`
* `docs/media-policy.md`
* `docs/apify.md` or a new source-onboarding doc if a provider is chosen

***

## Comparison Notes (folded from comparison plan)

**Effort:** Medium (compliance gate only). **Boundary risk:** High. This session is the compliance/documentation gate for backlog item 2.1; implementation is Session 17 and must not start unless this session explicitly approves the boundary.

**2.1 Podcast transcription and cross-show theme clustering (gate portion).** TrendingAI uses `yt-dlp`, `ffmpeg`, and Groq Whisper to transcribe hundreds of hours of AI podcasts, then clusters themes that recur across at least two shows. Podcast discourse is a qualitatively different leading indicator, but audio download, transcription, transcript retention, attribution, private caching, and spend labeling need a dedicated compliance decision before any code runs. A bounded Trend Finder version would publish only a theme label, a summary, and per-episode `{show, url, angle}`, never transcript bodies. That boundary must be explicitly approved here first.
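To make the bounded payload concrete, here is a minimal sketch of its shape with an allow-list check that a leak test could build on. It is illustrative only and does not represent an approved contract: the `themeLabel`, `summary`, and `episodes` key names, both interfaces, and `findDisallowedKeys` are assumptions; only the per-episode `{show, url, angle}` fields come from the note above.

```typescript
// Hypothetical per-episode reference; the field names come from the note above.
interface EpisodeRef {
  show: string;
  url: string;
  angle: string;
}

// Hypothetical browser payload; the top-level key names are assumed.
interface PodcastThemePayload {
  themeLabel: string;
  summary: string;
  episodes: EpisodeRef[];
}

const THEME_KEYS = new Set<string>(["themeLabel", "summary", "episodes"]);
const EPISODE_KEYS = new Set<string>(["show", "url", "angle"]);

// Returns the path of every key outside the allow-list,
// for example a leaked `transcript` or `speakerName` field.
function findDisallowedKeys(payload: object): string[] {
  const leaks: string[] = [];
  for (const key of Object.keys(payload)) {
    if (!THEME_KEYS.has(key)) {
      leaks.push(key);
    }
  }
  const episodes = (payload as Record<string, unknown>)["episodes"];
  if (Array.isArray(episodes)) {
    episodes.forEach((episode: unknown, index: number) => {
      if (episode !== null && typeof episode === "object") {
        for (const key of Object.keys(episode)) {
          if (!EPISODE_KEYS.has(key)) {
            leaks.push(`episodes[${index}].${key}`);
          }
        }
      }
    });
  }
  return leaks;
}

// Placeholder values only; no real show or episode is referenced.
const example: PodcastThemePayload = {
  themeLabel: "placeholder theme",
  summary: "placeholder summary",
  episodes: [{ show: "Placeholder Show", url: "https://example.com/ep1", angle: "placeholder angle" }],
};
console.log(findDisallowedKeys(example));
```

An allow-list is used rather than a block-list so that any field not explicitly permitted is flagged by default.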

**Final decision.** Session 16 defers podcast/audio implementation. Session 17 must not run in Phase 29, and the backlog item remains compliance-gated until a future source-specific approval exists.

**TrendingAI source pointers.** `src/podcast-themes.ts`, `prompts/podcast-themes.md`, `scripts/hydrate-ig-transcripts.ts`.

