# Trend Finder Sources

This guide explains how Trend Finder treats built-in adapters, reviewed source metadata, configured source runs, source health, and source trust.

For the implementation checklist used when adding another Apify-backed source, see [Adding Apify Sources To Trend Finder](/ai-os-and-trend-finder-docs/docs/sources/apify-source-onboarding.md).

## Implemented Source Model

Trend Finder's source terminology has three layers that are easy to confuse.

| Layer                    | Current Meaning                                                                                                                                                            |
| ------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Built-in direct adapter  | Code that runs without Apify against reviewed public APIs or reviewed public feeds. Direct adapters run before Apify fallbacks and attach zero-cost public API spend rows. |
| Static Apify declaration | Reviewed metadata for a source ID, actor candidate, role, quality tier, compliance doc, caps, and historical support. It does not run by itself.                           |
| Configured Apify source  | A source listed in the Apify source JSON file or inline override and allowed by the compliance gate. It runs only when `APIFY_TOKEN` and extension enablement are ready.   |

The shipped direct adapters are:

| Source ID                | Direct Path                             | Fallback Behavior                                                                                     |
| ------------------------ | --------------------------------------- | ----------------------------------------------------------------------------------------------------- |
| `arxiv-ai-papers`        | arXiv Atom metadata API                 | Matching Apify source is eligible only when direct collection fails or is disabled.                   |
| `github-ai-repositories` | GitHub REST repository search metadata  | Matching Apify source is eligible only when direct collection fails, is disabled, or is rate limited. |
| `rss-ai-news`            | Reviewed public RSS/Atom feed allowlist | Matching Apify source is eligible only when direct collection fails or is disabled.                   |
| `hackernews`             | HN Algolia keyword story metadata       | Falls back to the existing Firebase top-stories path, not Apify.                                      |
| `huggingface-ai-models`  | Hugging Face Hub model metadata API     | Matching Apify source is eligible only when direct collection fails or is disabled.                   |
| `google-ai-news`         | Google News RSS search metadata         | Matching Apify source is eligible only when direct collection fails or is disabled.                   |
| `devto-ai-articles`      | Dev.to public articles API              | Direct-only reviewed source; no Apify fallback is declared.                                           |

The Phase 28 source closeout shipped two source-expansion paths:

* keyword packs compile reviewed search terms into reviewed Apify target fields, respecting scan mode, rotation, source caps, text limits, and spend labels
* direct first-party adapters collect only reviewed public metadata for arXiv, GitHub repository search, reviewed RSS feeds, HN Algolia keyword search, Hugging Face Hub models, Google News RSS, and Dev.to public articles

When direct collection succeeds, matching Apify fallback rows are skipped before a paid run. The skipped fallback remains visible through readiness, fallback, and zero-cost public API spend labels so operators can see why an Actor did not run without implying an Actor charge. When direct collection is disabled, empty, rate-limited, timed out, or offline, the reviewed Apify declaration remains eligible only if its compliance gate, credentials, and configuration allow it.
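
The fallback rule above can be sketched as a small predicate. This is a minimal illustration, not the collector's actual code: `DirectOutcome`, `FallbackGate`, and `apifyFallbackEligible` are hypothetical names, and only the state labels come from this guide.

```typescript
// Hypothetical types; the outcome labels mirror the states named in this guide.
type DirectOutcome =
  | "succeeded"
  | "disabled"
  | "empty"
  | "rate-limited"
  | "timeout"
  | "offline";

interface FallbackGate {
  compliancePassed: boolean; // the compliance gate allows the Apify declaration
  credentialsReady: boolean; // for example, APIFY_TOKEN is present
  configured: boolean; // the source is listed in the Apify source config
}

// Returns true when the matching Apify fallback may run for this source.
function apifyFallbackEligible(outcome: DirectOutcome, gate: FallbackGate): boolean {
  if (outcome === "succeeded") {
    return false; // direct evidence exists, so the fallback is skipped before a paid run
  }
  return gate.compliancePassed && gate.credentialsReady && gate.configured;
}

const readyGate: FallbackGate = { compliancePassed: true, credentialsReady: true, configured: true };
const afterSuccess = apifyFallbackEligible("succeeded", readyGate); // false
const afterTimeout = apifyFallbackEligible("timeout", readyGate); // true
```

The sketch covers only sources with a declared Apify fallback. `hackernews` falls back to the Firebase top-stories path and `devto-ai-articles` is direct-only, so neither would reach this check.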

The reviewed Apify declarations currently cover these source IDs:

| Source ID                   | Role       | Quality Tier | Notes                                                                                                         |
| --------------------------- | ---------- | ------------ | ------------------------------------------------------------------------------------------------------------- |
| `github-ai-repositories`    | developer  | primary      | Public repository trend metadata.                                                                             |
| `reddit-ai-discussions`     | discussion | community    | Public discussion metadata.                                                                                   |
| `arxiv-ai-papers`           | research   | primary      | Public research metadata.                                                                                     |
| `producthunt-ai-launches`   | launch     | secondary    | Public launch metadata; first reviewed source with backtest windows.                                          |
| `youtube-ai-creator-videos` | creator    | low          | Public creator video metadata; capped aggressively for scoring.                                               |
| `rss-ai-news`               | news       | secondary    | Public feed metadata.                                                                                         |
| `huggingface-ai-papers`     | research   | primary      | Public paper metadata and community paper attention.                                                          |
| `huggingface-ai-models`     | developer  | primary      | Public model metadata and model-like signals.                                                                 |
| `reddit-ai-subreddit-feed`  | discussion | community    | Public post metadata from known subreddit feeds.                                                              |
| `google-ai-news`            | news       | secondary    | Public Google News result metadata with publisher links only; replacement Actor live-validated on 2026-05-27. |

The reviewed direct-only declarations currently cover these source IDs:

| Source ID           | Role      | Quality Tier | Notes                                                                 |
| ------------------- | --------- | ------------ | --------------------------------------------------------------------- |
| `devto-ai-articles` | developer | secondary    | Public article metadata from reviewed Dev.to AI tags; no author rows. |

Important source rules:

* Static declarations are not runtime defaults. They document reviewed source metadata and enrich configured sources.
* Direct adapters are runtime defaults only after their source compliance docs are reviewed and their per-source enable flags allow collection.
* Unknown or unreviewed Apify sources are restricted by the compliance gate.
* Known placeholder Actor IDs such as `apify/github-public-search` are blocked instead of treated as real public Actors.
* When a direct adapter produces reviewed evidence, the matching Apify fallback source row is skipped before a paid run to prevent duplicate collection.
* Source quality tier is a trust/cap signal for scoring and collection, not a claim that every item from that source is factually correct.
* Source roles describe the type of signal: developer, discussion, research, launch, creator, or news.
* The browser shows source status and provenance so partial coverage is visible instead of hidden.
* Current runs write generated browser data to `src/data/live-data.json` and private Trend Finder snapshots under `.cache/extensions/trend-finder/`. Compliance deletion paths must include both locations.

Apify source item caps are tied to quality tier before scoring:

| Quality Tier | Item Cap |
| ------------ | -------- |
| primary      | 50       |
| secondary    | 35       |
| community    | 20       |
| low          | 8        |
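
As a rough sketch, the tier caps in the table can be applied as a lookup plus a slice. The cap values come from the table; the function names, and the assumption that a configured `maxItems` can lower a tier cap but never raise it, are illustrative only.

```typescript
type QualityTier = "primary" | "secondary" | "community" | "low";

// Cap values copied from the table above.
const TIER_ITEM_CAP: Record<QualityTier, number> = {
  primary: 50,
  secondary: 35,
  community: 20,
  low: 8,
};

// Assumption: a configured maxItems may lower the cap but never raise it.
function effectiveItemCap(tier: QualityTier, configuredMaxItems?: number): number {
  const tierCap = TIER_ITEM_CAP[tier];
  if (configuredMaxItems === undefined || !Number.isFinite(configuredMaxItems)) {
    return tierCap;
  }
  return Math.max(0, Math.min(tierCap, Math.floor(configuredMaxItems)));
}

// Trims a source's items to the effective cap before scoring.
function capItems<T>(items: T[], tier: QualityTier, configuredMaxItems?: number): T[] {
  return items.slice(0, effectiveItemCap(tier, configuredMaxItems));
}

const lowCap = effectiveItemCap("low", 100); // 8
const primaryCap = effectiveItemCap("primary", 10); // 10
```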

The normal product path is current-run oriented. Historical backtests are a separate script-only path and currently require reviewed historical window support, with `producthunt-ai-launches` as the first supported source.

## Keyword Scan Modes And Coverage QA

Phase 28 adds a deterministic keyword-pack compiler for reviewed Apify source query fields. The compiler is script-side source metadata, not a browser keyword editor. It never accepts free-text browser keywords or raw Actor input.

Supported scan modes:

| Mode       | Behavior                                                                                              |
| ---------- | ----------------------------------------------------------------------------------------------------- |
| `balanced` | Default. Selects the stable core plus a rotating tail from every reviewed category.                   |
| `focused`  | Requires a reviewed category. Selects the stable core plus a larger rotating tail from that category. |

The stable core covers broad AI signals such as agents, AI coding, local LLMs, multimodal AI, video, automation, open source AI, RAG, workflow, and model routing. It also includes reviewed canonical-topic seed terms such as MCP, GraphRAG, reasoning models, and AI coding agents so source queries and identity stabilization use the same reviewed metadata. Tail terms rotate deterministically from the run date so scheduled runs broaden coverage without changing source caps or spend settings.
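
One way to picture date-based rotation is a window that slides over the tail list by a fixed step per UTC day. The actual rotation scheme is not described in this guide, so the offset formula and the name `rotatingTail` below are assumptions chosen only to show the deterministic property.

```typescript
const MS_PER_DAY = 86400000;

// Picks `tailSize` terms from `tail`, starting at an offset derived from the UTC run date.
// The same run date always yields the same window.
function rotatingTail(tail: string[], runDate: string, tailSize: number): string[] {
  if (tail.length === 0 || tailSize <= 0) return [];
  const dayNumber = Math.floor(Date.parse(`${runDate}T00:00:00Z`) / MS_PER_DAY);
  if (!Number.isFinite(dayNumber)) {
    throw new Error(`Invalid run date: ${runDate}`);
  }
  const size = Math.min(tailSize, tail.length);
  // Double modulo keeps the offset non-negative for any date.
  const start = (((dayNumber * size) % tail.length) + tail.length) % tail.length;
  const picked: string[] = [];
  for (let i = 0; i < size; i++) {
    const term = tail[(start + i) % tail.length];
    if (term !== undefined) picked.push(term);
  }
  return picked;
}
```

Because the window size is an input rather than something the rotation changes, source caps and spend settings stay untouched while the selected terms vary from day to day.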

Reviewed category IDs are:

```
agents, coding, video, image, audio, open-source, local-llm, automation,
research, marketing, business, education, security, robotics, general-ai
```

`focused` mode must name one of those categories. Invalid mode values, missing focused categories, and unreviewed category names fall back to `balanced` and produce a bounded warning in collector output, Engine Trace, and Source Setup coverage QA.
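
The fallback behavior can be sketched as a resolver that never throws and never echoes raw input in its warnings. The category IDs come from the list above; `resolveScanMode`, `ScanSelection`, and the warning strings are hypothetical, and matching by category label is not modeled here.

```typescript
const REVIEWED_CATEGORIES = [
  "agents", "coding", "video", "image", "audio", "open-source", "local-llm", "automation",
  "research", "marketing", "business", "education", "security", "robotics", "general-ai",
] as const;

type ScanCategory = (typeof REVIEWED_CATEGORIES)[number];

interface ScanSelection {
  mode: "balanced" | "focused";
  category?: ScanCategory;
  warnings: string[]; // fixed strings only, so raw input is never echoed
}

// Resolves the scan mode, falling back to `balanced` with one bounded warning.
function resolveScanMode(rawMode?: string, rawCategory?: string): ScanSelection {
  const mode = (rawMode ?? "balanced").trim().toLowerCase();
  // Assumption: an empty value behaves like an unset value.
  if (mode === "" || mode === "balanced") {
    return { mode: "balanced", warnings: [] };
  }
  if (mode !== "focused") {
    return { mode: "balanced", warnings: ["Invalid scan mode; using balanced."] };
  }
  const wanted = (rawCategory ?? "").trim().toLowerCase();
  const category = REVIEWED_CATEGORIES.find((id) => id === wanted);
  if (category === undefined) {
    const reason = wanted === "" ? "Missing" : "Unreviewed";
    return { mode: "balanced", warnings: [`${reason} focused category; using balanced.`] };
  }
  return { mode: "focused", category, warnings: [] };
}
```

A caller would pass the values of `FINDTREND_KEYWORD_SCAN_MODE` and `FINDTREND_KEYWORD_SCAN_CATEGORY` and surface `warnings` wherever coverage QA is shown.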

Keyword injection is allowed only for reviewed public query target fields that are declared on the static source setup metadata:

| Source ID                   | Field           | Shape     |
| --------------------------- | --------------- | --------- |
| `reddit-ai-discussions`     | `query`         | text      |
| `arxiv-ai-papers`           | `query`         | text      |
| `producthunt-ai-launches`   | `searchTerms`   | text list |
| `youtube-ai-creator-videos` | `searchQueries` | text list |
| `huggingface-ai-papers`     | `searchQuery`   | text      |
| `huggingface-ai-models`     | `query`         | text      |
| `google-ai-news`            | `q`             | text      |

Sources without a reviewed keyword target keep their existing input unchanged and show a skipped source-cap summary. Keyword compilation clones runtime source objects before applying terms, so static declarations are not mutated.
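
The clone-then-apply step might look like the sketch below. The three targets are a subset of the table above; `RuntimeSource`, `applyKeywords`, and the `" OR "` join for text fields are assumptions, since the real query syntax is source-specific.

```typescript
type KeywordShape = "text" | "text list";

interface KeywordTarget {
  field: string;
  shape: KeywordShape;
}

// Subset of the reviewed keyword targets listed in the table above.
const KEYWORD_TARGETS = new Map<string, KeywordTarget>([
  ["arxiv-ai-papers", { field: "query", shape: "text" }],
  ["producthunt-ai-launches", { field: "searchTerms", shape: "text list" }],
  ["google-ai-news", { field: "q", shape: "text" }],
]);

interface RuntimeSource {
  id: string;
  input: Record<string, unknown>;
}

// Returns a new source object; the object passed in is never mutated.
function applyKeywords(source: RuntimeSource, terms: string[]): RuntimeSource {
  const target = KEYWORD_TARGETS.get(source.id);
  if (target === undefined || terms.length === 0) {
    // No reviewed keyword target: keep the existing input unchanged.
    return { ...source, input: { ...source.input } };
  }
  // Assumption: text fields join terms with " OR ".
  const value = target.shape === "text" ? terms.join(" OR ") : [...terms];
  return { ...source, input: { ...source.input, [target.field]: value } };
}
```

Copying `input` before writing the keyword field is what keeps a shared static declaration from picking up one run's terms.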

Coverage QA appears in Source Setup and Engine Trace as bounded labels:

| Label           | Meaning                                                                 |
| --------------- | ----------------------------------------------------------------------- |
| `Ready`         | The reviewed category has enough depth for the scan window.             |
| `Review`        | The category is usable but should be expanded after result review.      |
| `Thin coverage` | The reviewed category has shallow keyword coverage for the scan window. |

Source cap summaries show applied keyword counts against each reviewed source keyword cap. The compiler also respects text-field length limits, source `maxItems`, quality-tier item caps, optional charge ceilings, and existing spend labels. It does not change source spend state, cache state, Actor IDs, Dataset IDs, or browser-safe evidence boundaries.
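
A minimal sketch of fitting terms to a keyword cap and a text-field length limit follows. `fitKeywords`, the separator, and the summary wording are hypothetical; the guide only states that counts are shown against each cap.

```typescript
interface KeywordFit {
  applied: string[];
  summary: string; // hypothetical label format
}

// Keeps terms in order until the keyword cap or the text-length limit would be exceeded.
function fitKeywords(
  terms: string[],
  keywordCap: number,
  maxChars: number,
  separator = " OR ",
): KeywordFit {
  const applied: string[] = [];
  let length = 0;
  for (const term of terms) {
    if (applied.length >= keywordCap) break;
    const added = (applied.length === 0 ? 0 : separator.length) + term.length;
    if (length + added > maxChars) break;
    applied.push(term);
    length += added;
  }
  return { applied, summary: `${applied.length}/${keywordCap} keywords applied` };
}

const fitted = fitKeywords(["agents", "ai coding", "local llm"], 2, 100);
// fitted.summary is "2/2 keywords applied"
```

The function can only drop terms. It has no way to raise `maxItems` or a charge ceiling, which matches the rule that keyword compilation leaves spend settings alone.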

## Hugging Face Download Deltas

`huggingface-ai-models` already collects public model metrics, including download counts when present in reviewed Actor output. Phase 27 added run-over-run download deltas without adding a new source adapter.

The collector stores private model download observations in the Trend Finder snapshot history, compares the current public download count with the previous observation for the same model identity, and publishes only a bounded `downloadDelta` label on browser-safe evidence rows. The states are explicit:

| State         | Meaning                                                    |
| ------------- | ---------------------------------------------------------- |
| `available`   | A prior public observation exists and the delta is finite. |
| `no-baseline` | Current downloads exist, but no prior observation exists.  |
| `unavailable` | The source row did not include a usable download metric.   |
| `invalid`     | The metric could not be normalized into a bounded count.   |

Download deltas do not expose private snapshot paths, raw Actor rows, Dataset IDs, or account details. They are support context for evidence inspection and Workbench sorting, not a new source-health row.
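
The four states can be expressed as a discriminated union. In this sketch the state names come from the table above, while `computeDownloadDelta` and the definition of a bounded count (a non-negative safe integer) are assumptions.

```typescript
type DownloadDelta =
  | { state: "available"; delta: number }
  | { state: "no-baseline" }
  | { state: "unavailable" }
  | { state: "invalid" };

// Assumption: a usable count is a non-negative integer within the safe range.
function isBoundedCount(value: unknown): value is number {
  return typeof value === "number" && Number.isSafeInteger(value) && value >= 0;
}

// Compares the current public count with the previous observation for the same model.
function computeDownloadDelta(current: unknown, previous: unknown): DownloadDelta {
  if (current === undefined || current === null) return { state: "unavailable" };
  if (!isBoundedCount(current)) return { state: "invalid" };
  if (!isBoundedCount(previous)) return { state: "no-baseline" };
  return { state: "available", delta: current - previous };
}

const grew = computeDownloadDelta(1200, 1000); // { state: "available", delta: 200 }
const firstSeen = computeDownloadDelta(1200, undefined); // { state: "no-baseline" }
```

Only the resulting label and delta would reach browser-safe evidence rows; the prior observation itself stays in the private snapshot history.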

## Source Environment Keys

Trend Finder source collection currently reads these source-related env names:

| Env key                                               | Purpose                                                              |
| ----------------------------------------------------- | -------------------------------------------------------------------- |
| `VITE_CLAUDE_OS_ENABLED_EXTENSIONS`                   | Enables the `trend-finder` extension collector.                      |
| `TREND_FINDER_HN_ITEMS`                               | Caps HN top-story IDs fetched before AI filtering.                   |
| `FINDTREND_DIRECT_SOURCES_ENABLED`                    | Global switch for reviewed direct public API/feed adapters.          |
| `FINDTREND_DIRECT_ARXIV_ENABLED`                      | Per-source switch for direct arXiv metadata collection.              |
| `FINDTREND_DIRECT_GITHUB_ENABLED`                     | Per-source switch for direct GitHub repository metadata collection.  |
| `FINDTREND_DIRECT_RSS_ENABLED`                        | Per-source switch for reviewed direct RSS/feed collection.           |
| `FINDTREND_DIRECT_HN_ALGOLIA_ENABLED`                 | Per-source switch for HN Algolia keyword metadata collection.        |
| `FINDTREND_DIRECT_HF_ENABLED`                         | Per-source switch for direct Hugging Face model metadata collection. |
| `FINDTREND_DIRECT_GOOGLE_NEWS_ENABLED`                | Per-source switch for direct Google News RSS metadata collection.    |
| `FINDTREND_DIRECT_DEVTO_ENABLED`                      | Per-source switch for direct Dev.to article metadata collection.     |
| `GITHUB_TOKEN`                                        | Optional script-only token for GitHub repository search rate limits. |
| `APIFY_TOKEN`                                         | Script-only Apify credential.                                        |
| `FINDTREND_APIFY_ACTORS_PATH`                         | Optional path to the Apify source JSON file.                         |
| `FINDTREND_APIFY_ACTORS`                              | Optional inline source config override.                              |
| `FINDTREND_APIFY_DEFAULT_MEMORY_MB`                   | Default Apify Actor memory.                                          |
| `FINDTREND_APIFY_DEFAULT_TIMEOUT_SECS`                | Default Apify Actor timeout.                                         |
| `FINDTREND_APIFY_MAX_ITEMS_PER_SOURCE`                | Default Dataset item cap per source.                                 |
| `FINDTREND_KEYWORD_SCAN_MODE`                         | Optional script-side keyword scan mode: `balanced` or `focused`.     |
| `FINDTREND_KEYWORD_SCAN_CATEGORY`                     | Required only for `focused`; must be a reviewed category ID/label.   |
| `FINDTREND_GOOGLE_TRENDS_DEMAND_ENABLED`              | Opts into paid metric-only Google Trends demand enrichment.          |
| `FINDTREND_GOOGLE_TRENDS_DEMAND_TERMS`                | Optional demand terms; otherwise Creator Lens focus terms are used.  |
| `FINDTREND_GOOGLE_TRENDS_DEMAND_GEO`                  | Optional Google Trends geo code for demand enrichment.               |
| `FINDTREND_GOOGLE_TRENDS_DEMAND_TIME_RANGE`           | Optional Google Trends time range; Actor default applies if omitted. |
| `FINDTREND_GOOGLE_TRENDS_DEMAND_MAX_ITEMS`            | Demand term cap, clamped to at most 5.                               |
| `FINDTREND_GOOGLE_TRENDS_DEMAND_MAX_TOTAL_CHARGE_USD` | Demand Actor charge ceiling; a ceiling set too low may abort a run.  |

AI runtime env names are documented separately in the AI runtime guide because they affect analysis, not source selection.
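
The global and per-source direct switches compose as a logical AND. The sketch below takes the environment as a plain record so it stays self-contained. The pairing of source IDs to flags is inferred from the key names, and the default-on behavior and accepted "off" spellings are assumptions, not documented behavior.

```typescript
type Env = Record<string, string | undefined>;

// Pairing inferred from the env key names in the table above.
const DIRECT_FLAGS = new Map<string, string>([
  ["arxiv-ai-papers", "FINDTREND_DIRECT_ARXIV_ENABLED"],
  ["github-ai-repositories", "FINDTREND_DIRECT_GITHUB_ENABLED"],
  ["rss-ai-news", "FINDTREND_DIRECT_RSS_ENABLED"],
  ["hackernews", "FINDTREND_DIRECT_HN_ALGOLIA_ENABLED"],
  ["huggingface-ai-models", "FINDTREND_DIRECT_HF_ENABLED"],
  ["google-ai-news", "FINDTREND_DIRECT_GOOGLE_NEWS_ENABLED"],
  ["devto-ai-articles", "FINDTREND_DIRECT_DEVTO_ENABLED"],
]);

const OFF_VALUES = ["0", "false", "off", "no"];

// Assumption: unset or empty flags default to enabled.
function flagEnabled(value: string | undefined): boolean {
  if (value === undefined || value.trim() === "") return true;
  return OFF_VALUES.indexOf(value.trim().toLowerCase()) === -1;
}

// A direct source collects only when both the global and per-source switches allow it.
function directSourceEnabled(env: Env, sourceId: string): boolean {
  const flag = DIRECT_FLAGS.get(sourceId);
  if (flag === undefined) return false; // not a shipped direct adapter
  return flagEnabled(env["FINDTREND_DIRECT_SOURCES_ENABLED"]) && flagEnabled(env[flag]);
}
```

In a Node script the call would be `directSourceEnabled(process.env, "arxiv-ai-papers")`.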

`google-trends-demand` is not included in the reviewed source declaration table because it does not emit browser evidence or source-health rows. It is an internal scoring enrichment path documented in `docs/sources/source-compliance-google-trends-demand.md`.
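
The clamp on `FINDTREND_GOOGLE_TRENDS_DEMAND_MAX_ITEMS` can be sketched in a few lines. The ceiling of 5 comes from the env table; the floor of 1 and the fallback for missing or unparsable values are assumptions.

```typescript
const DEMAND_TERM_CEILING = 5;

// Assumption: missing or unparsable values fall back to the ceiling, and the floor is 1.
function demandTermCap(raw: string | undefined): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  if (!Number.isFinite(parsed)) return DEMAND_TERM_CEILING;
  return Math.min(DEMAND_TERM_CEILING, Math.max(1, parsed));
}

const clamped = demandTermCap("12"); // 5
const kept = demandTermCap("3"); // 3
```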

## Source Health And Provenance

Source health matters because Trend Finder should make it visible when a run relies on partial or fallback data rather than strong coverage.

| Source Status | Meaning                                        |
| ------------- | ---------------------------------------------- |
| active        | Source contributed current evidence.           |
| degraded      | Source contributed partially or with warnings. |
| offline       | Source did not contribute current evidence.    |

Source-death warnings distinguish a quiet run from a configured source that was previously live and now produced zero accepted evidence. The collector compares current accepted-evidence counts with a private last-good per-source baseline. Only configured, enabled, credential-ready, reviewed sources that are active or degraded in the current run are eligible.

These states do not produce source-death warnings:

* first-run sources with no private baseline
* disabled or unconfigured sources
* sources missing credentials
* restricted or placeholder declarations
* offline, blocked, fallback, or skipped direct-vs-Apify fallback rows
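
The eligibility rules and exclusions above reduce to one predicate per source plus an aggregate count. This sketch uses hypothetical field names and label text; it is not the collector's implementation.

```typescript
interface SourceRunState {
  configured: boolean;
  enabled: boolean;
  credentialReady: boolean;
  reviewed: boolean;
  status: "active" | "degraded" | "offline";
  acceptedCount: number; // accepted evidence in the current run
  lastGoodCount?: number; // private baseline; undefined on a first run
}

// True when a previously live, still-eligible source produced zero accepted evidence.
function isSourceDeath(run: SourceRunState): boolean {
  const eligible =
    run.configured &&
    run.enabled &&
    run.credentialReady &&
    run.reviewed &&
    (run.status === "active" || run.status === "degraded");
  if (!eligible) return false;
  if (run.lastGoodCount === undefined || run.lastGoodCount <= 0) return false; // no baseline
  return run.acceptedCount === 0;
}

// Only an aggregate count and a fixed label leave the script side.
function sourceDeathSummary(runs: SourceRunState[]): { alarmCount: number; label: string } {
  const alarmCount = runs.filter(isSourceDeath).length;
  const label = alarmCount > 0 ? "Source death warning" : "No source death warnings";
  return { alarmCount, label };
}
```

The summary deliberately omits `lastGoodCount`, which keeps the prior accepted count out of any browser payload.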

The browser can show only a safe source-death label and aggregate alarm count. It must not show the private baseline path, baseline file name, prior accepted count, raw source diagnostics, tokens, or credential-shaped strings.

Direct readiness is resolved once before a collection run starts and is frozen for that run. Browser-visible readiness labels are bounded and do not include raw provider payloads, private URLs, headers, tokens, or stack traces.

| Direct Readiness | Meaning                                                           |
| ---------------- | ----------------------------------------------------------------- |
| ready            | The direct source is reviewed, enabled, and may collect.          |
| disabled         | A global or per-source direct flag disabled collection.           |
| blocked          | Readiness was requested too late or compliance is not reviewed.   |
| degraded         | The direct source returned partial evidence or warnings.          |
| rate-limited     | The direct provider reported an exhausted or blocked rate window. |
| timeout          | The direct request timed out after the reviewed retry policy.     |
| empty            | The direct provider returned no usable reviewed metadata rows.    |
| offline          | The direct request failed and fallback handling applies.          |
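
The "resolved once, frozen for the run" behavior could be modeled as an immutable snapshot. `ReadinessSnapshot` is a hypothetical class; the sketch shows only the freeze and the `blocked` result for sources that were not resolved in time, not how readiness itself is determined.

```typescript
type DirectReadiness =
  | "ready"
  | "disabled"
  | "blocked"
  | "degraded"
  | "rate-limited"
  | "timeout"
  | "empty"
  | "offline";

class ReadinessSnapshot {
  private readonly states: ReadonlyMap<string, DirectReadiness>;

  constructor(resolved: Record<string, DirectReadiness>) {
    // Copy once so later changes to `resolved` cannot alter this run.
    const copy = new Map<string, DirectReadiness>();
    for (const id of Object.keys(resolved)) {
      const state = resolved[id];
      if (state !== undefined) copy.set(id, state);
    }
    this.states = copy;
  }

  // Sources that were not resolved before the run started report "blocked".
  get(sourceId: string): DirectReadiness {
    return this.states.get(sourceId) ?? "blocked";
  }
}
```

The class exposes no setter, so a provider failure mid-run cannot rewrite the readiness that the run started with.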

Provenance appears at several levels. `dataState` describes the generated data branch, `analysisState` describes AI-vs-fallback analysis, and `sourceState` or per-source `provenance` describes source collection.

| Provenance State | Meaning                                                     |
| ---------------- | ----------------------------------------------------------- |
| live             | Collected from configured reviewed public sources.          |
| fixture-demo     | Committed or hand-authored demo data.                       |
| degraded         | Partial source collection with warnings.                    |
| blocked          | Skipped because credentials or reviewed config are missing. |
| fallback         | Fallback behavior produced the row.                         |
| unknown          | Legacy payload without explicit provenance.                 |
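
A small normalizer shows how legacy payloads end up as `unknown`. The state names come from the table above; `normalizeProvenance` is a hypothetical helper and the strict, case-sensitive match is an assumption.

```typescript
const PROVENANCE_STATES = [
  "live",
  "fixture-demo",
  "degraded",
  "blocked",
  "fallback",
  "unknown",
] as const;

type ProvenanceState = (typeof PROVENANCE_STATES)[number];

// Maps any payload value to a known state; anything unrecognized becomes "unknown".
function normalizeProvenance(value: unknown): ProvenanceState {
  const match = PROVENANCE_STATES.find((state) => state === value);
  return match ?? "unknown";
}

const liveState = normalizeProvenance("live"); // "live"
const legacyState = normalizeProvenance(undefined); // "unknown"
```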

The Sources tab also shows:

* source-role matrix: role coverage tied to linked evidence and source health
* evidence contribution: which sources are carrying scored topic evidence
* source trust impact: active, degraded, and offline source counts plus item counts
* per-source enrichment cache state and spend state, using only safe counts, labels, and capped USD values
* local Source Setup state for reviewed Apify sources, including credential presence, private config presence, reviewed editable targets, safe warnings, last-run health, direct readiness, and direct-vs-Apify fallback status
* keyword coverage QA for script-selected scan mode, reviewed categories, source cap counts, and bounded keyword warnings

## Local Source Setup UI

The Sources tab includes a local-only Source Setup panel when the app is served by the Vite dev server. The panel reads and mutates the gitignored private `data/trend-finder.apify-actors.json` file through the token-gated loopback bridge at `/__trend_finder_source_setup`. It is not a hosted configuration surface.

The browser can see only:

* reviewed source IDs and display names
* whether the private config is loaded, missing, or invalid
* whether the script-side `APIFY_TOKEN` is present
* reviewed target field labels and safe target summaries
* source caps, compliance doc paths, warnings, and last-run source health
* keyword scan coverage mode, category coverage statuses, source cap counts, and bounded keyword warnings
* direct readiness labels and fallback labels for Apify declarations that have reviewed direct public API equivalents

The browser must not display full credentials, private config paths, raw Actor input JSON, Dataset rows, Actor run logs, account IDs, auth headers, token-like strings, raw direct provider payloads, direct request URLs with query details, or unreviewed source IDs as editable setup rows.

Editable targets are allowlisted per static source declaration. Examples include public search terms, public subreddit names, public feed URLs, relative current filters, locale codes, and Product Hunt `dateFrom`/`dateTo` fields. Raw JSON edits, arbitrary Actor fields, private feeds, private accounts, comment-body collection, and unsupported historical-window shortcuts are blocked.

Saving setup changes is additive and reviewed-source scoped. The bridge writes only the reviewed source IDs and target fields it understands, preserves unrelated parser-supported config, and rejects unknown sources or malformed targets instead of silently widening collection. Adding a new adapter, target field, media field, or historical window still requires a source compliance review before it becomes an editable browser control.
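
The reject-rather-than-widen rule can be sketched as an allowlist check followed by an additive merge. The allowlist here is illustrative: only the Product Hunt `dateFrom`/`dateTo` fields are named in this guide, and the `subreddits` field name, the function names, and the reason strings are hypothetical.

```typescript
type TargetPatch = Record<string, string | string[]>;

// Hypothetical allowlist of editable targets per reviewed source ID.
const EDITABLE_TARGETS = new Map<string, readonly string[]>([
  ["producthunt-ai-launches", ["dateFrom", "dateTo"]],
  ["reddit-ai-subreddit-feed", ["subreddits"]],
]);

type SaveResult = { ok: true; patch: TargetPatch } | { ok: false; reason: string };

// Rejects unknown sources, unsupported fields, and malformed values.
function validateSetupPatch(sourceId: string, patch: TargetPatch): SaveResult {
  const allowed = EDITABLE_TARGETS.get(sourceId);
  if (allowed === undefined) return { ok: false, reason: "Unknown source." };
  for (const field of Object.keys(patch)) {
    if (allowed.indexOf(field) === -1) {
      return { ok: false, reason: "Unsupported target field." };
    }
    const value = patch[field];
    const values: unknown[] = Array.isArray(value) ? value : [value];
    const wellFormed = values.every((v) => typeof v === "string" && v.trim() !== "");
    if (!wellFormed) return { ok: false, reason: "Malformed target value." };
  }
  return { ok: true, patch };
}

// Additive merge: keys outside the patch are preserved as-is.
function applySetupPatch(
  existing: Record<string, unknown>,
  patch: TargetPatch,
): Record<string, unknown> {
  return { ...existing, ...patch };
}
```

The reason strings are fixed so that a rejected request never reflects raw input back to the browser.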

## Source Spend And Cache State

Source rows can include an enrichment cache summary and a spend summary. These fields are operational labels, not raw source records.

Cache state is keyed from stable source item identity and metadata fingerprints. The browser may show:

| Cache Label   | Meaning                                                        |
| ------------- | -------------------------------------------------------------- |
| saved or hits | Unchanged evidence reused a private safe enrichment summary.   |
| misses        | Current evidence had no matching fresh cache entry.            |
| degraded      | Cache validation or pruning had errors, but the run continued. |
| no eligible   | Evidence did not have the stable identity needed for caching.  |
| unavailable   | Legacy payload or source row did not include cache state.      |
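
A cache key built from item identity and a metadata fingerprint might look like the sketch below. The key layout, the field names, and the FNV-1a hash over UTF-16 code units are stand-ins chosen to keep the example dependency-free; the guide does not say which hash or key format the collector uses.

```typescript
// 32-bit FNV-1a over UTF-16 code units; a stand-in for whatever hash the collector uses.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ("0000000" + (hash >>> 0).toString(16)).slice(-8);
}

interface EvidenceIdentity {
  sourceId: string;
  itemId?: string; // stable source item identity, when available
  metadata: Record<string, string | number>;
}

// Returns undefined when the evidence lacks the stable identity needed for caching.
function cacheKey(evidence: EvidenceIdentity): string | undefined {
  if (evidence.itemId === undefined || evidence.itemId === "") return undefined;
  const fingerprint = Object.keys(evidence.metadata)
    .sort() // sorted keys make the fingerprint independent of property order
    .map((key) => `${key}=${String(evidence.metadata[key])}`)
    .join("&");
  return `${evidence.sourceId}:${evidence.itemId}:${fnv1a(fingerprint)}`;
}
```

Unchanged metadata reproduces the same key, which corresponds to a hit. Any metadata change produces a different key, which corresponds to a miss.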

Spend state is per source and per run. The browser may show:

| Spend Label    | Meaning                                                                    |
| -------------- | -------------------------------------------------------------------------- |
| exact          | The runner reported actual `usageTotalUsd`.                                |
| estimated      | Exact usage was unavailable; value comes from configured charge caps.      |
| mixed          | The run combines exact source usage and configured cap estimates.          |
| not applicable | Public or built-in source path does not use a paid runner.                 |
| unavailable    | A paid source had neither exact usage nor a configured cap in the payload. |
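
One possible way to roll per-source spend states into a run-level label is sketched below. The `mixed` rule follows the table above; ignoring `not-applicable` rows and letting any `unavailable` paid row dominate are assumptions about aggregation that this guide does not specify.

```typescript
type SpendState = "exact" | "estimated" | "mixed" | "not-applicable" | "unavailable";

// Assumptions: not-applicable rows are ignored, and any unavailable paid row
// makes the aggregate unavailable.
function aggregateSpendState(states: SpendState[]): SpendState {
  const paid = states.filter((s) => s !== "not-applicable");
  if (paid.length === 0) return "not-applicable";
  if (paid.indexOf("unavailable") !== -1) return "unavailable";
  const hasExact = paid.some((s) => s === "exact" || s === "mixed");
  const hasEstimated = paid.some((s) => s === "estimated" || s === "mixed");
  if (hasExact && hasEstimated) return "mixed";
  return hasExact ? "exact" : "estimated";
}

const runLabel = aggregateSpendState(["exact", "estimated", "not-applicable"]); // "mixed"
```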

Direct arXiv, GitHub, RSS, HN, Hugging Face, Google News, and Dev.to public API/feed paths use the `public-api` provider and the `not-applicable` state with the reason "Direct public API source does not use a paid runner." Skipped Apify fallbacks also use a zero-cost public API spend row so aggregate spend keeps the skipped fallback visible without implying an Actor charge.

The Sources tab never displays raw billing payloads, account IDs, tokens, Actor IDs, Dataset IDs, raw source rows, or private cache paths.

Spend and cache labels describe the latest generated payload. They are not live account billing dashboards, and they do not prove that the next scheduled run will cost the same amount. Recurring spend uses the reviewed cadence plus the latest safe per-run summary and must keep the estimated/exact/mixed states visible.

Keyword scan rotation preserves these spend labels. A compiled keyword window can reduce terms to fit a reviewed target or source cap, but it must not widen `maxItems`, alter `maxTotalChargeUsd`, hide exact/estimated/mixed spend state, or turn a skipped keyword target into a paid source.

## Phase 29 Source Boundary Closeout

Phase 29 did not approve any new source boundary by implication. Industry events use only current reviewed `rss-ai-news` and `google-ai-news` evidence and publish only after at least two independent publisher identities support the event cluster. The security lens reads only existing normalized evidence and reviewed security keyword categories; it is not a security feed, dependency scanner, article-body reader, comment-body classifier, or remediation engine.
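
The two-publisher rule for industry events can be sketched as a distinct-count check. The two source IDs come from the paragraph above; `EventEvidence`, `canPublishEventCluster`, and the case-insensitive comparison of publisher labels are assumptions about how independence might be judged.

```typescript
interface EventEvidence {
  sourceId: string;
  publisher: string; // public publisher label
}

// Industry events use only these reviewed news sources.
const EVENT_SOURCES = ["rss-ai-news", "google-ai-news"];

// Assumption: publisher identity is compared case-insensitively after trimming.
function canPublishEventCluster(cluster: EventEvidence[]): boolean {
  const publishers = new Set<string>();
  for (const item of cluster) {
    if (EVENT_SOURCES.indexOf(item.sourceId) === -1) continue; // other sources do not count
    const identity = item.publisher.trim().toLowerCase();
    if (identity !== "") publishers.add(identity);
  }
  return publishers.size >= 2;
}
```

Two articles from the same publisher count once, so a single outlet syndicated through both feeds would not be enough to publish an event.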

Source-death alarms compare current accepted evidence with a private last-good baseline for configured, reviewed, credential-ready sources. Browser payloads receive only safe alarm counts and labels. Seed-candidate review stays private and produces review artifacts for maintainers; it does not add a source row, collector, browser payload branch, or public discovery feed.

The closeout decision for podcast/audio is still the Session 16 `Defer` decision. Session 17 is skipped in Phase 29, and no podcast metadata, transcript, media, provider, cache, spend, parser, source setup target, env key, static Brief section, or browser payload work is approved by this manual.

Broader social reach from X/Twitter, TikTok, Instagram, and Bluesky remains a deliberate non-goal under the current API, terms, PII, retention, parser, spend, and browser-safety posture. Candidate names in this manual, the PRD, or coverage artifacts are backlog anchors only.

## Podcast And Audio Boundary

Podcast/audio themes are not implemented Trend Finder behavior. Phase 29 Session 16 reviewed the boundary in [`docs/sources/source-compliance-podcasts.md`](/ai-os-and-trend-finder-docs/docs/sources/source-compliance-podcasts.md) and deferred Session 17 because no source/provider path is approved for audio download, transcription, transcript retention, transcript summarization, and browser-safe theme publication together.

Until a future source-specific review changes that decision, Trend Finder must not add a podcast source declaration, podcast RSS allowlist, Apify podcast Actor, audio/video downloader, speech-to-text provider, transcript cache, podcast snapshot, podcast theme browser payload, static Brief podcast section, Source Setup podcast target, or podcast env key.

The only podcast fields classified as potentially browser-safe in a future approval are public attribution fields such as show name, episode title, publisher/feed label, and canonical public episode or show URL. Those fields are not approved for runtime collection in Phase 29. Transcript bodies, speaker diarization, comments, thumbnails, artwork files, media URLs, raw audio, raw video, provider responses, prompts, credentials, private paths, Actor IDs, Dataset IDs, and run IDs remain blocked.

## Deferred Source Coverage Candidates

Phase 27 preserved the Alpha Radar source-gap list as future compliance-first work, and Phase 28 preserves the Trends-Finderz source review anchors the same way. These candidates are not live Trend Finder sources until a future phase adds source-specific compliance docs, reviewed caps, parser tests, source setup boundaries, spend labels, retention rules, and security/GDPR review.

Candidate source names in PRDs, coverage artifacts, or manuals are not collection approvals. They are backlog anchors for a future compliance-first source phase.

Broader social reach from X/Twitter, TikTok, Instagram, and Bluesky is a known coverage gap and a deliberate non-goal for the current Trend Finder source posture. These platforms are not collected, configured, or partially approved by this manual. Any future activation would require source-specific terms, API, PII, retention, parser, spend, and browser-safety review before a source row or collector can exist.

| Candidate Signal                   | Current Status                                                          | Closeout Decision                                                                                                                       |
| ---------------------------------- | ----------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| Semantic Scholar citation velocity | Not collected; arXiv covers reviewed research-role metadata.            | Future research adapter candidate.                                                                                                      |
| Bluesky mentions                   | Not collected.                                                          | Explicit non-goal under current API, terms, and PII posture.                                                                            |
| Instagram creator posts            | Not collected.                                                          | Explicit non-goal under current API, terms, and PII posture.                                                                            |
| Replicate run-count deltas         | Not collected; Hugging Face models covers developer-role model signals. | Future developer adapter candidate, likely requiring snapshot persistence.                                                              |
| Newsletter signals                 | Mostly covered by reviewed `rss-ai-news` feed metadata.                 | Add specific newsletter feeds only through reviewed RSS source setup targets.                                                           |
| Podcast/audio themes               | Deferred by Phase 29 Session 16.                                        | Future source-specific review required before any metadata, transcript, media, provider, cache, spend, parser, or browser payload work. |
| X/Twitter posts                    | Not collected.                                                          | Explicit non-goal under current public API/terms posture.                                                                               |
| TikTok creator posts               | Not collected.                                                          | Explicit non-goal under current API, terms, and PII posture.                                                                            |
| Digg AI mentions                   | Not collected.                                                          | Low-priority candidate; skip unless a creator need appears.                                                                             |
| Hugging Face download deltas       | Implemented through existing `huggingface-ai-models` public metrics.    | No new source adapter required.                                                                                                         |

## Source-Local Identity And Placement

Source-local scoring uses only reviewed public source context. It adds no new source calls and never reads raw Actor payloads in browser code.

Current reviewed entity fields are:

| Source ID                   | Entity Used For Baseline                | Notes                                     |
| --------------------------- | --------------------------------------- | ----------------------------------------- |
| `github-ai-repositories`    | Public repository full name             | Used as repository identity.              |
| `reddit-ai-discussions`     | Public subreddit/community label        | Authors and profiles remain excluded.     |
| `reddit-ai-subreddit-feed`  | Public subreddit/community label        | Comment rows remain rejected.             |
| `producthunt-ai-launches`   | Public product or launch slug           | Maker/user details remain excluded.       |
| `youtube-ai-creator-videos` | Public channel title                    | Authorized account data is not used.      |
| `huggingface-ai-models`     | Public model ID                         | Profile/contact fields remain excluded.   |
| `rss-ai-news`               | Public feed, source, or publisher label | Per-feed approvals still apply.           |
| `google-ai-news`            | Public source or publisher label        | Google-hosted redirect URLs stay blocked. |
| `devto-ai-articles`         | Reviewed Dev.to tag                     | User/profile fields remain excluded.      |

Hacker News remains unsupported for source-local baselines because the current review does not approve author identity for grouping. arXiv and Hugging Face Papers also remain unsupported for source-local entity grouping in this pass.
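The mapping above can be pictured as a lookup from source ID to a single reviewed public field. The sketch below is illustrative only: the row keys (`repo_full_name`, `community_label`, and so on), the `baseline_entity` helper, and the `hacker-news` ID used in the usage note are hypothetical names, not Trend Finder's actual schema. Only the nine source IDs come from the table.

```python
from typing import Mapping, Optional

# Source IDs come from the table above. The row keys on the right are
# hypothetical placeholders for "the reviewed public field" of each source.
ENTITY_FIELD_BY_SOURCE = {
    "github-ai-repositories": "repo_full_name",
    "reddit-ai-discussions": "community_label",
    "reddit-ai-subreddit-feed": "community_label",
    "producthunt-ai-launches": "launch_slug",
    "youtube-ai-creator-videos": "channel_title",
    "huggingface-ai-models": "model_id",
    "rss-ai-news": "publisher_label",
    "google-ai-news": "publisher_label",
    "devto-ai-articles": "tag",
}


def baseline_entity(source_id: str, row: Mapping[str, object]) -> Optional[str]:
    """Return the grouping entity for a row, or None when grouping is unsupported."""
    field = ENTITY_FIELD_BY_SOURCE.get(source_id)
    if field is None:
        # Sources without a reviewed entity field get no source-local grouping.
        return None
    value = row.get(field)
    if isinstance(value, str) and value.strip():
        return value.strip()
    # A missing or empty field is treated as "no entity", never inferred
    # from other columns such as author or profile data.
    return None
```

Under these assumptions, `baseline_entity("github-ai-repositories", {"repo_full_name": "octo/demo"})` yields `"octo/demo"`, while any source ID absent from the mapping (for example a hypothetical `hacker-news` ID) yields `None`, so no baseline group is formed.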

Placement handling is conservative. When a clear public row field marks a row as pinned, stickied, promoted, sponsored, or ad-like, normalizers exclude that row from the baseline calculation. Unknown placement is not guessed. Engine Replay shows only aggregate excluded and down-weighted counts, never raw source rows.
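A minimal sketch of the exclusion rule follows, assuming each normalized row exposes the placement markers as optional boolean keys. The key names and the `split_for_baseline` helper are hypothetical. Down-weighting is not modelled here because this page does not describe how it is computed.

```python
from typing import Dict, Iterable, List, Mapping, Tuple

# Marker names mirror the placements listed above; treating them as
# boolean row keys is an assumption made for this illustration.
PLACEMENT_MARKERS = ("pinned", "stickied", "promoted", "sponsored", "ad")


def split_for_baseline(
    rows: Iterable[Mapping[str, object]],
) -> Tuple[List[Mapping[str, object]], Dict[str, int]]:
    """Drop clearly marked rows and report only an aggregate count."""
    kept: List[Mapping[str, object]] = []
    excluded = 0
    for row in rows:
        # Only an explicit True counts as a clear marker. Missing or
        # ambiguous values are unknown placement and are not guessed.
        if any(row.get(marker) is True for marker in PLACEMENT_MARKERS):
            excluded += 1
        else:
            kept.append(row)
    # The summary carries counts only, never the excluded rows themselves.
    return kept, {"excluded": excluded}
```

Returning a count instead of the excluded rows matches the aggregate-only view described for Engine Replay: a caller can display how many rows were left out without handling raw source rows.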

