> For the complete documentation index, see [llms.txt](https://ai-os-and-trend-finder.gitbook.io/ai-os-and-trend-finder-docs/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://ai-os-and-trend-finder.gitbook.io/ai-os-and-trend-finder-docs/docs/sources/source-compliance-google-news.md).

# Source Compliance: Google News

> Implementation-time review completed 2026-05-27 and direct RSS path added 2026-06-18. Trend Finder has a configured Apify source declaration and a direct-first Google News RSS adapter for public result metadata. This review covers headline/snippet/link metadata only.

***

## Source Overview

| Field                   | Value                                                        |
| ----------------------- | ------------------------------------------------------------ |
| Source ID               | `google-ai-news`                                             |
| Source name             | Google News                                                  |
| Primary Apify candidate | `thescrappa/google-news-scraper`                             |
| Replaced candidate      | `easyapi/google-news-scraper`                                |
| Direct public path      | `https://news.google.com/rss/search`                         |
| Store page              | <https://apify.com/thescrappa/google-news-scraper>           |
| Verified on             | 2026-05-27                                                   |
| Input schema rechecked  | 2026-05-27                                                   |
| Default build/tag       | `latest`, build `1.0.1`                                      |
| Default run options     | 128 MB memory, 120 second Actor timeout                      |
| Pricing observed        | Pay per usage, around `$0.20 / 1,000 results` on input page  |
| Trend Finder status     | Reviewed declaration, fixture-backed and live tiny-validated |

The original `easyapi/google-news-scraper` candidate was reviewed first, but live validation on 2026-05-27 returned zero Dataset items for the starter input, the Actor's sample query, and simpler AI queries. Trend Finder replaced it with `thescrappa/google-news-scraper` after a same-day tiny validation returned five Dataset rows and five normalized public URLs.

The replacement Actor input schema reviewed on 2026-05-27 exposes `q` for keyword search, `gl` and `hl` two-letter locale codes, `page`, `start`, and `so` for date/relevance sort, plus token fields for topic, publication, section, story, and knowledge graph browsing. Trend Finder uses keyword search only and does not use token browsing in the starter config.

The input schema was rechecked on 2026-05-27 for bounded historical support. The current `thescrappa/google-news-scraper` schema does not expose `dateFrom`, `dateTo`, `time_period`, `time_period_min`, `time_period_max`, or equivalent bounded start/end date fields. `so: 1` is only a current-result sort by date, not a reviewed historical window.

The direct adapter reads current Google News RSS search metadata for reviewed AI keyword terms, capped at 35 result rows. It is a metadata-only RSS path. It does not resolve Google redirects, read cached article pages, fetch publisher article bodies, or use Google News tokens. When the direct path produces reviewed evidence, the matching Apify source is skipped for that run; when it is disabled, empty, timed out, rate limited, or offline, the reviewed Apify fallback remains eligible.

## Terms And Use Boundary

The source is approved only for public Google News result metadata and links to publisher articles. It is not approval for article extraction, cache reading, redirect resolution, or full-text copying. Apify storage, run IDs, Dataset IDs, and Apify-hosted item URLs are not durable browser evidence and must not be emitted to browser payloads.

Approved browser-safe evidence:

* Headline/title.
* Publisher article URL from `link` or `url`, or publisher URL from the RSS `source url` attribute when the item link is a blocked Google redirect.
* Short result snippet.
* Published timestamp from `iso_date`, `published_at`, or parseable `date` when stable.
* Source label from `source_name` or `source` as context only.

Excluded fields:

* Google redirect, Google cache, or `news.google.com` URLs until a safe resolver is reviewed.
* Full article bodies, extracted text, summaries that copy full content, or paywalled text.
* Thumbnails, base64 images, media URLs, or downloaded files.
* Author fields, contact fields, emails, profile URLs, story tokens, related-publication structures, request echoes, and private telemetry.
* Ranking position, sort order, request pagination, or Google News tokens as engagement or relevance evidence.
* Actor run URLs, Dataset IDs, or Apify-hosted item URLs.
* Full Scrappa key-value-store response records.
* Direct RSS request URLs with query details, response headers, raw XML payloads, and Google News redirect URLs.

## Retention And Attribution

| Policy               | Value                                                                                         |
| -------------------- | --------------------------------------------------------------------------------------------- |
| Storage location     | `src/data/live-data.json` and private cache snapshots                                         |
| Retention period     | `live-data.json` overwritten on each collection run; snapshots retained locally until deleted |
| Historical retention | Local snapshots only; no reviewed historical collection window                                |
| Deletion path        | Delete generated Trend Finder data and snapshots                                              |
| Attribution          | Show source ID `google-ai-news`, source label when available, and publisher article link      |

## Historical Window Stance

| Field                  | Value                                                          |
| ---------------------- | -------------------------------------------------------------- |
| Historical support     | Current-only                                                   |
| Source ID              | `google-ai-news`                                               |
| Safe override fields   | None                                                           |
| Unsupported reason     | Reviewed Actor input has no bounded start and end date fields. |
| Compliance declaration | `historicalWindowSupport.supported = false`                    |

Do not use Google News date fields from other Actor candidates as overrides for this source. Enabling historical windows requires a separate reviewed Actor contract that exposes explicit bounded start/end date inputs, fixture coverage, tiny validation, output timestamp stability checks, and updated retention notes.

## Compliance Checklist

* [x] Current Apify Actor ID verified on 2026-05-27
* [x] Input fields reviewed
* [x] Default build/tag and pricing model recorded
* [x] Headline/snippet/link-only browser evidence boundary documented
* [x] Google redirect/cache URLs rejected by fixture-backed tests
* [x] Apify storage excluded as durable browser evidence
* [x] Retention and deletion path documented
* [x] Attribution requirements documented
* [x] Historical-window stance recorded as current-only
* [x] Live tiny capped validation produced public URLs with credentials
* [x] Bounded historical date-field review closed as unsupported for the current Actor contract
* [x] Direct RSS adapter added with zero-cost spend labeling
* [x] Direct RSS adapter blocks Google redirect/cache URLs and uses publisher URLs only when a public publisher URL is exposed
* [x] Matching Apify fallback remains eligible when direct collection does not produce reviewed evidence

Live note: 2026-05-27 tiny validation for `thescrappa/google-news-scraper` returned five Dataset items and five normalized public URLs with the reviewed keyword input. The raw Dataset shape includes `authors`, thumbnails, story tokens, and request echo fields; these are excluded from browser and analyst evidence.


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://ai-os-and-trend-finder.gitbook.io/ai-os-and-trend-finder-docs/docs/sources/source-compliance-google-news.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
