
# Source Compliance: Hacker News

> Review completed 2026-05-15 before Firebase top-stories collection. HN Algolia keyword-search re-review completed 2026-06-14 for current-only story metadata. The existing top-stories path remains a safe fallback.

***

## Source Overview

| Field                 | Value                                                                                       |
| --------------------- | ------------------------------------------------------------------------------------------- |
| Source Name           | Hacker News (Y Combinator)                                                                  |
| API Base URL          | `https://hacker-news.firebaseio.com/v0/` and `https://hn.algolia.com/api/v1/search_by_date` |
| API Documentation     | <https://github.com/HackerNews/API> and <https://hn.algolia.com/api>                        |
| Authentication        | None required (public API)                                                                  |
| Data Format           | JSON                                                                                        |
| First Adapter Session | phase02-session06                                                                           |
| Default Source ID     | `hackernews`                                                                                |

***

## Terms of Service

Hacker News does not publish a dedicated API Terms of Service document. The Firebase API is provided as a public endpoint maintained by Y Combinator. The Algolia endpoint is a public HN search endpoint used for keyword search. In the absence of API-specific terms, Trend Finder follows the general HN guidelines at <https://news.ycombinator.com/newsguidelines.html> and keeps its client request rate conservative.

**Key obligations**:

* Do not scrape the website; use the documented Firebase or Algolia endpoints.
* Do not misrepresent content origin.
* Do not use the API to spam or manipulate rankings.
* Attribution to Hacker News is expected when displaying content.

***

## Rate Limits

The Firebase API documentation does not specify formal rate limits. The Algolia HN API page states that requests from a single IP are rate-limited. Trend Finder therefore uses conservative sequential request behavior for both HN paths.

| Parameter                   | Value                                                                   |
| --------------------------- | ----------------------------------------------------------------------- |
| Recommended request cadence | Max 1 request per second                                                |
| Burst tolerance             | Low; Firebase may throttle                                              |
| Retry strategy              | Exponential backoff, 3 attempts max                                     |
| Implementation              | Sequential item fetches or keyword searches with 1-second minimum delay |

The adapter enforces a minimum 1000 ms delay between consecutive item-detail requests and between keyword searches. The top-stories endpoint is called once per collection run.
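
The cadence and retry rules in this section can be sketched as follows. This is a minimal illustration, not the adapter's actual code: `PacedClient`, `fetch_with_retry`, the injected `fetch`, `sleep`, and `clock` callables, and the 1-second backoff base are all assumptions made for the example.

```python
import time
from typing import Callable, Optional

MIN_DELAY_SECONDS = 1.0     # documented minimum gap between requests
MAX_ATTEMPTS = 3            # documented retry ceiling
BACKOFF_BASE_SECONDS = 1.0  # assumption: the base delay is not stated in this page


class PacedClient:
    """Hypothetical helper: spaces out sequential requests and retries with backoff."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        sleep: Callable[[float], None] = time.sleep,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._sleep = sleep
        self._clock = clock
        self._last_request_at: Optional[float] = None

    def _wait_for_slot(self) -> None:
        # Sleep just long enough to keep MIN_DELAY_SECONDS between request starts.
        if self._last_request_at is not None:
            elapsed = self._clock() - self._last_request_at
            if elapsed < MIN_DELAY_SECONDS:
                self._sleep(MIN_DELAY_SECONDS - elapsed)
        self._last_request_at = self._clock()

    def fetch_with_retry(self, url: str) -> dict:
        last_error: Optional[Exception] = None
        for attempt in range(MAX_ATTEMPTS):
            self._wait_for_slot()
            try:
                return self._fetch(url)
            except Exception as error:  # a real adapter would catch narrower errors
                last_error = error
                if attempt < MAX_ATTEMPTS - 1:
                    # Exponential backoff: 1 s after the first failure, 2 s after the second.
                    self._sleep(BACKOFF_BASE_SECONDS * (2 ** attempt))
        raise RuntimeError(f"gave up after {MAX_ATTEMPTS} attempts") from last_error
```

Injecting `sleep` and `clock` keeps the pacing logic testable without real waiting; production code would simply use the defaults.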

## HN Algolia Keyword Search Boundary

The 2026-06-14 re-review approves keyword search only under these conditions:

* Use `https://hn.algolia.com/api/v1/search_by_date` with `tags=story`, reviewed keyword-window terms, `hitsPerPage` caps, and current-only date filters.
* Keep the stable source ID `hackernews`, source role `discussion`, and quality tier `primary`.
* Normalize only story title, story URL or HN item URL, points, comment count, story ID, and created timestamp.
* Exclude author usernames, user profiles, comment text, story text, raw highlighted snippets, raw Algolia JSON, and query trace details from browser payloads, traces, and logs.
* Treat a 403, 429, timeout, malformed JSON, or empty keyword-search response as degraded or offline readiness, then fall back to the existing Firebase top-stories path.
* Emit a zero-cost public API spend label for direct HN rows.
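
As a rough sketch of the request and normalization conditions above: the query parameters (`query`, `tags`, `hitsPerPage`, `numericFilters` on `created_at_i`) and the hit fields (`objectID`, `title`, `url`, `points`, `num_comments`, `created_at_i`) are those of the public HN Algolia API. The function names, the cap of 50, and the output key names are assumptions for illustration only.

```python
from urllib.parse import urlencode

ALGOLIA_SEARCH_URL = "https://hn.algolia.com/api/v1/search_by_date"
MAX_HITS_PER_PAGE = 50  # assumption: the reviewed cap is not stated in this page


def build_search_url(keyword: str, window_start_unix: int, hits_per_page: int = 20) -> str:
    """Build a story-only, capped, current-window keyword search URL."""
    params = {
        "query": keyword,
        "tags": "story",
        "hitsPerPage": min(hits_per_page, MAX_HITS_PER_PAGE),
        # Current-only filter: keep stories created after the window start.
        "numericFilters": f"created_at_i>{window_start_unix}",
    }
    return f"{ALGOLIA_SEARCH_URL}?{urlencode(params)}"


def normalize_hit(hit: dict) -> dict:
    """Copy only the approved fields; everything else in the hit is dropped."""
    story_id = str(hit["objectID"])
    return {
        "id": story_id,
        "title": hit.get("title") or "",
        # Fall back to the HN item page when the story has no external URL.
        "url": hit.get("url") or f"https://news.ycombinator.com/item?id={story_id}",
        "points": hit.get("points") or 0,
        "commentCount": hit.get("num_comments") or 0,
        "createdAt": hit.get("created_at_i"),
    }
```

Building the output from an allow-list, instead of deleting known-bad keys, means fields such as `author`, `story_text`, and `_highlightResult` cannot leak through if the upstream response shape grows.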

***

## Data Collected

| Data Element      | HN API Field  | Stored As                       | PII Risk |
| ----------------- | ------------- | ------------------------------- | -------- |
| Story title       | `title`       | TrendEvidenceItem.title         | None     |
| Story URL         | `url`         | TrendEvidenceItem.url           | None     |
| Story score       | `score`       | Used in relevanceScore calc     | None     |
| Story time        | `time` (unix) | TrendEvidenceItem.publishedAt   | None     |
| Story ID          | `id`          | TrendEvidenceItem.id (prefixed) | None     |
| Story descendants | `descendants` | Used in relevanceScore calc     | None     |

**Not collected**: Author usernames (`by` or `author` fields), comment text, story text, user profiles, email addresses, or any other user-identifying information.
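
The field mapping in the table might look like the sketch below for a Firebase item. It is illustrative only: `to_evidence_item`, the `hackernews-` ID prefix, and the ISO 8601 timestamp format are assumptions, and the `relevanceScore` formula is not documented on this page, so `score` and `descendants` are passed through as raw inputs.

```python
from datetime import datetime, timezone
from typing import Optional

ID_PREFIX = "hackernews-"  # assumption: the real prefix format is not documented here


def to_evidence_item(item: dict) -> Optional[dict]:
    """Map a Firebase story item to the stored fields; return None if malformed."""
    if item.get("type") != "story" or not isinstance(item.get("id"), int):
        return None
    if not isinstance(item.get("title"), str) or not isinstance(item.get("time"), int):
        return None
    published = datetime.fromtimestamp(item["time"], tz=timezone.utc)
    return {
        "id": f"{ID_PREFIX}{item['id']}",
        "title": item["title"],
        "url": item.get("url"),  # may be None, e.g. for text-only posts
        "publishedAt": published.isoformat(),
        # Inputs to the relevanceScore calculation, which is defined elsewhere.
        "score": item.get("score", 0),
        "descendants": item.get("descendants", 0),
    }
```

Because the result is built field by field, the `by` and `text` fields of the source item are never copied.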

***

## Data Retention

| Policy               | Value                                                                                         |
| -------------------- | --------------------------------------------------------------------------------------------- |
| Storage location     | `src/data/live-data.json` and private cache snapshots                                         |
| Retention period     | `live-data.json` overwritten on each collection run; snapshots retained locally until deleted |
| Historical retention | Local snapshots only; no HN historical source window support                                  |
| Deletion path        | Delete generated Trend Finder data and snapshots                                              |
| Backup               | None                                                                                          |

***

## Phase 14 Historical Window Stance

| Field                  | Value                                                                                                                      |
| ---------------------- | -------------------------------------------------------------------------------------------------------------------------- |
| Historical support     | Current-only                                                                                                               |
| Source ID              | `hackernews`                                                                                                               |
| Safe override fields   | None                                                                                                                       |
| Unsupported reason     | The adapter collects current top stories or current Algolia search results and has no bounded historical archive contract. |
| Compliance declaration | `historicalWindowSupport.supported = false`                                                                                |

Do not emulate historical HN collection by walking current story IDs or cached local output. A separate reviewed source is required before Hacker News can participate in historical backtests.
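
One way to make this stance enforceable is a declaration object plus a guard, sketched below. Only `historicalWindowSupport.supported = false` comes from the table above; the `HistoricalWindowSupport` class, its `unsupported_reason` field, and `assert_window_allowed` are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoricalWindowSupport:
    """Hypothetical shape for a source's historical-window declaration."""
    supported: bool
    unsupported_reason: str = ""


HACKERNEWS_HISTORICAL = HistoricalWindowSupport(
    supported=False,
    unsupported_reason=(
        "The adapter collects current top stories or current Algolia search "
        "results and has no bounded historical archive contract."
    ),
)


def assert_window_allowed(
    source_id: str, declaration: HistoricalWindowSupport, is_historical: bool
) -> None:
    """Reject historical collection for sources that declare current-only support."""
    if is_historical and not declaration.supported:
        raise ValueError(
            f"{source_id} is current-only: {declaration.unsupported_reason}"
        )
```

A collector could call `assert_window_allowed("hackernews", HACKERNEWS_HISTORICAL, is_historical=True)` before a backtest run and skip the source when the guard raises.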

***

## Privacy and GDPR Assessment

| Criterion                 | Status | Notes                                          |
| ------------------------- | ------ | ---------------------------------------------- |
| PII collected             | No     | Author names (`by` in Firebase, `author` in Algolia) are intentionally excluded |
| User consent needed       | No     | Public data, no user accounts in Trend Finder  |
| Data subject rights       | N/A    | No personal data stored                        |
| Cross-border transfer     | N/A    | Data stays on local machine                    |
| Data processor agreements | N/A    | No third-party processing                      |
| Legitimate interest basis | Yes    | Self-use trend analysis from public data       |

***

## Attribution

HN-sourced evidence items displayed in the UI include:

* Source identifier: `hackernews`
* Link to original story URL when available
* Source name displayed as "Hacker News" in source summaries

***

## Risk Assessment

| Risk                     | Likelihood | Impact | Mitigation                                           |
| ------------------------ | ---------- | ------ | ---------------------------------------------------- |
| API endpoint changes     | Low        | Medium | Version-pin base URL; adapter returns offline on 404 |
| Rate limiting/throttling | Medium     | Low    | 1s delay between requests; exponential backoff       |
| Data quality degradation | Low        | Low    | Validate response shape; skip malformed items        |
| Service unavailability   | Low        | Low    | Adapter returns offline status; collector continues  |
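
The mitigations in this table, together with the keyword-search failure conditions (403, 429, timeout, malformed JSON, empty results), can be sketched as a small classifier. This page does not say which failures count as degraded and which as offline, so the split below is an assumption, as are the function name and the three status strings.

```python
import json
from typing import Optional


def classify_response(status: Optional[int], body: Optional[str]) -> str:
    """Return 'ok', 'degraded', or 'offline'. A status of None models a timeout."""
    if status is None or status == 404:
        return "offline"  # timeout or missing endpoint
    if status in (403, 429):
        return "degraded"  # blocked or rate-limited (assumed mapping)
    if status != 200:
        return "offline"  # any other HTTP failure (assumed mapping)
    try:
        payload = json.loads(body or "")
    except json.JSONDecodeError:
        return "degraded"  # malformed JSON
    hits = payload.get("hits") if isinstance(payload, dict) else None
    if not isinstance(hits, list) or not hits:
        return "degraded"  # unexpected shape or empty keyword search
    return "ok"


def should_fall_back(readiness: str) -> bool:
    """Anything other than 'ok' triggers the Firebase top-stories fallback."""
    return readiness != "ok"
```

Returning a status value instead of raising lets the collector continue with other sources when Hacker News is unavailable.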

***

## Compliance Checklist

* [x] Terms of service reviewed
* [x] Rate limits documented and enforced in adapter
* [x] Data retention policy defined
* [x] PII assessment completed (none collected)
* [x] GDPR assessment completed (N/A -- no personal data)
* [x] Attribution requirements documented
* [x] Risk assessment completed
* [x] Phase 14 historical-window stance recorded as current-only
* [x] HN Algolia keyword-search boundary re-reviewed on 2026-06-14
* [x] HN Algolia field exclusions and fallback stance recorded

***

*This document must be reviewed again before changing the Hacker News source boundary or adding historical support.*

