
# Session Specification

* **Session ID**: `phase24-session01-source-local-scoring-signals`
* **Phase**: 24 - Trend Finder Outlier Signal Upgrade
* **Status**: Not Started
* **Created**: 2026-06-07

***

## 1. Session Overview

This session adds source-local scoring signals to Trend Finder so that a public evidence item can be compared against the stable source entity that produced it, whenever a reviewed source exposes that entity safely. The goal is to reduce popularity bias without replacing the existing six-factor Trend Finder score.

The work adapts the Outlier Lab source memo into Trend Finder-native language: source entities, organic evidence, relative engagement, actionability bands, and bounded scoring support. It also keeps raw source payloads, authors, profiles, Actor internals, transcripts, prompts, provider responses, and private paths out of browser data.

This is the first Phase 24 session because later enrichment, workbench, setup, and export sessions depend on having a browser-safe notion of source-local lift and actionability. It establishes the additive contract that Session 06 can filter/sort by and Session 08 can document across reference surfaces.

***

## 2. Objectives

1. Add additive browser-safe fields for stable source entity identity, relative engagement ratio, baseline availability, and actionability banding.
2. Compute source-local baselines from organic evidence only, with pinned, stickied, promoted, sponsored, or ad-like rows excluded or down-weighted when reviewed sources expose reliable flags.
3. Apply source-local lift as a capped support signal inside existing scoring factors while preserving the six-factor score as authoritative.
4. Surface safe baseline availability and exclusion counts in Engine Replay and evidence UI, backed by focused unit tests and documentation.

***

## 3. Prerequisites

### Required Sessions

* [x] `phase23-session03-non-hermes-parity-documentation-closeout` - Leaves the repository in a validated non-Hermes closeout state before Phase 24 starts.

### Required Tools/Knowledge

* Current Trend Finder schema, scoring, source normalization, source quality, Engine Replay, and reference documentation contracts.
* Phase 24 PRD finding coverage for findings 1, 9, and 11.
* Source compliance documents for any source whose normalizer exposes entity identity or placement flags.
* Outlier Lab references for ratio thresholds, per-entity baselines, and pinned item exclusion as implementation references only.

### Environment Requirements

* Bun 1.3.14 project toolchain from `.bun-version`.
* No live Apify or AI runtime credentials are required for the planned tests; fixture-backed normalizer and scoring tests must cover the session behavior.
* Generated private data, cache artifacts, credentials, logs, and live runtime output remain ignored and outside browser payloads.

***

## 4. Scope

### In Scope (MVP)

* Trend Finder source normalizers can attach stable source entity identity, but only from reviewed, browser-safe fields; evidence is then grouped by source/entity for baseline calculation.
* Trend Finder source normalization can detect reliable pinned, stickied, promoted, sponsored, or ad-like flags; flagged items are excluded or down-weighted before baseline and engagement calculation.
* Trend Finder scoring can compute a bounded `entityOutlierRatio` or equivalent relative engagement signal, used only as capped support inside momentum, evidence strength, or creator-fit calculations.
* Trend Finder evidence and topics can expose actionability band labels: moderate sustained lift is treated as more replicable, while extreme lift is verify-first and not automatically best.
* Trend Finder Engine Replay can report baseline availability, unavailable reasons, and dropped/down-weighted counts, showing only bounded counts and safe labels.
* Trend Finder docs can explain the formula, unavailable states, exclusion semantics, score interaction, and actionability guidance.
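
The actionability banding described in the list above could be sketched roughly as follows. This is an illustration only: the band names, the `toActionabilityBand` function, and the threshold values are assumptions, since this spec does not fix any of them.

```typescript
// Hypothetical band names; the spec only requires that moderate lift and
// extreme (verify-first) lift are distinguished from unavailable states.
type ActionabilityBand = "unavailable" | "baseline" | "moderate-lift" | "verify-first";

interface BandThresholds {
  moderateMin: number; // ratio at or above this counts as moderate lift
  extremeMin: number; // ratio at or above this is treated as verify-first
}

// Assumed values for illustration; real thresholds would come from the scoring docs.
const ASSUMED_THRESHOLDS: BandThresholds = { moderateMin: 1.5, extremeMin: 5 };

function toActionabilityBand(
  ratio: number | null,
  thresholds: BandThresholds = ASSUMED_THRESHOLDS,
): ActionabilityBand {
  // Missing or malformed ratios degrade to an explicit safe state.
  if (ratio === null || !Number.isFinite(ratio) || ratio < 0) return "unavailable";
  // Extreme lift is flagged for verification rather than ranked as best.
  if (ratio >= thresholds.extremeMin) return "verify-first";
  if (ratio >= thresholds.moderateMin) return "moderate-lift";
  return "baseline";
}
```

Checking the extreme threshold first keeps a very large ratio from ever being labeled as ordinary moderate lift.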

### Out of Scope (Deferred)

* Replacing the six-factor score with a single outlier ratio - *Reason: Phase 24 requires the existing Trend Finder score to remain authoritative.*
* Adding Instagram/Reels collection or unreviewed source fields - *Reason: new source collection needs separate compliance review.*
* Exposing raw source payloads, authors, profiles, transcripts, prompts, provider responses, Actor internals, Dataset IDs, or private paths - *Reason: browser and static output must stay bounded and sanitized.*
* Delta-aware enrichment, cache pruning, spend accounting, scheduler controls, Signal Workbench, and static export - *Reason: later Phase 24 sessions own those surfaces.*

***

## 5. Technical Approach

### Architecture

Add a script-side source-local signal helper that accepts normalized evidence, safe source metadata, and placement flags, then returns additive browser-safe signal fields plus aggregate trace counts. The helper should calculate baseline state, baseline metric, baseline value, ratio, band, actionability label, and unavailable reason without reading raw source payloads.
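
A minimal sketch of such a helper is shown below, assuming a median baseline over organic rows and full exclusion (rather than down-weighting) of flagged rows. The type names, the `entityBaseline` and `relativeEngagementRatio` functions, the unavailable reasons, and the minimum sample size are all hypothetical and not taken from the implementation.

```typescript
type Placement = "organic" | "pinned" | "promoted";

interface EvidenceInput {
  entityId: string | null; // stable, browser-safe identity when a reviewed source exposes one
  engagement: number; // primary engagement metric chosen for the source
  placement: Placement;
}

type BaselineResult =
  | { state: "available"; baseline: number; sampleSize: number }
  | { state: "unavailable"; reason: "no-entity" | "low-sample" | "zero-baseline" };

const ASSUMED_MIN_SAMPLE = 5; // assumed value; the spec does not state a minimum

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const upper = sorted[mid] ?? 0;
  const lower = sorted[mid - 1] ?? 0;
  return sorted.length % 2 === 0 ? (lower + upper) / 2 : upper;
}

function entityBaseline(
  rows: EvidenceInput[],
  entityId: string | null,
  minSample: number = ASSUMED_MIN_SAMPLE,
): BaselineResult {
  if (entityId === null) return { state: "unavailable", reason: "no-entity" };
  // Only organic rows for this entity contribute to the baseline.
  const organic = rows
    .filter((row) => row.entityId === entityId && row.placement === "organic")
    .map((row) => row.engagement);
  if (organic.length < minSample) return { state: "unavailable", reason: "low-sample" };
  const baseline = median(organic);
  if (baseline <= 0) return { state: "unavailable", reason: "zero-baseline" };
  return { state: "available", baseline, sampleSize: organic.length };
}

function relativeEngagementRatio(engagement: number, baseline: BaselineResult): number | null {
  // No ratio is produced unless a usable baseline exists.
  return baseline.state === "available" ? engagement / baseline.baseline : null;
}
```

Returning a tagged union instead of throwing keeps missing or low-sample baselines visible as explicit states that downstream code has to handle.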

Source normalizers should only emit entity and placement metadata that is already allowed by reviewed declarations. The collector should apply the helper after source collection and before AI analysis/scoring so both browser evidence and analyst evidence use the same sanitized contract. Scoring should consume the signal as a capped lift, not as a replacement ranker.
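
One way to read "capped lift" is sketched below. It assumes score factors are normalized to the range 0 to 1 and dampens the ratio on a log scale; the `applyBoundedLift` name, the cap, and the per-doubling step are illustrative assumptions rather than values from the scoring module.

```typescript
const ASSUMED_MAX_LIFT = 0.1; // hypothetical cap on what lift may add to one factor
const ASSUMED_LIFT_PER_DOUBLING = 0.05; // hypothetical contribution per doubling of the ratio

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Adds a small, capped bonus to an existing factor; it never replaces the factor.
function applyBoundedLift(
  factor: number,
  ratio: number | null,
  maxLift: number = ASSUMED_MAX_LIFT,
): number {
  const base = clamp(factor, 0, 1);
  // Unavailable baselines and at-or-below-baseline ratios leave the factor unchanged.
  if (ratio === null || !Number.isFinite(ratio) || ratio <= 1) return base;
  // Log scaling dampens extreme ratios before the hard cap applies.
  const lift = Math.min(maxLift, Math.log2(ratio) * ASSUMED_LIFT_PER_DOUBLING);
  return clamp(base + lift, 0, 1);
}
```

Because the lift is both capped and clamped, an extreme ratio can move a factor by at most `maxLift`, and the factor stays inside its original range.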

Browser code should parse the new fields through additive Zod defaults, render small evidence chips and Engine Replay summaries, and keep legacy payloads valid. Documentation should describe implemented behavior narrowly and avoid claiming support for sources whose normalizers do not expose stable identity.
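
The additive-default behavior can be illustrated without any library. The sketch below is a dependency-free stand-in for what Zod defaults would provide in the real schema; the field names, the `parseSourceLocalFields` function, and the label length limit are assumptions for illustration.

```typescript
interface SourceLocalFields {
  baselineState: "available" | "unavailable";
  relativeEngagementRatio: number | null;
  actionabilityLabel: string | null;
}

const SOURCE_LOCAL_DEFAULTS: SourceLocalFields = {
  baselineState: "unavailable",
  relativeEngagementRatio: null,
  actionabilityLabel: null,
};

const ASSUMED_MAX_LABEL_LENGTH = 40; // hypothetical bound to keep chips compact

// Legacy payloads without these fields, or with malformed values, fall back to safe defaults.
function parseSourceLocalFields(raw: unknown): SourceLocalFields {
  if (typeof raw !== "object" || raw === null) return { ...SOURCE_LOCAL_DEFAULTS };
  const record = raw as Record<string, unknown>;
  const ratio = record["relativeEngagementRatio"];
  const label = record["actionabilityLabel"];
  const safeRatio =
    typeof ratio === "number" && Number.isFinite(ratio) && ratio >= 0 ? ratio : null;
  // A baseline only counts as available when a usable ratio accompanies it.
  const available = record["baselineState"] === "available" && safeRatio !== null;
  return {
    baselineState: available ? "available" : "unavailable",
    relativeEngagementRatio: available ? safeRatio : null,
    actionabilityLabel:
      typeof label === "string" ? label.slice(0, ASSUMED_MAX_LABEL_LENGTH) : null,
  };
}
```

Parsing a legacy payload such as `{}` yields the defaults, so existing fixtures would keep validating while newer payloads carry the extra fields.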

### Design Patterns

* Additive schema defaults: Preserve legacy Trend Finder payloads and fixtures.
* Script-only runtime boundary: Keep source parsing, scoring, and trace derivation outside browser-only UI code.
* Typed degradation over throws: Represent missing/low-sample baselines as explicit safe states.
* Bounded score lift: Cap relative lift contribution so the six-factor model remains authoritative.
* Trace summaries over raw details: Emit counts, states, and labels, never raw source records or private provenance.
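
As a rough sketch of the "trace summaries over raw details" pattern, the aggregate below reduces per-item outcomes to counts and safe labels only. The `SignalOutcome` shape, the reason labels, and `summarizeSourceLocalTrace` are hypothetical names, not the Engine Trace contract.

```typescript
interface SignalOutcome {
  baselineState: "available" | "unavailable";
  unavailableReason?: "no-entity" | "low-sample" | "unsupported";
  placementAction: "kept" | "dropped" | "down-weighted";
}

interface SourceLocalTraceSummary {
  baselineAvailable: number;
  baselineUnavailable: number;
  unavailableByReason: Record<string, number>;
  dropped: number;
  downWeighted: number;
}

// Emits counts and labels only; no evidence IDs, authors, or raw records are carried over.
function summarizeSourceLocalTrace(outcomes: SignalOutcome[]): SourceLocalTraceSummary {
  const summary: SourceLocalTraceSummary = {
    baselineAvailable: 0,
    baselineUnavailable: 0,
    unavailableByReason: {},
    dropped: 0,
    downWeighted: 0,
  };
  for (const outcome of outcomes) {
    if (outcome.baselineState === "available") {
      summary.baselineAvailable += 1;
    } else {
      summary.baselineUnavailable += 1;
      const reason = outcome.unavailableReason ?? "unsupported";
      summary.unavailableByReason[reason] = (summary.unavailableByReason[reason] ?? 0) + 1;
    }
    if (outcome.placementAction === "dropped") summary.dropped += 1;
    if (outcome.placementAction === "down-weighted") summary.downWeighted += 1;
  }
  return summary;
}
```

Keeping the reasons to a closed set of labels means the summary size is bounded regardless of how many evidence rows were collected.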

### Technology Stack

* TypeScript with Bun script runtime.
* Vitest for script, schema, scoring, and view-model tests.
* Zod for browser-safe Trend Finder payload parsing.
* React 19 and existing Tailwind/Radix component conventions for UI chips and Engine Replay surfaces.

***

## 6. Deliverables

### Files to Create

| File                                                                                        | Purpose                                                                                  | Est. Lines |
| ------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------- | ---------- |
| `scripts/extensions/trend-finder/source-local-signals.ts`                                   | Source-local baseline, ratio, band, placement, and trace helper logic.                   | \~220      |
| `scripts/extensions/trend-finder/__tests__/source-local-signals.test.ts`                    | Focused unit coverage for ratios, low samples, missing baselines, exclusions, and bands. | \~220      |
| `.spec_system/specs/phase24-session01-source-local-scoring-signals/implementation-notes.md` | Implementation notes, command results, and deferred findings.                            | \~80       |
| `.spec_system/specs/phase24-session01-source-local-scoring-signals/security-compliance.md`  | Session security and compliance review notes.                                            | \~80       |

### Files to Modify

| File                                                               | Changes                                                                              | Est. Lines |
| ------------------------------------------------------------------ | ------------------------------------------------------------------------------------ | ---------- |
| `scripts/extensions/trend-finder/normalize.ts`                     | Preserve Hacker News unavailable-baseline behavior without exposing author identity. | \~30       |
| `scripts/extensions/trend-finder/sources/types.ts`                 | Add source-local entity, placement, baseline, and actionability contract types.      | \~90       |
| `scripts/extensions/trend-finder/sources/apify-normalizers.ts`     | Extract reviewed entity identity and placement flags; emit safe signal inputs.       | \~160      |
| `scripts/extensions/trend-finder/collector.ts`                     | Apply source-local signal helper before analysis/scoring and trace aggregate counts. | \~120      |
| `scripts/extensions/trend-finder/engine-trace.ts`                  | Include sanitized source-local trace summaries in generated Engine Replay output.    | \~90       |
| `scripts/lib/ai-runtime/trend-analyst.ts`                          | Add additive source-local fields to analyst evidence input.                          | \~50       |
| `scripts/lib/ai-runtime/scoring.ts`                                | Apply bounded lift to score factors and keep the six-factor score authoritative.     | \~120      |
| `src/extensions/trend-finder/schema.ts`                            | Add Zod schemas/defaults for source-local evidence and topic fields.                 | \~120      |
| `src/extensions/trend-finder/engine-trace.ts`                      | Add browser-safe Engine Trace fields for baseline and exclusion summaries.           | \~90       |
| `src/extensions/trend-finder/view-model.ts`                        | Build ratio, band, and actionability chip models.                                    | \~90       |
| `src/extensions/trend-finder/components/evidence-metric-chips.tsx` | Render source-local signal chips accessibly and without layout shifts.               | \~40       |
| `src/extensions/trend-finder/engine-replay-model.ts`               | Surface baseline availability and exclusion summaries in Engine Replay models.       | \~70       |
| `src/extensions/trend-finder/components/score-breakdown.tsx`       | Show bounded-lift context without changing factor semantics.                         | \~40       |
| `docs/extensions/trend-finder-scoring.md`                          | Document formula, bands, unavailable states, and bounded score interaction.          | \~120      |
| `docs/extensions/trend-finder-sources.md`                          | Document reviewed entity fields and pinned/promoted handling.                        | \~90       |
| `docs/extensions/trend-finder-runtime-and-provenance.md`           | Document Engine Replay baseline and exclusion labels.                                | \~90       |

***

## 7. Success Criteria

### Functional Requirements

* [ ] Evidence from reviewed sources can carry stable entity identity when safe fields are available.
* [ ] Baseline calculation excludes or down-weights reliable pinned, stickied, promoted, sponsored, or ad-like rows before ratio calculation.
* [ ] Missing, low-sample, and unsupported baselines produce explicit safe unavailable states instead of misleading ratios.
* [ ] Source-local lift affects only capped support inside existing score factors and never replaces the six-factor score.
* [ ] Engine Replay shows dropped/down-weighted counts and baseline availability without exposing raw source internals.

### Testing Requirements

* [ ] Unit tests cover ratio calculation, low-sample and missing baselines, pinned/promoted exclusion, actionability bands, and bounded score lift.
* [ ] Schema tests prove legacy Trend Finder payloads still parse.
* [ ] Normalizer tests prove prohibited raw fields, private provenance, authors, profiles, transcripts, Dataset IDs, and Actor internals remain absent.
* [ ] Focused Trend Finder test commands pass or have documented unrelated blockers.

### Non-Functional Requirements

* [ ] New browser-visible fields stay bounded, typed, and additive.
* [ ] Source-local trace summaries use counts, states, and labels only.
* [ ] Payload growth stays within the shared extension payload cap.
* [ ] Documentation distinguishes implemented behavior from planned or source-dependent behavior.

### Quality Gates

* [ ] All files ASCII-encoded.
* [ ] Unix LF line endings.
* [ ] Code follows project conventions.
* [ ] No new dependencies unless required and documented.

***

## 8. Implementation Notes

### Key Considerations

* Treat source entity identity as optional. It is only available where reviewed normalizers expose stable, browser-safe fields.
* Ratio banding is guidance for actionability, not proof that a topic is objectively best.
* Extreme lift should be presented as verify-first or likely one-off rather than as the top recommendation by default.
* Do not use HN author identity for source-local baselines unless a future compliance review changes that boundary.

### Potential Challenges

* Small sample sizes: Mark low-sample baselines unavailable or capped instead of producing noisy ratios.
* Source-specific metric differences: Select one primary engagement metric per source role and document fallback behavior.
* Inconsistent placement flags: Only trust reviewed fields and record unsupported or unknown placement state explicitly.
* Score drift: Keep lift caps small and test that existing score factors remain bounded.

### Relevant Considerations

* \[P02] **New source adapters need per-source compliance review**: This session extends fields only for reviewed source declarations and safe normalizer outputs.
* \[P06] **Apify actor outputs remain operator-dependent**: Emit safe statuses, labels, and counts rather than raw Actor or Dataset internals.
* \[P02] **Extension payloads and demo labels stay bounded**: New fields must stay capped, defaultable, and explicit about unavailable states.
* \[P15] **Aggregate collection must stay budgeted**: Baseline work should be an in-memory pass over already-collected evidence, not a new unbounded source call.
* \[P05] **Script-only runtime boundary**: Source parsing, scoring, and trace derivation stay in script/shared code; browser code consumes validated summaries.
* \[P00] **Do not document planned features as implemented**: Docs should name source-dependent unavailable states honestly.

### Behavioral Quality Focus

Checklist active: Yes

Top behavioral risks for this session:

* Misleading ratios when baselines are unavailable, too small, or distorted by pinned/promoted items.
* Browser privacy regression through raw source payloads, author/profile fields, Actor internals, transcripts, or private provenance.
* Score behavior regression if source-local lift overpowers the six-factor scoring model.
* UI layout instability from long actionability labels or missing metric values.

***

## 9. Testing Strategy

### Unit Tests

* Test helper ratio calculation, median/baseline selection, low-sample caps, unavailable reasons, organic exclusion, band thresholds, and actionability labels.
* Test scoring with and without source-local signals to prove bounded lift and preserved factor ranges.
* Test schema defaults with legacy payloads and malformed source-local fields.

### Integration Tests

* Extend Apify normalizer tests with fixture-backed entity fields, placement flags, and prohibited private fields.
* Extend collector or Engine Trace tests to verify baseline availability counts and dropped/down-weighted counts are sanitized.

### Manual Testing

* Inspect fixture-backed Trend Finder evidence and Engine Replay data after a focused test run.
* Confirm evidence chips and score context stay readable with present, unavailable, and verify-first band states.

### Edge Cases

* No source exposes stable entity identity.
* A source exposes entity identity but fewer than the minimum sample count.
* All rows for an entity are pinned/promoted and should not form a baseline.
* Extreme ratio is present but should be verify-first and capped.
* Legacy payloads have no source-local fields.

***

## 10. Dependencies

### External Libraries

* None planned.

### Other Sessions

* **Depends on**: `phase23-session03-non-hermes-parity-documentation-closeout`
* **Depended on by**: `phase24-session06-signal-workbench-and-local-triage`, `phase24-session08-cross-surface-documentation-and-reference-mode`, and `phase24-session09-end-to-end-validation-release-hardening`

***

## Next Steps

Run the implement workflow step to begin AI-led implementation.


