
# Session Specification

* **Session ID**: `phase29-session10-seed-candidate-review-artifact`
* **Phase**: 29 - Trend Finder TrendingAI Comparison Adoption
* **Status**: Not Started
* **Created**: 2026-06-21

***

## 1. Session Overview

This session adds a private seed-candidate review artifact for Trend Finder. The goal is to borrow the useful review workflow from TrendingAI without adopting its X/reply dependency, widening collection, or mutating canonical seeds automatically.

The implementation should inspect existing reviewed evidence and topic identity output to find rows that are high quality but weakly clustered, repeatedly resolve to new canonical IDs, or look like source-local outliers with weak topic identity. Those signals should become private review rows that include a proposed alias or keyword, source evidence IDs, reason codes, collision checks against reviewed canonical seeds, and `manual_review_required`.

The artifact stays in private extension storage. Browser payloads, static Brief output, public docs, keyword packs, and canonical topic seed files must not receive generated seed rows. Operators can review the private artifact later, but this session does not create an auto-approval or seed mutation path.

***

## 2. Objectives

1. Generate private seed-candidate rows from existing reviewed evidence only.
2. Include proposed alias or keyword, source evidence IDs, reason codes, collision checks, and `manual_review_required` on every row.
3. Reuse existing canonical seed, topic identity, normalization, source-local, and private diagnostics patterns.
4. Prove that canonical seeds, keyword packs, collectors, browser payloads, and static exports are not mutated by candidate generation.
5. Add focused tests for unknown canonical IDs, high-quality unclustered rows, source-local outliers, collisions, privacy, and disabled/no-signal cases.
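One way objective 4 could be checked in a test is to freeze the reviewed seed fixtures before handing them to the generator and compare a serialized snapshot afterwards. The sketch below is illustrative only: `CanonicalSeed`, `deepFreeze`, and `runWithoutMutation` are hypothetical names, not the existing contract in `canonical-topic-seeds.ts`.

```typescript
// Hypothetical seed shape for illustration; the real contract lives in
// scripts/lib/ai-runtime/canonical-topic-seeds.ts.
interface CanonicalSeed {
  id: string;
  label: string;
  aliases: string[];
}

// Recursively freeze a value in place so accidental writes fail.
// Array methods such as push() throw a TypeError on frozen arrays.
function deepFreeze<T>(value: T): T {
  if (value !== null && typeof value === "object" && !Object.isFrozen(value)) {
    Object.freeze(value);
    for (const key of Object.keys(value)) {
      deepFreeze((value as unknown as Record<string, unknown>)[key]);
    }
  }
  return value;
}

// Run a generator against frozen seeds and report whether the serialized
// seed list is identical before and after the call.
function runWithoutMutation<R>(
  seeds: CanonicalSeed[],
  generate: (seeds: readonly CanonicalSeed[]) => R,
): { result: R; unchanged: boolean } {
  const before = JSON.stringify(seeds);
  const result = generate(deepFreeze(seeds));
  return { result, unchanged: JSON.stringify(seeds) === before };
}
```

Because `deepFreeze` freezes in place, this pattern is best applied to test fixtures or copies rather than to the live seed constants.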

***

## 3. Prerequisites

### Required Sessions

* [x] `phase28-session01-cross-source-signal-identity-and-dedup` - Provides canonical identity and deduplication foundations.
* [x] `phase28-session13-keyword-packs-rotation-and-coverage` - Provides reviewed keyword pack taxonomy and canonical keyword context.
* [x] `phase28-session15-documentation-validation-and-release` - Confirms Phase 28 canonical seed, topic identity, and normalization paths are green.

### Required Tools/Knowledge

* Bun 1.3.14 workflow through `bun run test`.
* Existing canonical seed contract in `scripts/lib/ai-runtime/canonical-topic-seeds.ts`.
* Existing topic identity resolver in `scripts/lib/ai-runtime/topic-identity.ts`.
* Existing Trend Finder normalization and topic projection flow in `scripts/extensions/trend-finder/normalize.ts` and `scripts/extensions/trend-finder/topics.ts`.
* Existing private diagnostics writer and browser-safe manifest pattern in `scripts/extensions/trend-finder/private-diagnostics.ts`.

### Environment Requirements

* Project dependencies installed with Bun.
* Local Trend Finder cache directory available to the collector.
* No new source, credential, media, database, hosted storage, dependency, or third-party transfer approval is required.

***

## 4. Scope

### In Scope (MVP)

* Trend Finder can generate a private seed-candidate artifact - Build a script-only generator over existing reviewed evidence and resolved topics.
* Candidate rows explain their origin - Include proposed alias or keyword, cited source evidence IDs, reason codes, collision checks, and `manual_review_required`.
* Weak identity cases become reviewable - Detect unknown canonical IDs, high-quality unclustered rows, and source-local outliers with weak topic identity.
* Collision checks are explicit - Compare proposed aliases and keywords against reviewed canonical seeds and reviewed keywords before writing rows.
* Private storage stays private - Write rows only to private extension storage and expose at most a browser-safe manifest entry.
* Canonical sources do not mutate - Prove no automatic edits to canonical seeds, keyword packs, source declarations, browser payloads, or static Brief output.
* Release checks prove behavior - Add unit, integration, privacy, payload, and ASCII validation coverage.
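The three weak-identity cases listed above could map to explicit reason codes roughly as follows. This is a minimal sketch under stated assumptions: the `EvidenceSignal` fields, the reason-code strings, and both thresholds are hypothetical, and it treats a single unknown canonical ID as a signal rather than tracking repetition across runs.

```typescript
type ReasonCode =
  | "unknown_canonical_id"
  | "high_quality_unclustered"
  | "source_local_outlier";

// Hypothetical per-evidence signals; field names are assumptions.
interface EvidenceSignal {
  evidenceId: string;
  qualityScore: number; // 0..1
  clusterSize: number; // rows sharing the resolved topic
  canonicalIdKnown: boolean; // false when the ID is not a reviewed seed
  sourceLocalOutlier: boolean;
  identityConfidence: number; // 0..1
}

// Illustrative thresholds, not values from the spec.
const HIGH_QUALITY = 0.8;
const WEAK_IDENTITY = 0.5;

// Derive zero or more reason codes; an empty list means "no signal".
function deriveReasonCodes(signal: EvidenceSignal): ReasonCode[] {
  const codes: ReasonCode[] = [];
  if (!signal.canonicalIdKnown) {
    codes.push("unknown_canonical_id");
  }
  if (signal.qualityScore >= HIGH_QUALITY && signal.clusterSize <= 1) {
    codes.push("high_quality_unclustered");
  }
  if (signal.sourceLocalOutlier && signal.identityConfidence < WEAK_IDENTITY) {
    codes.push("source_local_outlier");
  }
  return codes;
}
```

Evidence that yields an empty list would simply produce no candidate row, which matches the disabled/no-signal case.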

### Out of Scope (Deferred)

* Widening collection or adding X/reply data - Reason: this session adopts the review workflow only, not the source dependency.
* Editing keyword packs or canonical seeds automatically - Reason: every row is manual-review-only.
* Publishing seed candidates in browser-safe Trend Finder surfaces - Reason: the artifact is private operator review material.
* Source setup, industry events, security lens, Brief archival, One-to-Watch, pre-run estimates, or podcast work - Reason: later Phase 29 sessions own those areas.
* Approval, merge, or import UI for candidates - Reason: this session only produces private review artifacts.

***

## 5. Technical Approach

### Architecture

Create a focused `seed-candidates` helper under `scripts/extensions/trend-finder/`. The helper should own candidate input contracts, reason-code derivation, proposed alias/keyword normalization, collision checks against reviewed canonical seeds and reviewed keywords, deterministic ordering, row bounds, and privacy-safe serialization.

The generator should consume already-normalized evidence plus topic identity output or projected topics. It should never call collectors, add source adapters, edit source config, or write canonical seed files. Candidate generation should run after topic identity and source-local evidence enrichment have produced enough context to identify weak identity rows, and before private diagnostics are finalized.

Private storage should reuse the existing private diagnostics pattern when possible. Add a new artifact entry for `seed-candidates` so the browser payload can show only a safe manifest count/status without row contents, private file names, private paths, raw diagnostics, or candidate details. If the private write fails, the run should continue with a safe warning and without leaking the attempted path.
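The manifest boundary described here can be pictured as a projection that copies only safe fields. The sketch below is an assumption-laden illustration: `PrivateWriteResult`, `SafeManifestEntry`, and the `seed_candidates_write_failed` warning code are hypothetical and are not taken from `private-diagnostics.ts`.

```typescript
type ArtifactStatus = "written" | "skipped" | "failed";

// Hypothetical outcome of the private write. It may carry a path or an
// error message, neither of which may leave the script side.
interface PrivateWriteResult {
  status: ArtifactStatus;
  recordCount: number;
  byteCount: number;
  privatePath?: string;
  errorMessage?: string;
}

// Browser-safe view: counts, status, and an optional warning code only.
interface SafeManifestEntry {
  id: "seed-candidates";
  status: ArtifactStatus;
  recordCount: number;
  byteCount: number;
  warningCode?: string;
}

// Build the entry field by field so private fields cannot be copied
// through by accident (no object spread of the private result).
function toSafeManifestEntry(result: PrivateWriteResult): SafeManifestEntry {
  const written = result.status === "written";
  const entry: SafeManifestEntry = {
    id: "seed-candidates",
    status: result.status,
    recordCount: written ? result.recordCount : 0,
    byteCount: written ? result.byteCount : 0,
  };
  if (result.status === "failed") {
    entry.warningCode = "seed_candidates_write_failed";
  }
  return entry;
}
```

Constructing the entry explicitly, instead of deleting fields from a copy, keeps any field added to the private result later out of the browser payload by default.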

Candidate rows should be bounded and deterministic. Each row should include a stable candidate ID, proposed alias or keyword, evidence IDs, source IDs, reason codes, collision result, confidence or priority band, and `manual_review_required: true`. Rows without evidence IDs, rows with unbounded text, and rows built from unsupported or unreviewed source material should be dropped. Rows that collide exactly with an existing canonical seed should be marked as collisions or suppressed according to the helper contract. No row is ever auto-applied.
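A row builder following these rules might look like the sketch below. Everything here is an assumption for illustration: the field names, the length and evidence limits, the priority-band cutoffs, and the use of an FNV-1a style hash of the normalized label as the stable candidate ID.

```typescript
interface SeedCandidateRow {
  candidateId: string;
  proposedLabel: string;
  evidenceIds: string[];
  sourceIds: string[];
  reasonCodes: string[];
  collision: "none" | "exact" | "near";
  priorityBand: "high" | "medium" | "low";
  manual_review_required: true;
}

type CandidateInput = Pick<
  SeedCandidateRow,
  "proposedLabel" | "evidenceIds" | "sourceIds" | "reasonCodes" | "collision"
>;

// Illustrative bounds, not values from the spec.
const MAX_LABEL_LENGTH = 80;
const MAX_EVIDENCE_IDS = 10;

// FNV-1a style 32-bit hash over UTF-16 code units, rendered as 8 hex chars.
function stableHash(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

function uniqueSorted(values: readonly string[]): string[] {
  return Array.from(new Set(values)).sort();
}

// Returns null when the row must be dropped (no evidence, empty or
// over-long label). Collision status is carried on the row, never applied.
function buildRow(input: CandidateInput): SeedCandidateRow | null {
  const label = input.proposedLabel.trim().toLowerCase().replace(/\s+/g, " ");
  const evidenceIds = uniqueSorted(input.evidenceIds).slice(0, MAX_EVIDENCE_IDS);
  if (evidenceIds.length === 0) return null;
  if (label.length === 0 || label.length > MAX_LABEL_LENGTH) return null;
  return {
    candidateId: `seed-candidate-${stableHash(label)}`,
    proposedLabel: label,
    evidenceIds,
    sourceIds: uniqueSorted(input.sourceIds),
    reasonCodes: uniqueSorted(input.reasonCodes),
    collision: input.collision,
    priorityBand: evidenceIds.length >= 3 ? "high" : evidenceIds.length === 2 ? "medium" : "low",
    manual_review_required: true,
  };
}
```

Typing `manual_review_required` as the literal `true` means a row that omits the flag, or sets it to `false`, fails to compile.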

### Design Patterns

* Script-only private state: Candidate rows stay under `scripts/extensions/trend-finder/` private artifact handling.
* Reviewed-source boundary: The helper reads existing reviewed evidence only and does not start collection.
* Additive manifest defaults: New private artifact IDs are schema-defaulted and backwards compatible.
* Deterministic derivation: Sort rows by reason priority, evidence support, proposed label, and candidate ID.
* Manual-review gate: Every generated row carries `manual_review_required: true`.
* Safe failure mode: Missing context, write failures, or no candidate rows produce safe warnings or skipped artifacts, not browser-visible details.
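The deterministic derivation pattern above names four sort keys. A comparator over those keys could be sketched as follows; the `SortableRow` shape and the numeric reason priorities are hypothetical, and plain string comparison is used instead of locale-aware collation so the order does not vary by environment.

```typescript
interface SortableRow {
  candidateId: string;
  proposedLabel: string;
  evidenceCount: number;
  reasonCodes: string[];
}

// Hypothetical priority table; a lower number sorts first.
const REASON_PRIORITY: Record<string, number> = {
  unknown_canonical_id: 0,
  high_quality_unclustered: 1,
  source_local_outlier: 2,
};
const UNKNOWN_REASON_PRIORITY = 999;

// Best (lowest) priority among a row's reason codes.
function bestPriority(row: SortableRow): number {
  let best = UNKNOWN_REASON_PRIORITY;
  for (const code of row.reasonCodes) {
    const priority = REASON_PRIORITY[code];
    if (priority !== undefined && priority < best) best = priority;
  }
  return best;
}

// Locale-independent string comparison.
function compareText(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Sort a copy: reason priority, then evidence support (descending),
// then proposed label, then candidate ID as the final tie-breaker.
function sortCandidates<T extends SortableRow>(rows: readonly T[]): T[] {
  return [...rows].sort(
    (a, b) =>
      bestPriority(a) - bestPriority(b) ||
      b.evidenceCount - a.evidenceCount ||
      compareText(a.proposedLabel, b.proposedLabel) ||
      compareText(a.candidateId, b.candidateId),
  );
}
```

Ending on the candidate ID gives a total order as long as IDs are unique, so the output does not depend on input order.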

### Technology Stack

* TypeScript for generator contracts, schema updates, and collector/private diagnostics integration.
* Zod schemas with `.default()` for backwards-compatible private diagnostics manifest parsing.
* Bun and Vitest for helper, identity, diagnostics, collector, and privacy tests.
* Existing payload-size and private-artifact scans for browser boundary validation.

***

## 6. Deliverables

### Files to Create

| File                                                                | Purpose                                                                                                 | Est. Lines |
| ------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- | ---------- |
| `scripts/extensions/trend-finder/seed-candidates.ts`                | Private seed-candidate derivation, collision checks, bounds, deterministic sorting, and row projection. | \~280      |
| `scripts/extensions/trend-finder/__tests__/seed-candidates.test.ts` | Unit tests for unknown IDs, unclustered rows, source-local outliers, collisions, ordering, and privacy. | \~300      |

### Files to Modify

| File                                                                    | Changes                                                                                                             | Est. Lines |
| ----------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------- | ---------- |
| `scripts/extensions/trend-finder/collector.ts`                          | Run seed-candidate generation after reviewed evidence/topic identity context exists and before diagnostics handoff. | \~80       |
| `scripts/extensions/trend-finder/private-diagnostics.ts`                | Write the private `seed-candidates` artifact and include only safe manifest metadata in browser payloads.           | \~120      |
| `src/extensions/trend-finder/schema.ts`                                 | Add defaulted private diagnostics artifact ID/kind entries for seed candidates.                                     | \~30       |
| `scripts/lib/ai-runtime/topic-identity.ts`                              | Export or expose the minimal weak-identity signals needed by the generator without changing resolver behavior.      | \~40       |
| `scripts/lib/ai-runtime/canonical-topic-seeds.ts`                       | Export normalized lookup helpers for collision checks without adding generated seeds.                               | \~40       |
| `scripts/extensions/trend-finder/topics.ts`                             | Pass resolved identity/evidence context needed for candidate derivation without changing public topic output.       | \~50       |
| `scripts/extensions/trend-finder/normalize.ts`                          | Preserve reviewed evidence attributes needed by candidate scoring without adding new collection fields.             | \~40       |
| `scripts/extensions/trend-finder/sources/keyword-packs.ts`              | Expose read-only reviewed keyword lookup for collision checks without enabling writes.                              | \~30       |
| `scripts/extensions/trend-finder/__tests__/private-diagnostics.test.ts` | Cover private seed-candidate artifact writing, skipped states, manifest safety, and path omission.                  | \~90       |
| `scripts/extensions/trend-finder/__tests__/collector.test.ts`           | Cover collector integration, no collection widening, no canonical mutation, and safe write-failure handling.        | \~100      |
| `scripts/lib/ai-runtime/__tests__/topic-identity.test.ts`               | Cover any exported weak-identity helper behavior without changing existing identity resolution.                     | \~60       |
| `scripts/extensions/trend-finder/__tests__/normalize.test.ts`           | Cover preservation of reviewed evidence attributes used for candidate generation.                                   | \~50       |
| `scripts/extensions/trend-finder/measure-payload-size.ts`               | Keep payload budget checks aware that row contents remain private and only manifest metadata is browser-safe.       | \~20       |

***

## 7. Success Criteria

### Functional Requirements

* [ ] High-quality unclustered fixture evidence produces private seed-candidate rows with evidence IDs and collision checks.
* [ ] Repeated weak identity or new canonical ID fixtures produce reviewable candidates without mutating canonical seeds.
* [ ] Source-local outlier fixtures with weak topic identity produce bounded review rows.
* [ ] Rows that exactly collide with reviewed canonical aliases or keywords are marked as collisions or suppressed according to the helper contract.
* [ ] Every row includes `manual_review_required: true`.
* [ ] The collector does not start new collection, add new sources, edit keyword packs, or write canonical topic seed files.
* [ ] Browser payloads and static Brief output expose no seed-candidate row contents, proposed aliases, private paths, raw diagnostics, prompts, provider responses, tokens, or credential-shaped strings.

### Testing Requirements

* [ ] Unit tests written and passing for candidate derivation, collision checks, row bounds, deterministic ordering, no-signal behavior, and privacy-safe serialization.
* [ ] Private diagnostics tests prove the artifact is written privately and only safe manifest metadata reaches browser payloads.
* [ ] Collector tests prove no automatic collection or keyword/canonical mutation occurs.
* [ ] Identity and normalization tests prove existing behavior is unchanged while required context remains available.
* [ ] Payload-size and private-artifact checks remain green.

### Non-Functional Requirements

* [ ] Browser payload stays under the 1 MB extension budget.
* [ ] Candidate generation is deterministic and bounded by configured row/evidence limits.
* [ ] Private writes use safe path segments, atomic write behavior, and cleanup on write failure.
* [ ] No new source, media, credential flow, database, hosted storage, dependency, or third-party transfer is introduced.
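The private-write requirement above (safe path segments, atomic write, cleanup on failure) is commonly met by writing to a temp file and renaming it into place. The following is a minimal sketch using Node-compatible `node:fs` calls, which Bun also provides; `writePrivateArtifact`, the segment pattern, and the fixed `.tmp` suffix are assumptions, and a real writer would likely use a unique temp name to tolerate concurrent runs.

```typescript
import { mkdtempSync, readFileSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Illustrative allow-list: lowercase letters, digits, dot, dash, underscore;
// no separators and no ".." sequences.
const SAFE_SEGMENT = /^[a-z0-9][a-z0-9._-]*$/;

function isSafeSegment(segment: string): boolean {
  return SAFE_SEGMENT.test(segment) && segment.indexOf("..") === -1;
}

// Write to a temp file, then rename into place. On any failure, remove the
// temp file and return a status only; the attempted path is never returned.
function writePrivateArtifact(dir: string, fileName: string, body: string): "written" | "failed" {
  if (!isSafeSegment(fileName)) return "failed";
  const finalPath = join(dir, fileName);
  const tempPath = `${finalPath}.tmp`;
  try {
    writeFileSync(tempPath, body, "utf8");
    renameSync(tempPath, finalPath);
    return "written";
  } catch {
    rmSync(tempPath, { force: true }); // no-op if the temp file was never created
    return "failed";
  }
}

// Usage against a throwaway temp directory.
const demoDir = mkdtempSync(join(tmpdir(), "seed-candidates-"));
const demoStatus = writePrivateArtifact(demoDir, "seed-candidates.json", JSON.stringify({ rows: [] }));
const demoBody = readFileSync(join(demoDir, "seed-candidates.json"), "utf8");
rmSync(demoDir, { recursive: true, force: true });
```

Returning a bare status rather than the caught error keeps the path and OS error text out of anything that might later be serialized into a warning.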

### Quality Gates

* [ ] All files ASCII-encoded.
* [ ] Unix LF line endings.
* [ ] Code follows project conventions.
* [ ] New schema branches use explicit fallback/default behavior.

***

## 8. Implementation Notes

### Key Considerations

* The generator should not infer approval from repeated evidence. It only prepares private review rows.
* Candidate rows should cite existing evidence IDs; rows without evidence support are dropped.
* Collision checks should compare normalized proposed aliases and keywords against reviewed canonical seed labels, aliases, keyword seeds, and reviewed keyword packs.
* Do not add generated rows to `TREND_FINDER_REVIEWED_CANONICAL_TOPIC_SEEDS`, `TREND_FINDER_REVIEWED_CANONICAL_KEYWORDS`, or keyword category packs.
* The private diagnostics manifest can expose record count, byte count, status, and warning code, but not candidate row contents, private file names, or private paths.
* The helper should be useful even when no AI runtime is available; it is deterministic over existing normalized evidence and identity context.
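The collision check described in these notes reduces to normalizing both sides the same way and testing membership. The sketch below is a stand-in only: `normalizeLabel` is a simplified placeholder for whatever the canonical seed and keyword pack helpers already do, and the `ReviewedSeed` shape is hypothetical.

```typescript
// Placeholder normalization: lowercase, collapse non-alphanumerics to one
// space. The real check should reuse the existing seed/keyword helpers.
function normalizeLabel(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Hypothetical reviewed seed shape for illustration.
interface ReviewedSeed {
  id: string;
  label: string;
  aliases: string[];
  keywords: string[];
}

interface CollisionResult {
  normalized: string;
  collides: boolean;
  matchedSeedIds: string[];
  matchedReviewedKeyword: boolean;
}

// Read-only comparison against reviewed seeds and reviewed keywords.
function checkCollision(
  proposed: string,
  seeds: readonly ReviewedSeed[],
  reviewedKeywords: readonly string[],
): CollisionResult {
  const normalized = normalizeLabel(proposed);
  if (normalized.length === 0) {
    return { normalized, collides: false, matchedSeedIds: [], matchedReviewedKeyword: false };
  }
  const matchedSeedIds: string[] = [];
  for (const seed of seeds) {
    const terms = [seed.label, ...seed.aliases, ...seed.keywords].map((t) => normalizeLabel(t));
    if (terms.indexOf(normalized) !== -1) matchedSeedIds.push(seed.id);
  }
  const matchedReviewedKeyword =
    reviewedKeywords.map((k) => normalizeLabel(k)).indexOf(normalized) !== -1;
  return {
    normalized,
    collides: matchedSeedIds.length > 0 || matchedReviewedKeyword,
    matchedSeedIds: matchedSeedIds.sort(),
    matchedReviewedKeyword,
  };
}
```

The result records which reviewed seeds matched, so the private row can explain the collision to a reviewer instead of silently disappearing.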

### Potential Challenges

* Weak identity ambiguity: Keep reason codes explicit and prefer conservative candidate suppression over noisy private output.
* Collision false positives: Normalize strings consistently with canonical seed and keyword pack helpers, then record collision details privately.
* Collector integration timing: Run after enough identity/source-local context exists, but before browser slimming and private diagnostics manifest creation.
* Legacy manifest compatibility: Add schema defaults so older generated data parses with no seed-candidate artifact.
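The legacy-compatibility point can be illustrated without any dependency. The session's stack names Zod `.default()` for this; the plain TypeScript below is only a dependency-free stand-in that shows the intended behavior, and the manifest shape and field names are assumptions.

```typescript
interface ArtifactEntry {
  status: "written" | "skipped" | "failed";
  recordCount: number;
}

// Hypothetical slice of the private diagnostics manifest.
interface ManifestView {
  seedCandidates: ArtifactEntry;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Parse one artifact entry, falling back to a skipped/zero default for
// missing or malformed input.
function parseArtifactEntry(raw: unknown): ArtifactEntry {
  if (!isRecord(raw)) return { status: "skipped", recordCount: 0 };
  const rawStatus = raw["status"];
  const status: ArtifactEntry["status"] =
    rawStatus === "written" || rawStatus === "failed" ? rawStatus : "skipped";
  const rawCount = raw["recordCount"];
  const recordCount =
    typeof rawCount === "number" && isFinite(rawCount) && rawCount >= 0 ? Math.floor(rawCount) : 0;
  return { status, recordCount };
}

// Older manifests have no "seed-candidates" key; they parse to the default.
function parseManifest(raw: unknown): ManifestView {
  const source: Record<string, unknown> = isRecord(raw) ? raw : {};
  const artifactsRaw = source["artifacts"];
  const artifacts: Record<string, unknown> = isRecord(artifactsRaw) ? artifactsRaw : {};
  return { seedCandidates: parseArtifactEntry(artifacts["seed-candidates"]) };
}
```

With a schema library, the same effect would come from giving the new artifact entry a default, so manifests written before this session still parse.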

### Relevant Considerations

* \[P02] **Extension payloads and labels stay bounded**: The browser payload may include safe manifest metadata only, never candidate row contents.
* \[P24] **Browser-safe export and triage boundaries**: Static Brief, browser state, and public artifacts must not expose private seed-candidate rows.
* \[P27] **Do not add candidate sources by implication**: Seed candidates are review artifacts, not source approvals or collection expansion.
* \[P28] **Direct public source scope is narrow**: This session adds no source and must use only existing reviewed evidence.
* \[P28] **Deferred source candidates remain gated**: Candidate rows must not imply Semantic Scholar, Bluesky, Replicate, newsletters, X/Twitter, Digg, or similar sources are approved.
* \[P00] **Do not document planned features as implemented**: Any docs touched during implementation must describe only shipped behavior.

### Behavioral Quality Focus

Checklist active: Yes

Top behavioral risks for this session:

* Candidate generation accidentally mutates canonical seeds or keyword packs.
* Private seed-candidate rows leak into browser payloads, static Brief output, logs, or public docs.
* Weak identity heuristics create noisy or collision-prone rows without clear manual-review context.

***

## 9. Testing Strategy

### Unit Tests

* Test that unknown canonical ID and weak identity fixtures produce candidates with evidence IDs and `manual_review_required`.
* Test that high-quality unclustered evidence and source-local outlier fixtures produce bounded candidates.
* Test exact and near collision behavior against reviewed canonical seeds and reviewed keyword packs.
* Test deterministic ordering and max-row/max-evidence bounds.
* Test that no-signal and unsupported-source cases produce no rows or skipped artifacts.
* Test that privacy-safe serialization excludes private paths, prompts, provider responses, tokens, credential-shaped strings, and generated candidate text from browser payloads.
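For the privacy case in the list above, a test helper could walk the browser payload and report any private text it finds. The sketch below is illustrative: `findLeaks` is a hypothetical helper, and the credential-shaped pattern is a deliberately crude assumption (any run of 32 or more token characters), not the project's existing private-artifact scan.

```typescript
// Crude illustrative pattern: long opaque runs of token characters.
const CREDENTIAL_SHAPED = /[A-Za-z0-9_-]{32,}/;

// Collect every string value and object key in a JSON-like payload.
function collectStrings(value: unknown, out: string[] = []): string[] {
  if (typeof value === "string") {
    out.push(value);
  } else if (Array.isArray(value)) {
    for (const item of value) collectStrings(item, out);
  } else if (typeof value === "object" && value !== null) {
    for (const key of Object.keys(value)) {
      out.push(key);
      collectStrings((value as Record<string, unknown>)[key], out);
    }
  }
  return out;
}

// Return a description of each leak; an empty array means the payload
// contains none of the private texts and no credential-shaped string.
function findLeaks(payload: unknown, privateTexts: readonly string[]): string[] {
  const strings = collectStrings(payload);
  const lowered = strings.map((s) => s.toLowerCase());
  const leaks: string[] = [];
  for (const text of privateTexts) {
    const needle = text.trim().toLowerCase();
    if (needle.length > 0 && lowered.some((s) => s.indexOf(needle) !== -1)) {
      leaks.push(`private-text:${needle}`);
    }
  }
  if (strings.some((s) => CREDENTIAL_SHAPED.test(s))) {
    leaks.push("credential-shaped-string");
  }
  return leaks;
}
```

Walking values directly, rather than searching the output of `JSON.stringify`, avoids misses caused by JSON escaping of characters such as backslashes in paths.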

### Integration Tests

* Test collector integration with mocked reviewed evidence, resolved topic identities, cache directory, and private diagnostics dependencies.
* Test that the private diagnostics manifest updates correctly for written, skipped, and failed seed-candidate artifact states.
* Test that legacy private diagnostics manifests without seed-candidate entries still parse.
* Test that payload-size and private-artifact scans stay green after manifest-only browser exposure.

### Manual Testing

* Run the focused Vitest suites for seed candidates, private diagnostics, collector integration, topic identity, and normalization.
* Inspect generated private artifact test output to confirm rows are useful for manual review while browser-safe manifest output stays minimal.

### Edge Cases

* No reviewed evidence available.
* Proposed alias normalizes to an existing canonical label, alias, keyword seed, or reviewed keyword.
* Multiple evidence rows suggest the same alias or keyword.
* Evidence has missing source ID, missing topic ID, low relevance, unsupported source role, or weak quality.
* Private diagnostics write fails after the temp file is created.
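The "multiple evidence rows suggest the same alias" case can be handled by grouping on the normalized label before rows are built. This is a sketch under assumptions: the `EvidenceSuggestion` shape is hypothetical, and skipping suggestions with a missing evidence or source ID is one conservative choice rather than a documented rule.

```typescript
// Hypothetical per-evidence suggestion; IDs may be missing on weak rows.
interface EvidenceSuggestion {
  proposedLabel: string;
  evidenceId?: string;
  sourceId?: string;
}

interface MergedSuggestion {
  label: string;
  evidenceIds: string[];
  sourceIds: string[];
}

// Group suggestions by normalized label so one alias yields one candidate
// that cites all supporting evidence. Incomplete suggestions are skipped.
function mergeSuggestions(items: readonly EvidenceSuggestion[]): MergedSuggestion[] {
  const groups = new Map<string, { evidence: Set<string>; sources: Set<string> }>();
  for (const item of items) {
    const label = item.proposedLabel.trim().toLowerCase().replace(/\s+/g, " ");
    if (!label || !item.evidenceId || !item.sourceId) continue;
    let group = groups.get(label);
    if (!group) {
      group = { evidence: new Set<string>(), sources: new Set<string>() };
      groups.set(label, group);
    }
    group.evidence.add(item.evidenceId);
    group.sources.add(item.sourceId);
  }
  return Array.from(groups.entries())
    .map(([label, group]) => ({
      label,
      evidenceIds: Array.from(group.evidence).sort(),
      sourceIds: Array.from(group.sources).sort(),
    }))
    .sort((a, b) => (a.label < b.label ? -1 : a.label > b.label ? 1 : 0));
}
```

Sorting both the merged groups and the IDs inside each group keeps the output independent of the order in which evidence arrived.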

***

## 10. Dependencies

### External Libraries

* No new external libraries.

### Other Sessions

* **Depends on**: Phase 28 canonical seed, topic identity, normalization, and keyword-pack paths.
* **Depended by**: `phase29-session18-documentation-validation-and-release`

***

## Next Steps

Run the implement workflow step to begin AI-led implementation.


