# Session Specification

* **Session ID**: `phase24-session02-delta-aware-enrichment-spend-accounting`
* **Phase**: 24 - Trend Finder Outlier Signal Upgrade
* **Status**: Not Started
* **Created**: 2026-06-07

***

## 1. Session Overview

This session makes Trend Finder enrichment incremental and accountable. Phase 24 Session 01 established source-local evidence signals; this session builds on that stable evidence identity by adding a private enrichment cache, pruning rules, and browser-safe spend summaries for costly or slow source work.

The work adapts the Outlier Lab cheap-first and preserve-existing-enrichment pattern into Trend Finder-native contracts. The cache is private and keyed by source ID, stable source item ID, adapter version, enrichment type, and a sanitized metadata fingerprint. Browser output receives only bounded counts, states, estimates, and labels.
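As a rough illustration of the key described above, the following sketch derives a deterministic key from the five listed parts. The interface, function names, and the allow-list approach to sanitizing metadata are assumptions made for this example, not the shipped contract; a real implementation might also hash the fingerprint instead of storing it as a string.

```typescript
// Hypothetical field names; the spec only lists which parts make up the key.
interface EnrichmentKeyParts {
  sourceId: string;
  sourceItemId: string;
  adapterVersion: string;
  enrichmentType: string;
  metadataFingerprint: string;
}

type Metadata = Record<string, string | number | boolean | null>;

// Build a fingerprint from an allow-listed subset of metadata. Keys are
// sorted so that property order never changes the result, and anything
// outside the allow-list (tokens, raw payload fields) is ignored.
function fingerprintMetadata(
  metadata: Metadata,
  allowedKeys: readonly string[],
): string {
  const pairs = [...allowedKeys]
    .sort()
    .filter((key) => key in metadata)
    .map((key) => [key, metadata[key]] as const);
  return JSON.stringify(pairs);
}

// Serialize the parts in a fixed order. JSON array encoding keeps field
// boundaries unambiguous, so ("a", "b:c") and ("a:b", "c") cannot collide.
function deriveCacheKey(parts: EnrichmentKeyParts): string {
  return JSON.stringify([
    parts.sourceId,
    parts.sourceItemId,
    parts.adapterVersion,
    parts.enrichmentType,
    parts.metadataFingerprint,
  ]);
}
```

Because the adapter version and fingerprint are part of the key, a normalizer change or a metadata change naturally produces a different key and therefore a cache miss.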

Session 02 comes next because it depends on the evidence ID semantics from Session 01 and must land before later work on media assets, source setup, scheduler cadence controls, Signal Workbench triage, and static Brief export can safely reuse cache and spend state.

***

## 2. Objectives

1. Add a private enrichment cache contract that preserves safe enrichment summaries across unchanged refreshes.
2. Add cheap-first collector flow so unchanged evidence can skip repeated slow or paid enrichment while new, changed, or high-candidate evidence still runs.
3. Add retention and pruning for enrichment cache entries outside the active evidence window, with saved, skipped, retained, and pruned counts.
4. Add bounded estimated and actual spend summaries per source and per run, including a cadence-unavailable projection state until Session 05 exposes scheduler cadence in the UI.

***

## 3. Prerequisites

### Required Sessions

* [x] `phase24-session01-source-local-scoring-signals` - Provides stable evidence IDs, source-local signal semantics, and Engine Replay extension points required for cache keys and trace summaries.

### Required Tools/Knowledge

* Current Trend Finder collector, source adapter, Apify, Google Trends demand, Engine Replay, Sources view, and schema contracts.
* Phase 24 PRD finding coverage for findings 3, 8, and 10.
* Current private cache conventions under `.cache/extensions/trend-finder/`.
* Existing Apify charge caps, `usageTotalUsd` run metadata, collection budget controls, and Google Trends demand charge limits.
* Outlier Lab's cheap stats pass, enrich-only-new behavior, merge preservation, pruning, and cost framing, used as implementation references only.

### Environment Requirements

* Bun 1.3.14 project toolchain from `.bun-version`.
* No live Apify or Google Trends credentials are required for planned tests; fixture-backed adapter tests must prove spend and cache behavior.
* Generated private cache files, billing details, raw source payloads, Actor internals, Dataset rows, credentials, and logs remain ignored and out of browser payloads.

***

## 4. Scope

### In Scope (MVP)

* Trend Finder can write and read private enrichment cache entries keyed by stable source item identity, source ID, adapter version, enrichment type, and metadata fingerprint - use atomic writes and schema validation.
* Trend Finder can run cheap metadata collection first, then enrich only new, changed, or selected high-candidate evidence - preserve existing safe summaries for unchanged evidence.
* Trend Finder can prune cache entries outside the active evidence window - record retained, saved, skipped, missed, written, stale, and pruned counts.
* Trend Finder can summarize Apify and Google Trends demand spend - capture actual usage where available, estimate from configured caps otherwise, and label exact versus estimated values clearly.
* Trend Finder can project recurring spend from known cadence when available - show a safe "cadence unavailable" state until Session 05 exposes scheduler cadence.
* Trend Finder Engine Replay and Sources can show cache and spend state - display only browser-safe labels, bounded numbers, source IDs, and statuses.
* Trend Finder docs can describe enrichment cache, pruning, and spend accounting without claiming scheduler UI or media asset behavior has shipped.
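The spend labeling and cadence projection items above could look roughly like the sketch below. The `usageTotalUsd` and `maxTotalChargeUsd` names come from this spec; every type name, state string, the choice to use the configured cap as the estimate, and the 30-day month are assumptions made for illustration.

```typescript
type SpendLabel = "exact" | "estimated" | "unavailable";

interface SpendSummary {
  sourceId: string;
  label: SpendLabel;
  amountUsd: number | null;
}

type SpendProjection =
  | { state: "cadence-unavailable" }
  | { state: "spend-unavailable" }
  | { state: "projected"; label: SpendLabel; monthlyUsd: number };

const roundUsd = (value: number): number => Math.round(value * 100) / 100;

const isUsableAmount = (value: number | undefined): value is number =>
  typeof value === "number" && Number.isFinite(value) && value >= 0;

// Prefer actual usage; otherwise fall back to the configured cap and label
// the value as an estimate (assumption: the cap is treated as the estimate).
function summarizeSpend(
  sourceId: string,
  usageTotalUsd: number | undefined,
  maxTotalChargeUsd: number | undefined,
): SpendSummary {
  if (isUsableAmount(usageTotalUsd)) {
    return { sourceId, label: "exact", amountUsd: roundUsd(usageTotalUsd) };
  }
  if (isUsableAmount(maxTotalChargeUsd)) {
    return { sourceId, label: "estimated", amountUsd: roundUsd(maxTotalChargeUsd) };
  }
  return { sourceId, label: "unavailable", amountUsd: null };
}

// Project recurring spend only when a cadence is known. The 30-day month is
// an assumption of this sketch.
function projectMonthlySpend(
  perRun: SpendSummary,
  runsPerDay: number | undefined,
): SpendProjection {
  if (runsPerDay === undefined) return { state: "cadence-unavailable" };
  if (perRun.amountUsd === null) return { state: "spend-unavailable" };
  return {
    state: "projected",
    label: perRun.label,
    monthlyUsd: roundUsd(perRun.amountUsd * runsPerDay * 30),
  };
}
```

Carrying the label through to the projection keeps an estimate from being presented as an exact charge once it is multiplied by cadence.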

### Out of Scope (Deferred)

* Displaying raw billing payloads, account identifiers, tokens, Actor internals, raw Dataset rows, or raw enrichment payloads - *Reason: browser and docs output must remain metric-only and sanitized.*
* Scheduler cadence UI changes - *Reason: Session 05 owns first-run, scheduler, cadence, and live progress controls.*
* Browser-safe media evidence assets or file-serving surfaces - *Reason: Session 03 owns evidence assets, pruning for media stores, and bridge hardening.*
* Adding unreviewed enrichment sources or new public source adapters - *Reason: source expansion requires per-source compliance review.*
* Replacing existing retry, timeout, collection budget, or single-flight behavior - *Reason: Phase 24 PRD marks those capabilities as already handled.*

***

## 5. Technical Approach

### Architecture

Add script-side helpers for enrichment cache and spend accounting. The cache helper should resolve paths below the extension cache directory, validate entry shape with a narrow schema, derive deterministic keys from stable evidence fields, write atomically, and prune by the active evidence keep set. It should store only safe summaries and fingerprints, never raw source items or paid payloads.
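A minimal sketch of the path containment and atomic write behavior described above, using Node-compatible `node:fs` and `node:path` calls that Bun also provides. The function names are hypothetical, and the write-to-temp-then-rename approach is one common way to avoid partial files, not necessarily the one the implementation will use.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, isAbsolute, join, relative, resolve, sep } from "node:path";

// Resolve a relative cache path and refuse anything that escapes the root.
function resolveInsideCache(cacheRoot: string, relativePath: string): string {
  const root = resolve(cacheRoot);
  const target = resolve(root, relativePath);
  const rel = relative(root, target);
  const escapes = rel === ".." || rel.startsWith(`..${sep}`) || isAbsolute(rel);
  if (rel === "" || escapes) {
    throw new Error("cache path must resolve to a file inside the cache root");
  }
  return target;
}

// Write to a temporary sibling file, then rename it over the target. A
// rename within one filesystem replaces the file in a single step, so
// readers see either the old entry or the new one, never a partial write.
function writeCacheFileAtomic(
  cacheRoot: string,
  relativePath: string,
  contents: string,
): string {
  const target = resolveInsideCache(cacheRoot, relativePath);
  mkdirSync(dirname(target), { recursive: true });
  const tempPath = `${target}.${process.pid}.tmp`;
  writeFileSync(tempPath, contents, "utf8");
  renameSync(tempPath, target);
  return target;
}

// Usage against a throwaway directory (stand-in for the real cache root).
const exampleRoot = mkdtempSync(join(tmpdir(), "enrichment-cache-"));
const writtenPath = writeCacheFileAtomic(
  exampleRoot,
  join("example-source", "item-1.json"),
  JSON.stringify({ summary: "safe summary only" }),
);
const roundTrip = readFileSync(writtenPath, "utf8");
```

Writing the same entry twice leaves the same final file, which keeps the write idempotent in the sense the design patterns call for.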

The collector should collect cheap browser-safe evidence first, prepare enrichment candidates from stable evidence IDs and source-local metadata, merge cache hits into the run, and write summaries only for cache misses or changed items. Apify and Google Trends demand should contribute source-level spend summaries that the collector aggregates into the Trend Finder payload and Engine Trace.
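The cheap-first flow in the paragraph above can be sketched as a planning step that splits candidates into reused, to-enrich, and ineligible groups. All type and field names here are hypothetical. Whether ineligible evidence is still enriched without caching is a policy choice this sketch leaves open.

```typescript
// All names here are hypothetical; the spec does not define these shapes.
interface EnrichmentCandidate {
  evidenceId: string;
  cacheKey: string | null; // null when the evidence has no stable ID
  highCandidate: boolean; // selected for enrichment even on a cache hit
}

interface CachedSummary {
  summary: string;
}

interface EnrichmentPlan {
  reused: Map<string, CachedSummary>; // evidenceId -> preserved summary
  toEnrich: string[]; // evidence IDs that need a slow or paid pass
  ineligible: string[]; // evidence IDs that cannot be cached safely
  counts: { hits: number; misses: number; skipped: number };
}

function planEnrichment(
  candidates: readonly EnrichmentCandidate[],
  cache: ReadonlyMap<string, CachedSummary>,
): EnrichmentPlan {
  const plan: EnrichmentPlan = {
    reused: new Map(),
    toEnrich: [],
    ineligible: [],
    counts: { hits: 0, misses: 0, skipped: 0 },
  };
  for (const candidate of candidates) {
    if (candidate.cacheKey === null) {
      // No stable ID: never cache under a weak key.
      plan.ineligible.push(candidate.evidenceId);
      continue;
    }
    const cached = cache.get(candidate.cacheKey);
    if (cached === undefined) {
      // New or changed evidence: the key (and fingerprint) did not match.
      plan.counts.misses += 1;
      plan.toEnrich.push(candidate.evidenceId);
    } else if (candidate.highCandidate) {
      // Cache hit, but selected for a fresh enrichment pass anyway.
      plan.counts.hits += 1;
      plan.toEnrich.push(candidate.evidenceId);
    } else {
      // Unchanged evidence: preserve the existing summary, skip the work.
      plan.counts.hits += 1;
      plan.counts.skipped += 1;
      plan.reused.set(candidate.evidenceId, cached);
    }
  }
  return plan;
}
```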

Browser contracts should be additive. `parseTrendFinderData()` and `parseEngineTrace()` must continue accepting legacy payloads while new cache/spend summaries default to unavailable or zero-count states. Engine Replay and Sources should present the data as operational state, not as raw billing or raw cache detail.
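To illustrate the additive behavior, here is a dependency-free sketch of a parser that accepts a legacy payload with no cache summary and falls back to an unavailable, zero-count state. The project uses Zod for the real parsers, where schema-level defaults would typically play this role; the field names, state strings, and bound below are assumptions.

```typescript
type CacheSummaryState = "available" | "unavailable";

// Hypothetical browser-safe shape: a state label plus bounded counts.
interface CacheSummary {
  state: CacheSummaryState;
  hits: number;
  misses: number;
  pruned: number;
}

const DEFAULT_CACHE_SUMMARY: CacheSummary = {
  state: "unavailable",
  hits: 0,
  misses: 0,
  pruned: 0,
};

const MAX_SUMMARY_COUNT = 1_000_000; // assumed upper bound for any count

// Accept only non-negative integers and clamp them; anything else is zero.
function boundedCount(value: unknown): number {
  if (typeof value !== "number" || !Number.isInteger(value) || value < 0) {
    return 0;
  }
  return Math.min(value, MAX_SUMMARY_COUNT);
}

// Degrade to the default instead of throwing when the field is missing or
// malformed, so payloads written before this session still parse.
function parseCacheSummary(raw: unknown): CacheSummary {
  if (typeof raw !== "object" || raw === null) {
    return { ...DEFAULT_CACHE_SUMMARY };
  }
  const record = raw as Record<string, unknown>;
  return {
    state: record["state"] === "available" ? "available" : "unavailable",
    hits: boundedCount(record["hits"]),
    misses: boundedCount(record["misses"]),
    pruned: boundedCount(record["pruned"]),
  };
}
```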

### Design Patterns

* Additive schema defaults: Preserve legacy Trend Finder payloads and fixture data.
* Script-only runtime boundary: Keep cache reads, writes, pruning, and spend normalization outside browser UI.
* Typed degradation over throws: Represent missing usage, unavailable cadence, stale cache, and skipped enrichment as explicit safe states.
* Atomic private writes: Use idempotent cache writes and path containment under the extension cache directory.
* Trace summaries over raw details: Emit counts, source IDs, exact/estimated labels, and safe states only.
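
The "trace summaries over raw details" pattern can be sketched as an allow-list projection: the browser-facing object is built field by field, so private fields cannot pass through by accident. The event shape, names, and count bound are hypothetical.

```typescript
// Hypothetical script-side event. The last two fields are private and must
// never reach the browser.
interface PrivateCacheEvent {
  sourceId: string;
  hits: number;
  misses: number;
  pruned: number;
  cachePath: string;
  rawPayload: unknown;
}

// Hypothetical browser-safe trace summary: source ID and bounded counts only.
interface SafeCacheTrace {
  sourceId: string;
  hits: number;
  misses: number;
  pruned: number;
}

const MAX_TRACE_COUNT = 100_000; // assumed bound

function clampCount(value: number): number {
  if (!Number.isFinite(value) || value < 0) return 0;
  return Math.min(Math.trunc(value), MAX_TRACE_COUNT);
}

// Construct the safe object explicitly rather than spreading the event, so
// a field added to the private event later is excluded by default.
function toSafeCacheTrace(event: PrivateCacheEvent): SafeCacheTrace {
  return {
    sourceId: event.sourceId,
    hits: clampCount(event.hits),
    misses: clampCount(event.misses),
    pruned: clampCount(event.pruned),
  };
}
```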

### Technology Stack

* TypeScript with Bun script runtime.
* Vitest for cache, adapter, collector, schema, and view-model tests.
* Zod for browser-safe Trend Finder and Engine Trace payload parsing.
* React 19 and existing Tailwind/Radix component conventions for Sources and Engine Replay presentation.

***

## 6. Deliverables

### Files to Create

| File                                                                 | Purpose                                                                                             | Est. Lines |
| -------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- | ---------- |
| `scripts/extensions/trend-finder/enrichment-cache.ts`                | Private enrichment cache keys, entry schema, read/write/merge helpers, and pruning summaries.       | \~260      |
| `scripts/extensions/trend-finder/spend-accounting.ts`                | Bounded per-source/per-run spend summaries, exact/estimated labels, and cadence projection helpers. | \~180      |
| `scripts/extensions/trend-finder/__tests__/enrichment-cache.test.ts` | Focused cache key, atomic write, merge, stale entry, and pruning coverage.                          | \~220      |
| `scripts/extensions/trend-finder/__tests__/spend-accounting.test.ts` | Spend helper coverage for actual usage, estimates, caps, unavailable cadence, and redaction.        | \~180      |

### Files to Modify

| File                                                              | Changes                                                                                                     | Est. Lines |
| ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------- | ---------- |
| `scripts/extensions/trend-finder/sources/types.ts`                | Add enrichment cache and spend result contracts for source adapters.                                        | \~90       |
| `scripts/lib/apify/types.ts`                                      | Carry safe usage/cap fields through Apify run results.                                                      | \~60       |
| `scripts/lib/apify/actors.ts`                                     | Map `usageTotalUsd` and max charge caps into safe run summaries.                                            | \~60       |
| `scripts/extensions/trend-finder/sources/apify-adapter.ts`        | Emit per-source spend summaries and cache-aware source metadata.                                            | \~140      |
| `scripts/extensions/trend-finder/sources/google-trends-demand.ts` | Emit Google Trends demand spend estimates and actual usage when available.                                  | \~100      |
| `scripts/extensions/trend-finder/collector.ts`                    | Add cheap-first cache merge, write, prune, and aggregate summary flow.                                      | \~220      |
| `scripts/extensions/trend-finder/engine-trace.ts`                 | Sanitize cache and spend trace events into bounded Engine Trace summaries.                                  | \~120      |
| `src/extensions/trend-finder/schema.ts`                           | Add Trend Finder cache and spend summary schemas/defaults.                                                  | \~130      |
| `src/extensions/trend-finder/engine-trace.ts`                     | Add browser-safe Engine Trace cache and spend fields.                                                       | \~100      |
| `src/extensions/trend-finder/engine-replay-model.ts`              | Project cache and spend summaries into replay metrics and notes.                                            | \~120      |
| `src/extensions/trend-finder/views/sources-view.tsx`              | Display source spend and enrichment cache state in dense source health UI.                                  | \~120      |
| `src/extensions/trend-finder/fixtures.ts`                         | Add safe fixture rows for cache hits, skipped enrichment, pruned entries, and spend estimates.              | \~80       |
| `src/data/live-data.example.json`                                 | Add committed fallback shape for additive cache and spend fields.                                           | \~60       |
| `docs/extensions/trend-finder-pipeline.md`                        | Document enrichment cache, retention, pruning, and spend accounting.                                        | \~120      |
| `docs/extensions/trend-finder-sources.md`                         | Document per-source spend labels and source-level cache summaries.                                          | \~90       |
| `docs/extensions/trend-finder-runtime-and-provenance.md`          | Document Engine Replay labels for skipped, saved, pruned, exact, estimated, and cadence-unavailable states. | \~100      |

***

## 7. Success Criteria

### Functional Requirements

* [ ] Re-running the collector with unchanged source item IDs skips eligible enrichment and reports saved or skipped work.
* [ ] Changed metadata fingerprints cause enrichment cache misses instead of reusing stale summaries.
* [ ] Cache pruning removes entries outside the active evidence keep set without deleting current evidence summaries.
* [ ] Apify and Google Trends demand spend summaries show exact usage when available and estimates from caps when exact charges are unavailable.
* [ ] Engine Replay and Sources show cache/spend state without raw billing payloads, account IDs, tokens, raw Dataset rows, raw enrichment payloads, Actor internals, or private cache paths.

### Testing Requirements

* [ ] Unit tests cover cache keys, schema validation, hit/miss/skip/write counts, stale fingerprints, path containment, atomic writes, and pruning.
* [ ] Adapter tests cover Apify actual usage, max-charge estimates, Google Trends demand charge limits, redaction, and missing-token fallbacks.
* [ ] Schema tests prove legacy Trend Finder and Engine Trace payloads still parse with default unavailable/zero-count summaries.
* [ ] View-model and UI tests cover exact, estimated, cadence-unavailable, empty, degraded, and offline states.

### Non-Functional Requirements

* [ ] Browser-visible cache and spend fields stay bounded, typed, additive, and deterministic.
* [ ] Private cache files stay under ignored `.cache/extensions/trend-finder/` conventions.
* [ ] Payload growth stays within the shared Trend Finder extension payload cap.
* [ ] Documentation distinguishes implemented cache/spend behavior from deferred scheduler cadence UI and media asset behavior.

### Quality Gates

* [ ] All files ASCII-encoded.
* [ ] Unix LF line endings.
* [ ] Code follows project conventions.
* [ ] No new dependencies unless required and documented.

***

## 8. Implementation Notes

### Key Considerations

* Treat stable evidence IDs as required for cache reuse. Evidence without a stable ID should be marked ineligible rather than cached under a weak key.
* Keep cache entries private. Browser payloads should receive only aggregate counts, source IDs, exact/estimated labels, and safe states.
* Do not expose Apify Actor IDs, run IDs, Dataset IDs, raw billing payloads, or account identifiers in browser summaries.
* Cadence projection should remain explicit about unavailable cadence until Session 05 surfaces scheduler cadence state.

### Potential Challenges

* Cache poisoning from unstable IDs: Require source ID, stable item ID, adapter version, enrichment type, and metadata fingerprint in each key.
* Stale summaries after source normalizer changes: Include adapter version or schema version in cache entries and treat mismatches as misses.
* Spend precision differences between sources: Label actual, estimated, capped, and unavailable states separately instead of implying false precision.
* Cache pruning deleting active evidence: Build the keep set from accepted evidence IDs after source collection and test retained current entries.
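
The keep-set approach in the last item can be sketched as follows. This version prunes an in-memory `Map` for brevity, whereas the real cache lives in private files; the function and field names are hypothetical.

```typescript
interface PruneResult {
  retained: number;
  pruned: number;
  prunedKeys: string[]; // script-side only; browser output gets counts
}

// Remove every entry whose key is not in the keep set. The keep set is
// expected to be built from accepted evidence after source collection, so
// current entries are never deleted.
function pruneCache<T>(
  cache: Map<string, T>,
  keepKeys: ReadonlySet<string>,
): PruneResult {
  const prunedKeys: string[] = [];
  // Copy the keys first so deletion does not interfere with iteration.
  for (const key of Array.from(cache.keys())) {
    if (!keepKeys.has(key)) {
      cache.delete(key);
      prunedKeys.push(key);
    }
  }
  return { retained: cache.size, pruned: prunedKeys.length, prunedKeys };
}
```

A test for the "retained current entries" requirement would assert that every key in the keep set is still present after pruning and that the retained and pruned counts add up to the original cache size.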

### Relevant Considerations

* \[P02] **New source adapters need per-source compliance review**: This session adds no new source adapters and does not widen source collection.
* \[P06] **Apify actor outputs remain operator-dependent**: Source states must remain degraded or blocked with explicit warnings when configured Actors are missing, placeholder, or unavailable.
* \[P15] **Aggregate collection must stay budgeted**: Spend visibility must complement existing budget caps without removing collection budget guards.
* \[P02] **Extension payloads and demo labels stay bounded**: Cache/spend data must use compact summaries and preserve payload limits.
* \[P05] **Script-only runtime boundary**: Cache, billing, and enrichment logic remain script-side and private.
* \[P00] **Do not document planned features as implemented**: Docs should not claim scheduler cadence UI, media assets, or static export behavior is implemented by this session.

### Behavioral Quality Focus

Checklist active: Yes

Top behavioral risks for this session:

* Stale cache entries being reused for changed evidence.
* Browser output leaking raw billing, Actor, Dataset, cache path, or raw enrichment details.
* Spend estimates being presented as exact charges.
* Cache writes or pruning leaving partial files after failures.

***

## 9. Testing Strategy

### Unit Tests

* Test enrichment cache key derivation, schema validation, safe path containment, atomic writes, cache hits, cache misses, stale fingerprints, merge behavior, and pruning counts.
* Test spend accounting helper behavior for actual usage, estimated caps, unavailable exact usage, zero-cost public sources, cadence-unavailable state, and monthly projection when cadence is supplied.

### Integration Tests

* Extend Apify adapter tests for `usageTotalUsd`, `maxTotalChargeUsd`, missing token fallbacks, exhausted budgets, and redaction.
* Extend Google Trends demand tests for configured cap estimates and actual usage when the Actor run returns charge metadata.
* Extend collector tests to prove unchanged evidence can reuse cached summaries and changed fingerprints produce misses.

### Manual Testing

* Run focused Trend Finder tests and inspect generated fixture output for cache/spend labels.
* Verify no browser payload contains raw billing payloads, Actor internals, Dataset IDs, private cache paths, tokens, or raw enrichment data.

### Edge Cases

* Empty source collection with no evidence.
* Evidence with missing or unstable IDs.
* Cache directory missing, unreadable, or containing malformed entries.
* Pruning with all current evidence retained and stale entries removed.
* Actual spend unavailable while caps are configured.
* Cadence unavailable until scheduler UI work lands.

***

## 10. Dependencies

### External Libraries

* None expected.

### Other Sessions

* **Depends on**: `phase24-session01-source-local-scoring-signals`
* **Depended by**: `phase24-session03-browser-safe-evidence-assets-file-hardening`, `phase24-session05-scheduler-first-run-live-progress-controls`, `phase24-session06-signal-workbench-local-triage`, `phase24-session07-static-brief-export`, `phase24-session08-cross-surface-documentation-reference-mode`, `phase24-session09-end-to-end-validation-release-hardening`

***

## Next Steps

Run the implement workflow step to begin AI-led implementation.

