# ADR 0002: Trend Finder Embedding Fallback Clustering

* **Status**: Accepted, amended
* **Date**: 2026-06-13
* **Amended**: 2026-06-14

## Context

Trend Finder now has an additive theme layer above topics. The Alpha Radar source mapping also suggested embedding-backed fallback clustering, but that would add a model and runtime dependency and could increase local install size, browser payload size, and operational complexity.

Fallback clustering must continue to work without live AI credentials, without new public source calls, and without exposing private cache paths or raw source payloads to the browser.

## Options Considered

### Option 1: Local Token Similarity

Use normalized topic names, aliases, summaries, creator copy, and Creator Lens focus terms to derive overlap-based theme labels.

* Pros: deterministic, dependency-free, fast, easy to test, and safe for local fallback runs.
* Cons: weaker semantic recall than embeddings for near-synonyms.

### Option 2: Local Embedding Model

Bundle or download a local embedding model and cluster topic vectors.

* Pros: better semantic grouping when topic wording differs.
* Cons: new runtime dependency, model-size risk, cold-start cost, local platform variance, and a larger security/compliance surface.

### Option 3: No Fallback Grouping

Only accept analyst-provided labels and leave fallback topics flat.

* Pros: no new implementation surface.
* Cons: fallback output remains hard to scan and does not satisfy the session goal.

## Decision

Ship Option 1 now: dependency-free local token similarity with Creator Lens focus terms as the preferred fallback signal.

Do not ship a local embedding dependency in this phase. Revisit embeddings only if a future phase explicitly budgets model size, install behavior, offline runtime support, cache invalidation, and compliance review.

### 2026-06-14 Amendment: Feature-Hash Embeddings

Phase 28 Session 11 resolves the investigate-only embedding question without accepting a model dependency. Trend Finder now uses dependency-free feature-hash text vectors for deterministic local ranking and fallback clustering:

* Shared text features normalize bounded strings, remove stopwords, expand a reviewed synonym map, and emit weighted tokens, stems, bigrams, trigrams, and character 4-grams.
* Collector scripts hash features with Node SHA-256 into signed 384-dimension vectors, L2-normalize them, and use cosine similarity for deterministic grouping.
* Browser UI code computes transient vectors with Web Crypto over already-loaded payload text and falls back to lexical ranking when vectors are unavailable, stale, or timed out.
* Generated browser payloads do not store vector arrays, raw source payloads, private paths, prompts, provider responses, or credentials.
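The collector-side steps above can be sketched roughly as follows. This is a reduced illustration, not the shipped implementation: it emits only tokens, bigrams, and character 4-grams (stems, trigrams, and synonym expansion are omitted), and the stopword list, feature weights, input bound, and the way digest bytes map to a bucket and sign are all assumptions.

```typescript
import { createHash } from "node:crypto";

const DIMENSIONS = 384;
// Tiny illustrative stopword list; the real reviewed list is not shown in this ADR.
const STOPWORDS = new Set(["the", "a", "an", "and", "of", "for", "to", "in"]);

// Emit weighted features from bounded, normalized text (weights are assumed).
function extractFeatures(text: string): Map<string, number> {
  const features = new Map<string, number>();
  const add = (feature: string, weight: number): void => {
    features.set(feature, (features.get(feature) ?? 0) + weight);
  };
  const tokens = text
    .slice(0, 2000) // bound the input length
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0 && !STOPWORDS.has(token));
  tokens.forEach((token, i) => {
    add(`tok:${token}`, 1.0);
    if (i > 0) add(`bi:${tokens[i - 1]} ${token}`, 0.75);
    for (let c = 0; c + 4 <= token.length; c += 1) {
      add(`c4:${token.slice(c, c + 4)}`, 0.25);
    }
  });
  return features;
}

// Assumption: digest bytes 0-3 choose the bucket, the low bit of byte 4 the sign.
function hashFeature(feature: string): { index: number; sign: number } {
  const digest = createHash("sha256").update(feature, "utf8").digest();
  const index = digest.readUInt32BE(0) % DIMENSIONS;
  const sign = (digest.readUInt8(4) & 1) === 0 ? 1 : -1;
  return { index, sign };
}

// Accumulate signed weights into a fixed-size vector, then L2-normalize it.
function hashVector(features: Map<string, number>): number[] {
  const vector = new Array<number>(DIMENSIONS).fill(0);
  features.forEach((weight, feature) => {
    const { index, sign } = hashFeature(feature);
    vector[index] = (vector[index] ?? 0) + sign * weight;
  });
  const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? vector : vector.map((v) => v / norm);
}

// For unit-length vectors, cosine similarity reduces to the dot product.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  for (let i = 0; i < a.length; i += 1) dot += (a[i] ?? 0) * (b[i] ?? 0);
  return dot;
}
```

Signed hashing lets colliding features partly cancel instead of always adding up, which is a common reason to pair feature hashing with a sign bit. Nothing here needs a model file or network call, so repeated runs over the same text yield identical vectors.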

This amendment keeps Option 2 rejected. The accepted path is feature hashing, not a local embedding model or vector database.
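The browser-side fallback described in the amendment might look something like the sketch below. The vectorizer is injected so the control flow stays visible; in a browser it would presumably wrap `crypto.subtle.digest`, but that wiring, the `RankItem` shape, the 250 ms default, and all function names are assumptions. The stale-vector case is not modeled here.

```typescript
// Hypothetical async vectorizer, e.g. one built on Web Crypto in the browser.
type Vectorizer = (text: string) => Promise<number[]>;

interface RankItem {
  id: string;
  text: string;
}

// Lexical fallback score: how many query words occur in the item text.
function lexicalScore(query: string, text: string): number {
  const haystack = new Set(text.toLowerCase().split(/[^a-z0-9]+/));
  return query
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 0 && haystack.has(word)).length;
}

function dot(a: number[], b: number[]): number {
  let sum = 0;
  const length = Math.min(a.length, b.length);
  for (let i = 0; i < length; i += 1) sum += (a[i] ?? 0) * (b[i] ?? 0);
  return sum;
}

// Reject if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("vector timeout")), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

async function rankItems(
  query: string,
  items: RankItem[],
  vectorize: Vectorizer | null,
  timeoutMs = 250,
): Promise<{ mode: "vector" | "lexical"; ids: string[] }> {
  // Sort by descending score, breaking ties by id so output stays deterministic.
  const sortIds = (scores: number[]): string[] =>
    items
      .map((item, i) => ({ id: item.id, score: scores[i] ?? 0 }))
      .sort((x, y) => y.score - x.score || (x.id < y.id ? -1 : x.id > y.id ? 1 : 0))
      .map((entry) => entry.id);

  if (vectorize !== null) {
    try {
      const texts = [query, ...items.map((item) => item.text)];
      const all = Promise.all(texts.map((text) => vectorize(text)));
      const [queryVector, ...itemVectors] = await withTimeout(all, timeoutMs);
      if (queryVector !== undefined) {
        return { mode: "vector", ids: sortIds(itemVectors.map((v) => dot(queryVector, v))) };
      }
    } catch {
      // Vectors unavailable or timed out: fall through to lexical ranking.
    }
  }
  return { mode: "lexical", ids: sortIds(items.map((item) => lexicalScore(query, item.text))) };
}
```

Vectors live only inside the call and are never written back to the payload, in keeping with the rule that generated browser payloads do not store vector arrays.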

## Consequences

* Workbench grouping stays deterministic and browser-safe.
* Fallback runs remain useful without live AI credentials.
* No package dependency, source adapter, or source call changes are required.
* Feature-hash similarity can improve fallback grouping and UI ranking without adding an embedding model, package dependency, vector database, or source adapter.
* A future local embedding model still requires a new ADR that explicitly accepts model size, install behavior, offline runtime support, cache invalidation, and compliance review.
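To make the deterministic grouping claim concrete, here is a minimal sketch of one way cosine similarity could drive fallback grouping. The greedy single-pass strategy, the `VectorTopic` shape, and the threshold value are assumptions; the ADR does not state which clustering procedure Trend Finder actually uses.

```typescript
// Hypothetical input shape: an id plus a precomputed feature-hash vector.
interface VectorTopic {
  id: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const length = Math.min(a.length, b.length);
  for (let i = 0; i < length; i += 1) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Greedy single pass: a topic joins the first group whose seed vector is at
// least `threshold` similar; otherwise it seeds a new group. Sorting by id
// first makes the result independent of input order.
function groupTopics(topics: VectorTopic[], threshold: number): string[][] {
  const ordered = [...topics].sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  const groups: { seed: number[]; ids: string[] }[] = [];
  for (const topic of ordered) {
    const home = groups.find((group) => cosineSimilarity(group.seed, topic.vector) >= threshold);
    if (home) {
      home.ids.push(topic.id);
    } else {
      groups.push({ seed: topic.vector, ids: [topic.id] });
    }
  }
  return groups.map((group) => group.ids);
}
```

With no random initialization and a fixed sort order, the same topics always produce the same groups, which is the property the consequences above depend on.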

