# Testing

## Overview

The project uses [Vitest](https://vitest.dev/) as the test runner with [`@vitest/coverage-v8`](https://vitest.dev/guide/coverage) for coverage. Tests are colocated with source modules in `__tests__/` directories.

## Commands

| Command                 | Description                                         |
| ----------------------- | --------------------------------------------------- |
| `bun run test`          | Run all tests once                                  |
| `bun run test:watch`    | Run tests in watch mode (re-run on save)            |
| `bun run test:coverage` | Run tests with comprehensive coverage report + gate |
| `bun run test:e2e`      | Run Playwright browser regressions                  |
| `bun test`              | Run only the Bun-native runner guard                |

## Configuration

* **Config file**: `vitest.config.ts`
* **Environment**: Node by default; individual test files opt into `happy-dom` with `// @vitest-environment happy-dom`
* **Path aliases**: Resolved via Vite's native `resolve.tsconfigPaths` setting (same `@/*` alias as the app)
* **Test globs**: `src/**/__tests__/**/*.test.{ts,tsx}`, `scripts/lib/__tests__/**/*.test.{ts,tsx}`, `scripts/lib/**/__tests__/**/*.test.{ts,tsx}`, and `scripts/extensions/**/__tests__/**/*.test.{ts,tsx}`
* **Mocks**: `restoreMocks: true` -- mocks are auto-restored between tests
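
As a rough illustration, the settings listed above could be expressed in a `vitest.config.ts` along the following lines. This is a minimal sketch assembled from the bullets, not a copy of the project's actual file; it assumes a Vite version that supports the `resolve.tsconfigPaths` option and omits the coverage settings.

```typescript
import { defineConfig } from "vitest/config";

export default defineConfig({
  resolve: {
    // Assumption: the installed Vite version supports native tsconfig path
    // resolution, so the app's `@/*` alias works in tests without a plugin.
    tsconfigPaths: true,
  },
  test: {
    // Node is the default; individual files opt into happy-dom with a
    // `// @vitest-environment happy-dom` comment at the top of the file.
    environment: "node",
    include: [
      "src/**/__tests__/**/*.test.{ts,tsx}",
      "scripts/lib/__tests__/**/*.test.{ts,tsx}",
      "scripts/lib/**/__tests__/**/*.test.{ts,tsx}",
      "scripts/extensions/**/__tests__/**/*.test.{ts,tsx}",
    ],
    // Restore mocked implementations automatically between tests.
    restoreMocks: true,
  },
});
```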

Plain `bun test` uses Bun's native runner, not Vitest. `bunfig.toml` scopes that command to `tests/bun/`, which contains a guard confirming the real test suite remains routed through the package scripts.

Browser-level regression tests use Playwright:

* **Config file**: `playwright.config.ts`
* **Test directory**: `tests/e2e/`
* **Browser**: Chromium
* **Dev server**: Playwright starts Vite through `scripts/playwright-webserver.sh` on `http://127.0.0.1:5189` by default
* **Artifacts**: traces, screenshots, and videos are retained only on failure

Playwright accepts `PLAYWRIGHT_PORT`, `PLAYWRIGHT_BASE_URL`, `PLAYWRIGHT_REUSE_EXISTING_SERVER`, and `PLAYWRIGHT_WEB_SERVER_LOG` for local debugging.
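
To make the interaction between the port and base URL overrides concrete, here is a hedged sketch of how a config file might resolve them. The function name `resolveBaseUrl` and the precedence (an explicit `PLAYWRIGHT_BASE_URL` wins over `PLAYWRIGHT_PORT`) are assumptions for illustration; check `playwright.config.ts` for the real behavior.

```typescript
type Env = Record<string, string | undefined>;

// Default port taken from the documented dev-server address.
const DEFAULT_PORT = 5189;

// Hypothetical helper: the precedence shown here is an assumption.
function resolveBaseUrl(env: Env): string {
  const explicit = env.PLAYWRIGHT_BASE_URL?.trim();
  if (explicit) {
    return explicit;
  }
  const parsed = Number(env.PLAYWRIGHT_PORT);
  const port =
    Number.isInteger(parsed) && parsed > 0 && parsed <= 65535
      ? parsed
      : DEFAULT_PORT;
  return `http://127.0.0.1:${port}`;
}

console.log(resolveBaseUrl({})); // → http://127.0.0.1:5189
console.log(resolveBaseUrl({ PLAYWRIGHT_PORT: "6000" })); // → http://127.0.0.1:6000
```

Passing the environment in as an argument, rather than reading `process.env` directly, keeps a helper like this easy to unit test.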

Playwright browser binaries are stored under the active user home. If you run e2e certification with a brand-new isolated `HOME`, install Chromium in that home before the first e2e command:

```bash
bunx playwright install chromium
```

## AI Rogue Input-Mode Coverage

AI Rogue controls store a raw, browser-local preference of Auto, Keyboard, or Compact. Auto resolves at the mounted browser boundary: a coarse pointer with no hover resolves to Compact controls, while a desktop or fine pointer, or unavailable capability detection, resolves to Keyboard controls. Runtime APIs only ever receive a concrete `keyboard | compact` value.
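
The resolution rule can be sketched as a small pure function. The type and function names below (`InputPreference`, `PointerCapabilities`, `resolveInputMode`) are hypothetical and chosen for illustration; the real module may structure this differently.

```typescript
type InputPreference = "auto" | "keyboard" | "compact";
type InputMode = "keyboard" | "compact";

// Hypothetical capability snapshot taken once the component is mounted.
interface PointerCapabilities {
  coarsePointer: boolean;
  hover: boolean;
}

function resolveInputMode(
  preference: InputPreference,
  capabilities?: PointerCapabilities,
): InputMode {
  // Explicit Keyboard or Compact overrides pass through unchanged.
  if (preference !== "auto") {
    return preference;
  }
  // Unavailable capabilities fall back to Keyboard.
  if (!capabilities) {
    return "keyboard";
  }
  // Coarse pointer with no hover means a touch-first device.
  return capabilities.coarsePointer && !capabilities.hover
    ? "compact"
    : "keyboard";
}

console.log(resolveInputMode("auto", { coarsePointer: true, hover: false })); // → compact
console.log(resolveInputMode("auto")); // → keyboard
```

In a browser, the capability snapshot could plausibly be read with `window.matchMedia("(pointer: coarse)")` and `window.matchMedia("(hover: none)")`, which is why resolution has to wait until the component is mounted.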

Focused coverage for this contract:

```bash
bun run test -- src/extensions/ai-rogue
bun run test:e2e -- tests/e2e/ai-rogue-mobile.spec.ts tests/e2e/ai-rogue-runtime.spec.ts tests/e2e/pages-demo-mobile.spec.ts --project=chromium --project=pages-demo-chromium
```

The browser coverage asserts fresh mobile Auto Start plus movement, fresh desktop Auto keyboard-first movement, explicit Keyboard/Compact overrides, concrete runtime payloads, and public-demo AI Rogue gameplay with no `/__*` local bridge requests.

## AI Rogue Phase 39 Level-Authoring Closeout

Use this cluster when validating AI Rogue level-authoring sessions that touch the authored level registry, generated world behavior, save contracts, boss or finale routing, audio routing, content-owned assets, public-demo wording, or release documentation.

Focused registry, world, simulation, and asset suites:

```bash
bunx vitest run src/extensions/ai-rogue/runtime/content/__tests__/levels.test.ts src/extensions/ai-rogue/runtime/__tests__/content-baseline.test.ts src/extensions/ai-rogue/runtime/__tests__/world.test.ts src/extensions/ai-rogue/runtime/__tests__/simulation.test.ts src/extensions/ai-rogue/runtime/__tests__/golden-determinism.test.ts src/extensions/ai-rogue/runtime/__tests__/assets.test.ts
```

Focused save, boss, render, audio, and combat suites:

```bash
bunx vitest run src/extensions/ai-rogue/__tests__/save-schema.test.ts src/extensions/ai-rogue/__tests__/save-schema-parity.test.ts src/extensions/ai-rogue/runtime/__tests__/boss-presentation.test.ts src/extensions/ai-rogue/runtime/__tests__/render-model.test.ts src/extensions/ai-rogue/runtime/__tests__/renderer-audio-adapter.test.ts src/extensions/ai-rogue/runtime/__tests__/audio.test.ts src/extensions/ai-rogue/runtime/__tests__/combat.test.ts
```

Project quality gates:

```bash
bun run typecheck
bun run typecheck:scripts
bun run lint
bun run test
bun run build
bun run budget:check
bun run runtime:check-private
bash scripts/check-asset-sizes.sh
bun audit --audit-level high
```

Browser proof for AI Rogue desktop/mobile and public-demo mobile smoke:

```bash
bun run test:e2e -- tests/e2e/ai-rogue-runtime.spec.ts tests/e2e/ai-rogue-mobile.spec.ts --project=chromium
bun run test:e2e -- tests/e2e/pages-demo-mobile.spec.ts --project=pages-demo-chromium
```

Closeout review should confirm:

* `AI_ROGUE_SAVE_SCHEMA_VERSION` remains 1 unless a real persisted-shape migration landed with parser and parity tests.
* Saves persist bounded runtime state, IDs, depth/max depth, metadata, and safe labels; whole level specs stay in `runtime/content/`.
* No remote content loading, hosted writes, collectors, analytics, raw prompts, transcripts, command bodies, local paths, logs, credentials, private telemetry, worker migration, broad inventory rewrite, or map-editor dependency landed.
* Any reused media has an explicit rationale in the level spec, docs, or tests; any new media has provenance, asset-size evidence, and focused asset/audio tests.

## Phase 21-23 Non-Hermes Closeout

Use this cluster when validating the completed non-Hermes v2.3 work: pricing and daily activity accuracy, project windows, saved-time defaults, Claude OAuth usage, Antigravity detection and consumers, Dream source rows, Claude Code route coverage, skill-card sheen, and sticky sidebar behavior.

Focused Vitest command:

```bash
bun run test -- scripts/lib/__tests__/session-scanner.test.ts scripts/lib/__tests__/aggregate-orchestration.test.ts scripts/lib/__tests__/usage-assembly.test.ts scripts/lib/__tests__/claude-oauth-usage.test.ts scripts/lib/__tests__/ai-runtime-providers.test.ts scripts/lib/__tests__/app-detection.test.ts src/lib/__tests__/nested-validation.test.ts src/lib/__tests__/validate-live-data.test.ts src/lib/__tests__/antigravity-live-data.test.ts src/lib/__tests__/home-transforms.test.ts src/lib/__tests__/route-transforms.test.ts src/lib/__tests__/time-saved.test.ts src/lib/__tests__/transforms.test.ts src/components/__tests__/usage-panel.test.tsx src/components/__tests__/app-sidebar.test.tsx src/components/home/__tests__/subscription-strip.test.tsx src/components/home/__tests__/dream-sources-strip.test.tsx src/components/home/__tests__/skill-card.test.tsx src/routes/__tests__/setup-modal.test.tsx src/routes/__tests__/agents.test.tsx src/routes/__tests__/route-tree.test.ts src/routes/__tests__/skills.test.tsx src/components/hermes/__tests__/hermes-mission-control.test.tsx src/components/ui/__tests__/sidebar.test.tsx
```

Full quality gates:

```bash
bun run typecheck
bun run typecheck:scripts
bun run lint
bun run format:check
bun run test
bun run build
bun run budget:check
bun run runtime:check-private
```

Available browser smoke checks:

```bash
bun run test:e2e -- tests/e2e/home-dashboard.spec.ts tests/e2e/skills-page.spec.ts tests/e2e/hermes-agent.spec.ts tests/e2e/claude-code-agent.spec.ts tests/e2e/openclaw-agent.spec.ts
```

Claude Code route coverage now has a dedicated Playwright spec at `tests/e2e/claude-code-agent.spec.ts`, backed by focused route/unit coverage through `src/routes/__tests__/agents.test.tsx`, `src/components/hermes/__tests__/hermes-mission-control.test.tsx`, and `src/routes/__tests__/route-tree.test.ts`. The browser spec asserts Mission Control assembly through the shared Hermes bridge contract, focused goal and human-briefing drawer behavior, responsive wrapping, and the absence of any Claude execution, shell, or workspace calls.

Closeout docs checks:

```bash
rg -n "v2[_]3-port-remaining|ongoing-projects/v2[_]3" docs README.md
git diff --check
```

The old v2.3 ongoing-project backlog file is intentionally absent. Public docs should not depend on it; spec-system PRD and session artifacts preserve the source anchors and no-action decisions.

## Phase 06 Runtime Certification

Trend Finder runtime changes should run focused checks before the full suite. These checks are credential-free and rely on mocks, fixtures, or explicit missing-credential states.

| Area                                                         | Command                                                                                                                                                                                                                  |
| ------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Apify config/client/Actor/Dataset                            | `bun run test -- scripts/lib/apify/__tests__`                                                                                                                                                                            |
| AI runtime providers, analyst validation, scoring, snapshots | `bun run test -- scripts/lib/ai-runtime/__tests__`                                                                                                                                                                       |
| Trend Finder collector and source adapters                   | `bun run test -- scripts/extensions/trend-finder/__tests__ scripts/extensions/trend-finder/sources/__tests__`                                                                                                            |
| Dashboard schema and runtime views                           | `bun run test -- src/lib/__tests__/trend-finder-schema.test.ts src/lib/__tests__/trend-finder-dashboard.test.tsx src/lib/__tests__/trend-finder-engine-replay.test.tsx src/lib/__tests__/trend-finder-collector.test.ts` |
| Browser-visible Trend Finder flow                            | `bun run test:e2e -- tests/e2e/trend-finder.spec.ts tests/e2e/trend-finder-engine-replay.spec.ts`                                                                                                                        |
| Private artifact metadata                                    | `bun run runtime:check-private`                                                                                                                                                                                          |

The combined focused certification command is:

```bash
bun run test -- scripts/lib/apify/__tests__ scripts/lib/ai-runtime/__tests__ scripts/extensions/trend-finder/__tests__ scripts/extensions/trend-finder/sources/__tests__ src/lib/__tests__/trend-finder-schema.test.ts src/lib/__tests__/trend-finder-dashboard.test.tsx src/lib/__tests__/trend-finder-engine-replay.test.tsx src/lib/__tests__/trend-finder-collector.test.ts
```

After focused checks pass, run:

```bash
bun run typecheck
bun run typecheck:scripts
bun run test
bun run test:e2e -- tests/e2e/trend-finder.spec.ts tests/e2e/trend-finder-engine-replay.spec.ts
bun run lint
bun run format:check
bun run build
bun run budget:check
bun run runtime:check-private
```

Fallback certification should cover:

* Missing `APIFY_TOKEN` readiness with `secretValuesPrinted: false`.
* Missing and invalid OpenAI account auth status with redacted output.
* Disabled AI provider behavior.
* Invalid AI analyst output retry and deterministic fallback.
* One failed source preserving evidence from healthy sources.
* Dashboard runtime readiness, source warnings, score breakdowns, evidence links, Watchlist navigation, provenance labels, source health, and Brief navigation.
* Private runtime artifacts remaining ignored and untracked.
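
The "one failed source preserving evidence from healthy sources" case is the kind of behavior a fixture-driven test can pin down. The sketch below shows one way such a collector could isolate failures; `SourceAdapter`, `Evidence`, and `collectAll` are hypothetical names, not the project's actual API.

```typescript
interface Evidence {
  source: string;
  url: string;
}

// Hypothetical adapter shape: each source collects independently.
interface SourceAdapter {
  name: string;
  collect: () => Promise<Evidence[]>;
}

interface CollectionResult {
  evidence: Evidence[];
  warnings: string[];
}

async function collectAll(adapters: SourceAdapter[]): Promise<CollectionResult> {
  // Run every adapter and capture failures per source so that one
  // rejected collector cannot discard results from the others.
  const outcomes = await Promise.all(
    adapters.map(async (adapter) => {
      try {
        return { name: adapter.name, evidence: await adapter.collect() };
      } catch {
        return { name: adapter.name, evidence: null };
      }
    }),
  );

  const result: CollectionResult = { evidence: [], warnings: [] };
  for (const outcome of outcomes) {
    if (outcome.evidence === null) {
      // The warning names the source only; the raw error text is dropped
      // so that nothing sensitive leaks into the output.
      result.warnings.push(`${outcome.name}: collection failed`);
    } else {
      result.evidence.push(...outcome.evidence);
    }
  }
  return result;
}
```

A test for this would pass in one adapter that resolves with fixture evidence and one that rejects, then assert that the evidence survives and exactly one warning is recorded.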

## Phase 11 Scheduled Aggregate Certification

Scheduled aggregate changes should validate the scheduler boundary, private artifact metadata, and implemented-versus-planned documentation. These checks are credential-free unless the operator has optional provider or source keys in the local shell or `.env.local`.

| Area                            | Command                                                                                                                                                                                                                                                                                                         | Expected evidence                                                                                                            |
| ------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- |
| Scheduler registry and handlers | `bun run test -- scripts/lib/__tests__/scheduler-registry.test.ts scripts/lib/__tests__/scheduler-aggregate-handler.test.ts scripts/lib/__tests__/scheduler-agent-aggregate-handler.test.ts scripts/lib/__tests__/scheduler-trend-finder-handler.test.ts scripts/lib/__tests__/scheduler-dream-handler.test.ts` | Aggregate, agent aggregate, Trend Finder, and Dream jobs are available and delegate through their scoped handlers            |
| Runner, state, locks, and logs  | `bun run test -- scripts/lib/__tests__/scheduler-runner.test.ts scripts/lib/__tests__/scheduler-run-state.test.ts scripts/lib/__tests__/scheduler-run-log.test.ts scripts/lib/__tests__/scheduler-locks.test.ts`                                                                                                | Job IDs, timeout, blocked runs, sanitized state, private logs, lock contention, and stale-lock recovery are covered          |
| Operator status surface         | `bun run test -- scripts/lib/__tests__/scheduler-operator-status.test.ts scripts/lib/__tests__/scheduler-status-cli.test.ts`                                                                                                                                                                                    | Status output is safe, command-only, and exposes no raw private diagnostics                                                  |
| Script type safety              | `bun run typecheck:scripts`                                                                                                                                                                                                                                                                                     | Script TypeScript project passes                                                                                             |
| Full regression suite           | `bun run test`                                                                                                                                                                                                                                                                                                  | Repository Vitest suite passes, or unrelated pre-existing failures are recorded with evidence                                |
| CLI smoke                       | `bun run scheduler:status`; `bun run scheduler:agents:status`; `bun run scheduler:trend-finder:status`; `bun run scheduler:dream:status`                                                                                                                                                                        | Safe status prints first-run or latest-run labels, command hints, private diagnostic labels, and current Dream boundary copy |
| Private artifact metadata       | `bun run runtime:check-private`                                                                                                                                                                                                                                                                                 | Generated private data, auth, cache, logs, coverage, test results, and browser reports are ignored and untracked             |

The combined focused scheduler command is:

```bash
bun run test -- scripts/lib/__tests__/scheduler-registry.test.ts scripts/lib/__tests__/scheduler-aggregate-handler.test.ts scripts/lib/__tests__/scheduler-agent-aggregate-handler.test.ts scripts/lib/__tests__/scheduler-trend-finder-handler.test.ts scripts/lib/__tests__/scheduler-dream-handler.test.ts scripts/lib/__tests__/scheduler-runner.test.ts scripts/lib/__tests__/scheduler-run-state.test.ts scripts/lib/__tests__/scheduler-run-log.test.ts scripts/lib/__tests__/scheduler-locks.test.ts scripts/lib/__tests__/scheduler-operator-status.test.ts scripts/lib/__tests__/scheduler-status-cli.test.ts scripts/lib/__tests__/aggregate-live-data-write.test.ts scripts/lib/__tests__/aggregate-orchestration.test.ts scripts/lib/__tests__/dream-execution.test.ts scripts/lib/__tests__/extensions-runner.test.ts
```

Phase 11 documentation review should confirm:

* Scheduled aggregate enablement, cadence, manual scheduler runs, status, logs, generated data, degraded states, and deletion paths are documented.
* `bun run aggregate` remains the direct path and `bun run scheduler:run` remains the scheduler-wrapped path.
* Dream output paths, Dream provider boundaries, and material-first execution are documented as implemented scheduler behavior; production timer installation remains local machine setup.
* No browser gateway, live scheduler API, or mutable UI control is described as shipped.
* Validation evidence uses git metadata and safe command summaries instead of reading or pasting private artifact contents.

The latest scheduler split live validation recorded on 2026-05-31 ran `scheduler:agents:run`, `scheduler:trend-finder:run`, `scheduler:dream:run`, `scheduler:run`, and direct `aggregate` successfully. Post-run status commands for `agent-aggregate`, `trend-finder`, `dream`, and `aggregate` returned command-only browser-safe output. `bun run test`, `bun run typecheck`, `bun run typecheck:scripts`, `bun run lint`, `bun run format:check`, `bun run build`, `bun run runtime:check-private`, `bun run budget:check`, `bun run test:e2e`, and `git diff --check` passed in that audit.

The same audit recorded a coverage-only red gate: `bun run test:coverage` executed the full test suite successfully but missed the configured global coverage thresholds with 86.63% statements, 89.4% lines, 92.18% functions, and 77.04% branches. Do not silently lower thresholds; either add coverage or document an intentional threshold reset before treating the coverage gate as green.

Remaining optional scheduler coverage gap: source/provider readiness has focused warning and degraded-path coverage, but not a dedicated live collector matrix for every missing and partial source/provider configuration.

## Coverage Policy

Coverage counts **all first-party source files**, not only files that tests happen to import, so the denominator stays stable regardless of which tests exist.

### Counted

* `src/**/*.{ts,tsx}`
* `scripts/lib/**/*.ts`

### Excluded

* Test files (`__tests__/` directories)
* Generated files (`src/routeTree.gen.ts`)
* Type-only files with no runtime behavior (`src/lib/live-data-types.ts`, `src/lib/graph-types.ts`)
* Static/mock fixture modules (`src/lib/mock-data.ts`)
* Visual-only WebGL components that require browser coverage (`src/components/memory-graph-3d.tsx`, `src/components/memory-graph-mini.tsx`)
* Browser-driven dashboard shells (`src/components/extensions/**`, `src/components/home/**`, `src/components/openclaw/**`, `src/components/setup/**`)
* Trend Finder extension view shells (`src/extensions/trend-finder/views/**`)
* Route layout wrappers better covered by browser tests (`src/routes/extensions.$extensionId.tsx`, `src/routes/agents.openclaw.tsx`)

### Thresholds

Thresholds are enforced in `vitest.config.ts` and ratcheted upward as coverage improves. The current enforced baseline is 85.4% statements, 88.2% lines, 91.3% functions, and 76.0% branches. This reset is intentional because runtime, source, dashboard, provenance, and generated-artifact guardrail surfaces split their highest-value checks across focused unit coverage and Playwright browser coverage.
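
To illustrate how a per-metric gate behaves, the sketch below compares measured percentages against the baseline quoted above. It assumes the gate is a simple "measured must not fall below threshold" check per metric; `failingMetrics` is a hypothetical helper, not part of Vitest or this repository. The sample figures are the ones recorded in the Phase 11 audit.

```typescript
type Metric = "statements" | "lines" | "functions" | "branches";
type CoverageSummary = Record<Metric, number>;

// Current enforced baseline, as documented in this section.
const currentBaseline: CoverageSummary = {
  statements: 85.4,
  lines: 88.2,
  functions: 91.3,
  branches: 76.0,
};

// Figures recorded in the Phase 11 scheduler audit.
const auditFigures: CoverageSummary = {
  statements: 86.63,
  lines: 89.4,
  functions: 92.18,
  branches: 77.04,
};

// Hypothetical helper: returns the metrics that fall below their threshold.
function failingMetrics(
  measured: CoverageSummary,
  required: CoverageSummary,
): Metric[] {
  return (Object.keys(required) as Metric[]).filter(
    (metric) => measured[metric] < required[metric],
  );
}

console.log(failingMetrics(auditFigures, currentBaseline)); // → []
console.log(failingMetrics({ ...auditFigures, branches: 75.9 }, currentBaseline)); // → [ 'branches' ]
```

Because each metric is checked independently, a drop in branch coverage alone is enough to turn the gate red even when the other three metrics improve.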

### Reports

`bun run test:coverage` generates three reports:

* **text** -- printed to stdout
* **json** -- `coverage/coverage-final.json`
* **html** -- `coverage/index.html` (open in browser for line-by-line detail)

## Test Locations

| Directory                          | Scope                            |
| ---------------------------------- | -------------------------------- |
| `src/lib/__tests__/`               | Pure utility and library modules |
| `src/hooks/__tests__/`             | React hook tests                 |
| `src/routes/__tests__/`            | Route component behavior tests   |
| `src/components/__tests__/`        | App-specific component tests     |
| `scripts/lib/__tests__/`           | Build/script utility tests       |
| `scripts/extensions/**/__tests__/` | Extension collector tests        |
| `tests/e2e/`                       | Playwright browser regressions   |

## Writing New Tests

1. Create a file in the appropriate `__tests__/` directory named `<module>.test.ts` (or `.tsx`).
2. Import from the module using the `@/` path alias.
3. Use Arrange-Act-Assert structure for readability.
4. For pure functions: test real behavior with deterministic inputs. No mocks needed.
5. For React components: add `// @vitest-environment happy-dom` at the top of the file and use `renderRoute()` from `src/routes/__tests__/test-utils.tsx`.
6. For modules with side effects (like `error-capture.ts`): use `vi.resetModules()` and dynamic `import()` in `beforeEach` to isolate module-level state between tests.

### Example

```typescript
import { describe, expect, it } from "vitest";
import { myFunction } from "@/lib/my-module";

describe("myFunction", () => {
  it("handles the expected case", () => {
    const result = myFunction("input");
    expect(result).toBe("expected output");
  });
});
```

## Conventions

* Tests must be deterministic -- no network, filesystem, or localStorage dependencies without explicit mocks.
* All test files must use ASCII-only characters and Unix LF line endings.
* Tests must not interfere with `bun run dev` or `bun run build`.
* Coverage thresholds must not be lowered without documenting the reason.
* Browser tests should assert meaningful rendered behavior, especially for WebGL paths excluded from happy-dom unit coverage.

