
# Session Specification

* **Session ID**: `phase41-session05-voice-token-bootstrap`
* **Phase**: 41 - Hermes All-Access Remediation
* **Status**: Not Started
* **Created**: 2026-07-03
* **Base Commit**: 259c2457723e5b2b6063eb568fc8c1ca4ba49d83

***

## 1. Session Overview

This session makes the Hermes Intelligence voice broker token-ready in normal local development without requiring the operator to set `AI_OS_VOICE_TOKEN` manually. It covers both Start Voice through `/__start_voice` and direct `bun run voice` startup from a terminal.

This session comes next because Phase 41 sessions 01 through 03 already established the local all-access startup and Hermes live-local hook contract, and Session 06 depends on voice being token-ready before it changes Intelligence action readiness and copy. The session keeps the existing local control-plane defenses: loopback checks, same-run token checks, env-only provider keys, bounded bodies, allowed provider bases, safe errors, and no key material in argv or browser payloads.

The result is a voice path that either starts with an automatic local token or reports a named recoverable dependency failure for missing provider keys, invalid bases, rejected request tokens, unavailable brokers, provider failures, microphone/browser credential issues, or health timeouts.

***

## 2. Objectives

1. Make direct `bun run voice` resolve a session token automatically without manual `AI_OS_VOICE_TOKEN` setup.
2. Keep `/__start_voice` loopback and same-run-token protected while passing the Vite same-run token to spawned broker processes.
3. Preserve provider-key, browser credential, broker, loopback, token, timeout, and provider failure states as explicit recoverable failures.
4. Add focused tests proving direct token bootstrap, launch bridge protection, broker health/session behavior, and voice hook failure mapping.

***

## 3. Prerequisites

### Required Sessions

* [x] `phase41-session01-local-access-startup-contract` - Provides the normal local all-access startup default and legacy alias posture.
* [x] `phase41-session02-hermes-bridge-status` - Provides automatic Hermes bridge readiness and token status handling.
* [x] `phase41-session03-hermes-route-modes-and-hooks` - Provides the `live-local` route mode and hook write-readiness contract used by Intelligence and voice paths.
* [x] `phase40-session16-voice-parity-and-broker-respawn` - Provides the current voice launch bridge, voice broker, and voice hook baseline.

### Required Tools Or Knowledge

* Bun 1.3.14 and Vitest for focused script and hook tests.
* Existing local control-plane guard patterns in `scripts/lib/*-bridge.ts`.
* Existing same-run token file behavior from `vite.config.ts`.

### Environment Requirements

* Focused automated tests must not require real provider credentials.
* Manual live smoke requires an ignored local `OPENAI_API_KEY`; absence of that key must remain a named recoverable failure.
* All privileged voice routes must remain loopback/local Host limited.

***

## 4. Scope

### In Scope (MVP)

* Local operator can start direct `bun run voice` without manually setting `AI_OS_VOICE_TOKEN` - implement token bootstrap from environment, existing same-run local token, or generated process-local token.
* Local operator can use Start Voice through `/__start_voice` without legacy Hermes admin env setup - preserve same-run token and loopback checks while passing token readiness to the broker child.
* Maintainer can see safe voice health and recovery metadata - do not expose provider keys, bearer tokens, local private paths, raw provider bodies, or token values.
* Maintainer can run focused tests for voice launch, broker, direct command bootstrap, and hook token failure mapping - cover success and named failure paths.

### Out Of Scope (Deferred)

* Intelligence portal UI action copy and removing normal local `admin-disabled` copy - Reason: owned by Session 06.
* Live OpenAI Realtime proof with real credentials - Reason: environment-limited proof requiring ignored local provider credentials.
* Active user documentation sweep for voice setup - Reason: owned by Session 15, except for immediate `voice-lab/.env.example` comments needed to avoid preserving manual token setup in the direct command path.

***

## 5. Technical Approach

### Architecture

Add a small script-owned token bootstrap helper under `scripts/lib/` that can resolve the voice broker session token from explicit environment, the existing AI OS dev token file, or a generated process-local token. `voice-lab/server.ts` uses this helper before creating broker options, so `createVoiceBrokerHealth` sees token readiness even when `AI_OS_VOICE_TOKEN` is absent.
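The resolution order described above could be sketched roughly as follows. This is a minimal illustration, not the shipped helper: the names `resolveVoiceToken`, `toSafeTokenMetadata`, `devTokenFilePath`, and the source labels are hypothetical, and it assumes the dev token file contains only the token as plain text.

```typescript
import { randomBytes } from "node:crypto";
import { readFileSync } from "node:fs";

// Hypothetical names throughout; the spec does not fix the helper's API.
export type VoiceTokenSource = "env" | "dev-token-file" | "generated";

export interface VoiceTokenResolution {
  token: string; // stays in process memory; never log or serialize this
  source: VoiceTokenSource;
}

export interface VoiceTokenBootstrapOptions {
  env: Record<string, string | undefined>;
  devTokenFilePath?: string; // assumed location of the existing dev token file
  readFile?: (path: string) => string; // injectable so tests avoid real files
  generate?: () => string; // injectable so tests get a predictable token
}

export function resolveVoiceToken(options: VoiceTokenBootstrapOptions): VoiceTokenResolution {
  // 1. Explicit environment wins; empty or whitespace-only values count as unset.
  const fromEnv = options.env.AI_OS_VOICE_TOKEN?.trim();
  if (fromEnv) return { token: fromEnv, source: "env" };

  // 2. Fall back to the dev token file when it exists and is readable.
  if (options.devTokenFilePath) {
    const read = options.readFile ?? ((path: string) => readFileSync(path, "utf8"));
    try {
      const fromFile = read(options.devTokenFilePath).trim();
      if (fromFile) return { token: fromFile, source: "dev-token-file" };
    } catch {
      // Missing or unreadable file: fall through to a generated token.
    }
  }

  // 3. Generate a process-local token so the broker is still configured.
  const generate = options.generate ?? (() => randomBytes(32).toString("hex"));
  return { token: generate(), source: "generated" };
}

// Browser-safe and log-safe projection: readiness and source only, never the value.
export function toSafeTokenMetadata(resolution: VoiceTokenResolution): {
  tokenReady: boolean;
  tokenSource: VoiceTokenSource;
} {
  return { tokenReady: resolution.token.length > 0, tokenSource: resolution.source };
}
```

Injecting `readFile` and `generate` keeps the helper testable without touching the file system or importing the server entry.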

Keep `scripts/lib/voice-broker.ts` responsible for request validation, health projection, provider base allowlisting, and provider session creation. Extend only browser-safe health metadata if needed; never include token values.
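As an illustration of request token validation that stays strict even when the server token was bootstrapped automatically, the sketch below returns the `missing_token` and `invalid_token` codes named in this spec. The function name `checkRequestToken` is hypothetical, and the constant-time comparison is an assumption about a reasonable approach, not a description of the existing broker code.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical result shape; only the two failure codes come from the spec.
export type TokenCheck =
  | { ok: true }
  | { ok: false; code: "missing_token" | "invalid_token" };

export function checkRequestToken(
  presented: string | null | undefined,
  expected: string,
): TokenCheck {
  const candidate = presented?.trim();
  if (!candidate) return { ok: false, code: "missing_token" };

  // Hash both sides to equal-length buffers so timingSafeEqual never throws
  // on a length mismatch and the comparison does not depend on token length.
  const a = createHash("sha256").update(candidate).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b) ? { ok: true } : { ok: false, code: "invalid_token" };
}
```

The result carries only a code, so a caller can map it to a safe error without echoing either token.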

Keep `scripts/lib/voice-launch-bridge.ts` as the Start Voice bridge owner. It continues to require loopback and the Vite same-run token on `/__start_voice`, rejects browser-supplied provider config, and spawns `bun run voice` with env-only provider config plus the Vite same-run token as `AI_OS_VOICE_TOKEN`.
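One way to picture the spawn contract is a pure function that produces a fixed argv and an env map, so nothing secret can reach the command line. This is a sketch under stated assumptions: `buildVoiceSpawnPlan` and `VoiceSpawnPlan` are hypothetical names, and the real bridge may filter the inherited environment rather than copy it whole.

```typescript
// Hypothetical shape for what the launch bridge hands to its process spawner.
export interface VoiceSpawnPlan {
  argv: string[];
  env: Record<string, string>;
}

export function buildVoiceSpawnPlan(
  parentEnv: Record<string, string | undefined>,
  viteToken: string,
): VoiceSpawnPlan {
  // Copy defined parent env values so provider config (for example
  // OPENAI_API_KEY) reaches the child through the environment only.
  const env: Record<string, string> = {};
  for (const [key, value] of Object.entries(parentEnv)) {
    if (value !== undefined) env[key] = value;
  }

  // The token set by the bridge replaces any inherited AI_OS_VOICE_TOKEN.
  env.AI_OS_VOICE_TOKEN = viteToken;

  // argv is a fixed array with no shell string and no secret material.
  return { argv: ["bun", "run", "voice"], env };
}
```

Because the plan is plain data, a test can assert that neither the token nor the provider key appears anywhere in `argv` before any process is started.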

### Design Patterns

* Parser-owned boundary: keep health/session payloads typed and safe before UI hooks consume them.
* Env-only secret flow: provider key and token values stay in process env or local token files, never request bodies, argv, browser storage, committed fixtures, or docs.
* Named recovery states: return stable codes such as `missing_key`, `missing_token`, `invalid_token`, `base_not_allowed`, `spawn_failed`, and `provider_failed` instead of generic disabled states.
* Thin server entry: keep `voice-lab/server.ts` small by delegating token resolution and broker behavior to tested helpers.
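The named recovery states pattern above can be sketched as a closed union of codes plus a guard that rejects anything outside it. The code list is taken from this spec; the `VoiceFailure` shape, the helper names, and the message strings are hypothetical placeholders, not the project's actual copy.

```typescript
// Codes listed in this spec; everything else in this block is illustrative.
const VOICE_FAILURE_CODES = [
  "missing_key",
  "missing_token",
  "invalid_token",
  "base_not_allowed",
  "spawn_failed",
  "provider_failed",
] as const;

export type VoiceFailureCode = (typeof VOICE_FAILURE_CODES)[number];

export interface VoiceFailure {
  ok: false;
  code: VoiceFailureCode;
  recoverable: true;
  message: string; // static text only; never interpolate keys, tokens, or paths
}

// Placeholder messages (assumed wording).
const SAFE_MESSAGES: Record<VoiceFailureCode, string> = {
  missing_key: "Provider key is not configured.",
  missing_token: "Request token is missing.",
  invalid_token: "Request token was rejected.",
  base_not_allowed: "Provider base URL is not allowed.",
  spawn_failed: "Voice broker could not be started.",
  provider_failed: "Provider request failed.",
};

export function voiceFailure(code: VoiceFailureCode): VoiceFailure {
  return { ok: false, code, recoverable: true, message: SAFE_MESSAGES[code] };
}

// Parser-side guard: unknown codes are rejected instead of passed to the UI.
export function isVoiceFailureCode(value: unknown): value is VoiceFailureCode {
  return typeof value === "string" && (VOICE_FAILURE_CODES as readonly string[]).includes(value);
}
```

Keeping the codes in one `as const` array lets the type, the guard, and the message table stay in sync when a code is added.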

***

## 6. Deliverables

### Files To Create

| File                                                  | Purpose                                                                                                             | Est. Lines |
| ----------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------- | ---------- |
| `scripts/lib/voice-token-bootstrap.ts`                | Resolve direct broker token source from env, dev token file, or generated process token without exposing the value. | \~120      |
| `scripts/lib/__tests__/voice-token-bootstrap.test.ts` | Cover env token, dev token, generated token, missing/unreadable token file, and safe metadata behavior.             | \~150      |

### Files To Modify

| File                                                         | Changes                                                                                                                | Est. Lines |
| ------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------- | ---------- |
| `voice-lab/server.ts`                                        | Use token bootstrap before constructing broker options and log only safe token source/readiness metadata.              | \~50       |
| `scripts/lib/voice-broker.ts`                                | Preserve token enforcement while projecting automatic token readiness and recoverable token/config failures safely.    | \~80       |
| `scripts/lib/voice-launch-bridge.ts`                         | Keep Start Voice protected and ensure spawned broker env carries the Vite same-run token without key material in argv. | \~50       |
| `scripts/lib/__tests__/voice-broker.test.ts`                 | Add direct-token readiness and failure-path coverage.                                                                  | \~90       |
| `scripts/lib/__tests__/voice-launch-bridge.test.ts`          | Add launch/spawn/existing-health/token failure coverage for token-ready broker behavior.                               | \~90       |
| `src/hooks/__tests__/use-hermes-intelligence-voice.test.tsx` | Keep voice hook failure mapping aligned with broker/launch token failures without changing Session 06 UI scope.        | \~40       |
| `voice-lab/.env.example`                                     | Remove the instruction that direct `bun run voice` requires a manual `AI_OS_VOICE_TOKEN`.                              | \~20       |

***

## 7. Success Criteria

### Functional Requirements

* [ ] `bun run voice` can start with an automatic broker session token when `AI_OS_VOICE_TOKEN` is unset.
* [ ] `/__start_voice` remains protected by loopback and same-run token checks.
* [ ] Spawned broker processes receive provider config from environment and token readiness from the Vite same-run token, with no secrets in argv.
* [ ] Missing provider key, invalid provider base, invalid/missing request token, unavailable broker, health timeout, and provider failure paths return named recoverable failures.

### Testing Requirements

* [ ] Token bootstrap unit tests written and passing.
* [ ] Voice broker and voice launch bridge tests updated and passing.
* [ ] Voice hook token/failure mapping tests updated and passing.
* [ ] `bun run typecheck:scripts` passes or reports only documented unrelated failures.

### Non-Functional Requirements

* [ ] Token values, provider keys, bearer headers, local private paths, raw provider bodies, and browser credential details are not logged, committed, returned in browser-visible health, or placed in argv.
* [ ] Direct startup and bridge startup use deterministic named recovery states that are stable for UI mapping and tests.
* [ ] No new dependency is added for token bootstrap.

### Quality Gates

* [ ] All files ASCII-encoded.
* [ ] Unix LF line endings.
* [ ] Code follows project conventions.
* [ ] Primary user-facing surfaces are not changed except where existing tests need token failure mapping; Intelligence action copy remains Session 06 scope.

***

## 8. Implementation Notes

### Working Assumptions

* Direct `bun run voice` should bootstrap a token inside the broker process, not weaken `/api/session` token enforcement: The Phase 41 PRD requires direct command token readiness while preserving token guards as automatic shipped checks, and existing broker tests assert missing/invalid request token failures.
* The existing Vite same-run token file is a valid fallback source for direct broker startup when it exists: `vite.config.ts` writes a short-lived token for local browser use, and the PRD says browser and direct local commands should receive required tokens automatically.
* A generated process-local token is acceptable when no env token or Vite token exists: It makes the broker configured rather than `missing_token_config` while still requiring clients to present the matching token for session creation.

### Key Considerations

* `OPENAI_API_KEY` absence remains a recoverable dependency failure, not a reason to disable the voice path.
* `OPENAI_BASE_URL` remains restricted to OpenAI or loopback-compatible targets.
* Browser bodies must not carry provider keys, bases, or token configuration.
* Direct command output must not print token values.
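The provider base restriction above might look like the following check. It is a sketch only: `isAllowedProviderBase` is a hypothetical name, and the exact allowlist (HTTPS to `api.openai.com`, or HTTP/HTTPS to a loopback host) is an assumption about what "OpenAI or loopback-compatible" means in the existing broker.

```typescript
// Assumed allowlist: https://api.openai.com, or http(s) on a loopback host.
export function isAllowedProviderBase(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }

  // Reject embedded credentials so secrets never travel inside the base URL.
  if (url.username || url.password) return false;

  const host = url.hostname.toLowerCase();
  if (host === "api.openai.com") return url.protocol === "https:";

  // WHATWG URL keeps the brackets around IPv6 hostnames.
  const isLoopback = host === "localhost" || host === "127.0.0.1" || host === "[::1]";
  return isLoopback && (url.protocol === "http:" || url.protocol === "https:");
}
```

Matching on the parsed hostname rather than a string prefix means a lookalike such as `https://api.openai.com.evil.example` is rejected.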

### Potential Challenges

* Generated direct tokens can make health ready while a separate browser does not know that token: keep this limited to direct broker startup and continue using the Vite same-run token for Start Voice.
* Updating health metadata can break parser expectations: keep payload changes additive and covered by strict tests.
* Refactoring `voice-lab/server.ts` can accidentally start a server during tests: keep testable token logic in `scripts/lib/voice-token-bootstrap.ts` and test the server through helper behavior rather than importing a live top-level `Bun.serve` module.

### Relevant Considerations

* \[P41] **Local access default migration**: Voice must not require manual admin or manual token setup as normal local posture.
* \[P41] **Do not accept scaffolding as delivery**: Direct and browser-launched voice paths must execute or show named recoverable failures.
* \[P38/P40] **OpenAI Realtime keys stay environment-only**: Provider keys must never come from browser bodies, argv, docs, fixtures, generated data, or browser storage.
* \[P38/P40] **Local control-plane gates are defense in depth**: Preserve loopback, Host-header, same-run token, schema, timeout, safe-error, redaction, and no-shell argv checks.
* \[P40/P41] **Write safeguards are reusable**: Keep command/process launch bounded by structured env handling and sanitized errors.

### Behavioral Quality Focus

Checklist active: Yes

Top behavioral risks for this session:

* Token bootstrap could accidentally disclose a session token in logs, health payloads, fixtures, or argv.
* Direct broker startup could be marked ready without preserving request token enforcement on `/api/session`.
* Start Voice failure handling could collapse distinct provider, token, browser credential, broker, loopback, and timeout failures into an unclear generic error.

***

## 9. Testing Strategy

### Unit Tests

* `scripts/lib/__tests__/voice-token-bootstrap.test.ts`: env token, dev-token file token, generated token, missing/unreadable token file, source metadata, and no token value in serialized safe metadata.
* `scripts/lib/__tests__/voice-broker.test.ts`: health readiness with bootstrapped token, missing key recovery, invalid base recovery, missing/invalid request token enforcement, and provider failure mapping.

### Integration Tests

* `scripts/lib/__tests__/voice-launch-bridge.test.ts`: `/__start_voice` loopback/token/body validation, spawn env, existing broker health, duplicate starts, spawn failure, and health timeout cleanup.
* `src/hooks/__tests__/use-hermes-intelligence-voice.test.tsx`: hook maps launch and broker token failures distinctly and does not request session credentials after launch failure.

### Runtime Verification

* Run focused tests: `bun run test scripts/lib/__tests__/voice-token-bootstrap.test.ts scripts/lib/__tests__/voice-broker.test.ts scripts/lib/__tests__/voice-launch-bridge.test.ts src/hooks/__tests__/use-hermes-intelligence-voice.test.tsx`.
* Run `bun run typecheck:scripts`.
* Optionally run `OPENAI_API_KEY= bun run voice` locally long enough to confirm the health log reports token readiness and missing provider key recovery without printing token values.

### Edge Cases

* Empty or whitespace `AI_OS_VOICE_TOKEN`.
* Missing or unreadable Vite token file.
* Invalid request token even when server token is auto-generated.
* Existing broker already running before launch bridge spawn.
* Provider key missing, provider base disallowed, provider auth rejected, malformed provider response, provider timeout, and browser microphone denial.
* Duplicate Start Voice clicks while health polling is in flight.
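For the duplicate Start Voice case, one common approach is to share a single in-flight launch promise between callers. The sketch below shows that technique in isolation; `singleFlight` is a hypothetical helper, and the existing launch bridge may already handle duplicates differently.

```typescript
// Wraps an async start function so overlapping calls share one in-flight run.
export function singleFlight<T>(start: () => Promise<T>): () => Promise<T> {
  let inFlight: Promise<T> | null = null;

  return () => {
    // A second click while the first launch is still polling reuses its promise.
    if (inFlight) return inFlight;

    inFlight = start().finally(() => {
      // Clear on both success and failure so a later click can retry.
      inFlight = null;
    });
    return inFlight;
  };
}
```

Usage would be along the lines of `const startVoice = singleFlight(() => launchAndPollHealth())`, where `launchAndPollHealth` stands in for whatever function spawns the broker and waits for health.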

***

## 10. Dependencies

### Other Sessions

* Depends on: `phase41-session01-local-access-startup-contract`, `phase41-session02-hermes-bridge-status`, `phase41-session03-hermes-route-modes-and-hooks`.
* Required by: `phase41-session06-intelligence-action-access`, `phase41-session08-hermes-mutation-controls`, `phase41-session14-end-to-end-test-matrix`, `phase41-session15-active-docs-and-runbooks`, `phase41-session17-generated-data-closeout`.

***

## Next Steps

Run the `implement` workflow step to begin implementation.

