
# Session Specification

* **Session ID**: `phase40-session16-voice-parity-and-broker-respawn`
* **Phase**: 40 - Claude OS v2.10.1 Semantic Port
* **Status**: Not Started
* **Created**: 2026-07-03
* **Base Commit**: 47eb56c7d668ed54fc6e425246e14467a3058e45

***

## 1. Session Overview

This session verifies the upstream "do not reprompt every session" voice outcome in AI OS without porting upstream browser OpenAI key persistence. The upstream implementation persists a browser key and posts it to its start endpoint; AI OS already owns a safer local broker model where `/__start_voice` reads `OPENAI_API_KEY` and optional `OPENAI_BASE_URL` from ignored local environment only.

The work is intentionally narrow: compare the upstream saved-key behavior to current AI OS broker startup, prove or repair the local launch bridge and hook error paths, and record the documentation notes that Session 17 should apply. No broad Intelligence portal redesign, new voice provider UI, or browser key storage should be added.

This is next because Phase 40 sessions 1-15 are complete and Session 16 is the remaining Voice parity gate before docs closeout and full validation.

***

## 2. Objectives

1. Record a voice parity audit that compares upstream browser saved-key reuse with AI OS environment-backed broker startup.
2. Prove `/__start_voice` accepts no provider config from the browser, reads provider settings from local environment only, and preserves distinct bridge error codes.
3. Prove the Hermes Intelligence voice hook starts the broker before showing setup/error UI and maps broker, token, timeout, missing-key, and provider failures distinctly.
4. Queue precise Session 17 documentation notes for the shipped voice behavior without making docs claims before the implementation evidence exists.

***

## 3. Prerequisites

### Required Sessions

* [x] `phase40-session01-baseline-and-port-invariants` - Recorded `INV-014`, `DEC-004`, `DEC-009`, `MAP-016`, and `CLS-015`/`CLS-022` for voice credential handling and broker ownership.

### Required Tools Or Knowledge

* Phase 40 PRD and Session 16 stub under `.spec_system/PRD/phase_40/`.
* Upstream saved-key reference in `<upstream-checkout>/src/components/intelligence-portal.tsx`.
* Current AI OS voice owners: `scripts/lib/voice-launch-bridge.ts`, `scripts/lib/voice-broker.ts`, `src/hooks/use-hermes-intelligence-voice.ts`, and `src/components/hermes/intelligence/intelligence-portal.tsx`.

### Environment Requirements

* Local checkout with Bun 1.3.14 available and project dependencies installed or installable.
* No live OpenAI Realtime credentials required; mocked broker/provider tests are sufficient unless safe local credentials are already configured.

***

## 4. Scope

### In Scope (MVP)

* AI OS maintainers get an audit record explaining why upstream browser key persistence is an intentional non-port and how no-reprompt parity is achieved through the local broker.
* AI OS operators can start Voice from Hermes Intelligence without an unnecessary setup reprompt when the broker is down but `OPENAI_API_KEY` is locally configured for the dev server.
* AI OS local control-plane behavior rejects browser-supplied provider keys, browser-supplied base URLs, oversized/invalid launch bodies, invalid tokens, hostile Host headers, non-loopback callers, duplicate starts, spawn failure, and health timeout with distinct safe errors.
* AI OS hook and portal tests cover missing key, bad token, broker unavailable, health timeout, provider/session failure, offline, mic denial, and active voice-loop controls where each behavior is owned.
* Session 17 receives concrete notes for docs that should stay aligned with the actual broker behavior and Session 08/Session 09 ownership caveats.

### Out Of Scope (Deferred)

* Browser localStorage, sessionStorage, URL, or request-body persistence for provider keys - Reason: violates `INV-014`, `DEC-004`, and the Phase 40 risk mitigation.
* Accepting provider keys or provider base URLs in `/__start_voice` request bodies - Reason: provider config remains environment-only.
* Broad Hermes Intelligence feature work or visual redesign - Reason: Session 16 is a voice parity and broker-respawn verification gate.
* Updating general product/API docs directly - Reason: Phase 40 Session 17 owns documentation closeout after implementation evidence exists.
* Claiming live Realtime provider proof without safe local credentials - Reason: Phase 38 and current docs require mocked evidence to stay labeled.

***

## 5. Technical Approach

### Architecture

Keep AI OS voice ownership split across the existing bridge, broker, hook, and portal files. The bridge remains a Vite-only local control-plane route that can spawn the broker with environment-provided credentials and the same-run token. The browser hook may call `/__start_voice`, `/api/session`, and Realtime using only the same-run token and short-lived session credential returned by the broker.

Use tests as the primary parity proof. Add focused bridge tests for launch payload rejection, empty launch acceptance, env-only spawn, duplicate-start locking, health timeout cleanup, invalid token, and provider config rejection. Add hook/component tests for broker respawn before setup, error mapping, active controls, and product-facing recovery states. Modify production code only where those tests expose a gap.
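The env-only spawn rule can be illustrated with a minimal sketch. It assumes a hypothetical `planBrokerSpawn` helper and a hypothetical `VOICE_BROKER_TOKEN` variable for handing the same-run token to the child; the real bridge in `scripts/lib/voice-launch-bridge.ts` may structure this differently. The point is only that provider settings come from the server environment and never appear in argv.

```typescript
// Sketch only: helper name, token variable name, and argv shape are assumptions.
type EnvLike = Record<string, string | undefined>;

type SpawnPlan =
  | { ok: true; args: string[]; env: Record<string, string> }
  | { ok: false; code: "missing_key" };

function planBrokerSpawn(serverEnv: EnvLike, sameRunToken: string): SpawnPlan {
  // Provider settings are read from the server environment only.
  const key = serverEnv["OPENAI_API_KEY"]?.trim();
  if (!key) return { ok: false, code: "missing_key" };

  const env: Record<string, string> = {
    OPENAI_API_KEY: key,
    VOICE_BROKER_TOKEN: sameRunToken, // hypothetical variable name
  };

  // The base URL is optional and forwarded only when configured locally.
  const baseUrl = serverEnv["OPENAI_BASE_URL"]?.trim();
  if (baseUrl) env["OPENAI_BASE_URL"] = baseUrl;

  // argv carries the script path only: no key, base URL, or token.
  return { ok: true, args: ["scripts/lib/voice-broker.ts"], env };
}
```

A bridge test could then assert that no element of `args` contains the key or base URL, which matches the "no key/base in argv" case listed in the testing strategy.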

### Design Patterns

* Semantic non-port: Preserve the upstream outcome while rejecting the upstream storage and request-body mechanism.
* Existing owner mapping: Put bridge behavior in `scripts/lib/`, hook behavior in `src/hooks/`, and user-facing recovery state in the Intelligence portal.
* Boundary-first verification: Test loopback, Host, token, body-size, provider config, env-only spawn, timeout, and safe-error gates before UI assertions.
* Evidence-first docs queue: Record Session 17 notes in session artifacts, then let the docs closeout session update public documentation.
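As an illustration of the boundary-first pattern, the sketch below shows one possible shape for the loopback and Host-header gates. The function names and the accepted host set are assumptions for illustration, not the project's actual guard implementation.

```typescript
// Sketch only: names and accepted values are assumptions.

// Socket-level check on the remote address reported by the server.
function isLoopbackAddress(remoteAddress: string | undefined): boolean {
  return (
    remoteAddress === "127.0.0.1" ||
    remoteAddress === "::1" ||
    remoteAddress === "::ffff:127.0.0.1" // IPv4-mapped IPv6 loopback
  );
}

// Header-level check: rejects hostile Host values such as
// "localhost.evil.example" even when the socket itself is loopback.
function isLoopbackHostHeader(host: string | undefined): boolean {
  if (!host) return false;
  const value = host.trim().toLowerCase();
  // Hostname (or bracketed IPv6 literal) with an optional numeric port.
  const match = /^(\[[^\]]+\]|[^:]+)(?::\d{1,5})?$/.exec(value);
  if (!match) return false;
  const hostname = match[1];
  return hostname === "localhost" || hostname === "127.0.0.1" || hostname === "[::1]";
}

// Both gates must pass before token, method, and body checks run.
function passesLoopbackGates(remoteAddress: string | undefined, host: string | undefined): boolean {
  return isLoopbackAddress(remoteAddress) && isLoopbackHostHeader(host);
}
```

Keeping the two checks separate lets tests assert that a loopback socket with a hostile Host header, and a non-loopback socket with a valid Host header, are each rejected on their own.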

***

## 6. Deliverables

### Files To Create

| File                                                                                           | Purpose                                                   | Est. Lines |
| ---------------------------------------------------------------------------------------------- | --------------------------------------------------------- | ---------- |
| `.spec_system/specs/phase40-session16-voice-parity-and-broker-respawn/implementation-notes.md` | Voice parity audit, fix notes, test evidence, docs queue. | \~180      |

### Files To Modify

| File                                                                            | Changes                                                                  | Est. Lines |
| ------------------------------------------------------------------------------- | ------------------------------------------------------------------------ | ---------- |
| `scripts/lib/__tests__/voice-launch-bridge.test.ts`                             | Add or tighten bridge respawn, empty-body, env-only, and error tests.    | \~80       |
| `scripts/lib/voice-launch-bridge.ts`                                            | Minimal fix only if tests reveal launch/body/error/respawn gaps.         | \~30       |
| `scripts/lib/voice-broker.ts`                                                   | Minimal fix only if provider/base policy tests reveal a regression.      | \~20       |
| `src/hooks/__tests__/use-hermes-intelligence-voice.test.tsx`                    | Add or tighten start-bridge, payload, respawn, and error-mapping tests.  | \~80       |
| `src/hooks/use-hermes-intelligence-voice.ts`                                    | Minimal fix only if tests reveal hook start or error mapping gaps.       | \~30       |
| `src/components/hermes/intelligence/__tests__/intelligence-portal.test.tsx`     | Add or tighten recovery and active-control coverage.                     | \~50       |
| `src/components/hermes/intelligence/intelligence-portal.tsx`                    | Minimal fix only if tests reveal product recovery or control-state gaps. | \~20       |
| `.spec_system/specs/phase40-session16-voice-parity-and-broker-respawn/tasks.md` | Mark implementation tasks and final evidence during the session.         | \~30       |

***

## 7. Success Criteria

### Functional Requirements

* [ ] Voice parity audit records upstream saved-key behavior, AI OS environment-backed replacement behavior, and intentional non-port rationale.
* [ ] `/__start_voice` accepts an empty launch without provider config and rejects browser-supplied key/base payloads.
* [ ] Broker-down plus locally configured `OPENAI_API_KEY` can spawn the broker without opening setup solely because the broker was down.
* [ ] Missing key still produces setup/error recovery state.
* [ ] Invalid same-run token is distinguishable from missing key, timeout, spawn failure, and provider/session failure.
* [ ] Browser code does not send provider key or provider base URL to `/__start_voice`.
* [ ] Session 17 documentation notes identify exact shipped behavior and caveats to keep docs honest.

### Testing Requirements

* [ ] Focused bridge and broker tests pass.
* [ ] Focused hook and Intelligence portal tests pass.
* [ ] Local control-plane guard and sanitizer-adjacent tests pass.
* [ ] Typecheck, script typecheck, lint, and diff whitespace checks pass.
* [ ] ASCII/LF and sensitive-string scans pass for touched session and voice files.

### Non-Functional Requirements

* [ ] Provider keys remain environment-only and never enter browser storage, request bodies, docs examples, argv, logs, generated data, or fixtures.
* [ ] Voice recovery copy is product-facing and contains no raw private paths, tokens, account IDs, provider payloads, stack traces, or secret-shaped strings.
* [ ] Local control-plane gates keep loopback, Host-header, token, method, body-size, timeout, and safe-error protections intact.

### Quality Gates

* [ ] All files ASCII-encoded
* [ ] Unix LF line endings
* [ ] Code follows project conventions
* [ ] Primary user-facing surfaces contain product-facing copy only

***

## 8. Implementation Notes

### Working Assumptions

* Mocked broker/provider tests are acceptable parity proof for this session: Phase 38 and current voice docs state live OpenAI Realtime proof requires configured credentials and must not be claimed from mocked runs. Planning can proceed because Session 16 is a broker/hook parity gate, not a live-provider credential exercise.
* AI OS "no unnecessary reprompt" parity means broker respawn from local environment configuration, not browser saved-key reuse: Session 01 recorded `DEC-004` and `INV-014`, and docs already state provider keys must never move through browser localStorage or request bodies.

### Conflict Resolutions

* The Phase 40 PRD says the launch request body "must be empty," while current AI OS code sends an empty JSON object from the hook and rejects only non-empty provider config bodies. The chosen interpretation is that no browser provider config may be supplied; implementation should test a truly empty body at the bridge layer and preserve empty-object compatibility only if no provider fields are present.
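One way to read that interpretation is sketched below. It assumes a hypothetical `validateLaunchBody` helper, hypothetical error codes, and an arbitrary size limit; it is also stricter than strictly required, since it rejects any non-empty object rather than only known provider fields.

```typescript
// Sketch only: helper name, codes, and size limit are assumptions.
type LaunchBodyResult =
  | { ok: true }
  | { ok: false; code: "body_too_large" | "invalid_json" | "non_empty_body" };

const MAX_LAUNCH_BODY_CHARS = 1024; // hypothetical limit, counted in characters

function validateLaunchBody(raw: string): LaunchBodyResult {
  if (raw.length > MAX_LAUNCH_BODY_CHARS) return { ok: false, code: "body_too_large" };

  // A truly empty body satisfies the PRD wording directly.
  const trimmed = raw.trim();
  if (trimmed === "") return { ok: true };

  let parsed: unknown;
  try {
    parsed = JSON.parse(trimmed);
  } catch {
    return { ok: false, code: "invalid_json" };
  }

  // Only a plain JSON object is a valid launch shape.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, code: "invalid_json" };
  }

  // "{}" stays compatible; any field (key, base URL, or otherwise) is rejected.
  if (Object.keys(parsed).length > 0) return { ok: false, code: "non_empty_body" };
  return { ok: true };
}
```

Under this sketch both `""` and `"{}"` are accepted, so the hook's current empty-object request keeps working while the bridge tests can still exercise a truly empty body.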

### Key Considerations

* Preserve "Hermes Intelligence" as the portal/surface name and use "Voice" only for speech-specific controls and recovery copy.
* Treat upstream browser key persistence as an intentional non-port while preserving the upstream outcome where AI OS can do so safely.
* If current behavior already satisfies a requirement, record evidence instead of changing production code.

### Potential Challenges

* Error-code drift: Bridge and hook code use different code enums. Mitigation: add focused mapping tests for each required failure class.
* False live-proof claims: Safe test runs may mock the broker and provider. Mitigation: record mocked evidence as mocked and queue docs language that distinguishes live credential proof.
* Secret-shaped fixtures: Tests need key-like concepts without real key patterns. Mitigation: keep dummy values shorter than real key formats.
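The error-code drift mitigation could take the shape of a single exhaustive mapping table that tests can inspect. The code names on both sides below are placeholders, not the real bridge or hook enums.

```typescript
// Sketch only: every code name here is hypothetical.
type BridgeErrorCode =
  | "missing_key"
  | "invalid_token"
  | "spawn_failed"
  | "health_timeout"
  | "provider_failed";

type HookVoiceError =
  | "setup_required"
  | "token_rejected"
  | "broker_unavailable"
  | "broker_timeout"
  | "provider_error"
  | "unknown_error";

// Record<BridgeErrorCode, ...> makes the compiler flag any bridge code
// that is added without a corresponding hook mapping.
const BRIDGE_TO_HOOK: Record<BridgeErrorCode, HookVoiceError> = {
  missing_key: "setup_required",
  invalid_token: "token_rejected",
  spawn_failed: "broker_unavailable",
  health_timeout: "broker_timeout",
  provider_failed: "provider_error",
};

function mapBridgeError(code: string): HookVoiceError {
  // Own-property check so names like "toString" never resolve via the prototype.
  if (Object.prototype.hasOwnProperty.call(BRIDGE_TO_HOOK, code)) {
    return BRIDGE_TO_HOOK[code as BridgeErrorCode];
  }
  return "unknown_error";
}
```

A focused test can then assert that the mapped values are pairwise distinct, which is the property the success criteria require for missing key, invalid token, timeout, spawn failure, and provider failure.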

### Relevant Considerations

* \[P38] **OpenAI Realtime keys stay environment-only**: Provider credentials must never come from browser bodies, argv, docs, fixtures, or generated data.
* \[P38] **Local control-plane gates are defense in depth**: Preserve socket loopback, Host-header, token/admin, method, body-size, schema, timeout, and safe-error checks.
* \[P38] **Upstream ports are semantic, not wholesale**: Map upstream behavior to AI OS module owners and record not-ported rationale.
* \[P02] **Do not expose secret env values through browser state**: Key names and presence booleans are the safe browser boundary.
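The last consideration, that key names and presence booleans are the safe browser boundary, can be sketched as follows. The `toEnvPresence` helper and its return shape are assumptions made for illustration.

```typescript
// Sketch only: helper name and return shape are assumptions.
type EnvLike = Record<string, string | undefined>;

interface EnvPresence {
  name: string;
  present: boolean;
}

// Reduces server-side environment values to name/presence pairs so the
// value itself never reaches browser state.
function toEnvPresence(env: EnvLike, names: readonly string[]): EnvPresence[] {
  return names.map((name) => ({
    name,
    present: Boolean(env[name]?.trim()),
  }));
}

// Example with a dummy, non-secret value.
const presence = toEnvPresence(
  { OPENAI_API_KEY: "dummy-value" },
  ["OPENAI_API_KEY", "OPENAI_BASE_URL"],
);
```

Here `presence` reports `OPENAI_API_KEY` as present and `OPENAI_BASE_URL` as absent, and serializing it yields no environment values.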

### Behavioral Quality Focus

Checklist active: Yes

Top behavioral risks for this session:

* Broker launch is a state-mutating local control-plane action and must keep duplicate-trigger prevention, safe failures, and cleanup on timeout.
* The hook crosses bridge, broker, provider, microphone, and WebRTC boundaries and must clean up acquired resources on every failure path.
* The Intelligence portal is user-facing and must show product-facing recovery states without leaking diagnostics or weakening accessibility.
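The first risk, duplicate-trigger prevention with cleanup on timeout, can be modeled as a small state machine. This is a sketch under assumptions: the phase, event, and effect names are hypothetical, and treating a start request against an already running broker as a no-op is one possible policy, not confirmed bridge behavior.

```typescript
// Sketch only: phase, event, and effect names are hypothetical.
type LaunchPhase = "idle" | "starting" | "running";
type LaunchEvent = "start_requested" | "healthy" | "health_timeout" | "child_exited";
type LaunchEffect = "spawn" | "kill_child" | "reject_duplicate" | "none";

interface Transition {
  next: LaunchPhase;
  effect: LaunchEffect;
}

function transition(phase: LaunchPhase, event: LaunchEvent): Transition {
  switch (event) {
    case "start_requested":
      if (phase === "idle") return { next: "starting", effect: "spawn" };
      // A launch already in flight holds the lock; a second start is rejected.
      if (phase === "starting") return { next: phase, effect: "reject_duplicate" };
      // Assumed policy: a running broker needs no new spawn.
      return { next: phase, effect: "none" };
    case "healthy":
      return { next: phase === "starting" ? "running" : phase, effect: "none" };
    case "health_timeout":
      // Cleanup on timeout: kill the child and release the lock.
      return phase === "starting"
        ? { next: "idle", effect: "kill_child" }
        : { next: phase, effect: "none" };
    case "child_exited":
      return { next: "idle", effect: "none" };
  }
}

// Replays a sequence of events and collects the effects they produce.
function replay(events: readonly LaunchEvent[]): { phase: LaunchPhase; effects: LaunchEffect[] } {
  let phase: LaunchPhase = "idle";
  const effects: LaunchEffect[] = [];
  for (const event of events) {
    const step = transition(phase, event);
    phase = step.next;
    effects.push(step.effect);
  }
  return { phase, effects };
}
```

Because the transitions are pure, tests can replay sequences such as start, duplicate start, timeout, start again and check that exactly one child is spawned per launch attempt and that a timed-out child is always killed before the next spawn.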

***

## 9. Testing Strategy

### Unit Tests

* `scripts/lib/__tests__/voice-launch-bridge.test.ts` for empty launch, provider config rejection, env-only spawn, duplicate start, spawn failure, health timeout, invalid token, and no key/base in argv.
* `scripts/lib/__tests__/voice-broker.test.ts` for provider/base policy and session error distinctions if broker behavior changes.
* `src/hooks/__tests__/use-hermes-intelligence-voice.test.tsx` for bridge start ordering, request payload shape, error mapping, cleanup, and no duplicate starts.

### Integration Tests

* `src/components/hermes/intelligence/__tests__/intelligence-portal.test.tsx` for recovery states, active voice controls, typed prompt path, and product copy boundaries.
* Local control-plane guard and sanitizer-adjacent tests to preserve route gates and safe errors.

### Runtime Verification

* Run the focused Vitest slots for bridge/broker/hook/component behavior.
* If safe local credentials are already configured, optionally record a local `/agents/hermes?intel=1` smoke as live-provider evidence; otherwise record mocked/no-credential evidence only.

### Edge Cases

* Empty request body versus empty JSON object launch compatibility.
* Non-empty browser provider config body.
* Missing `OPENAI_API_KEY`.
* Rejected same-run token.
* Health timeout after child spawn.
* Provider/session failure after broker startup.
* Offline browser and microphone denial.

***

## 10. Dependencies

### Other Sessions

* Depends on: `phase40-session01-baseline-and-port-invariants`
* Follows in phase order: `phase40-session15-ministry-config-analytics-and-save-ux`
* Depended on by: `phase40-session17-docs-metadata-and-gitignore-closeout`, `phase40-session18-full-validation-and-handoff`

***

## Next Steps

Run the `implement` workflow step to begin implementation.

