
# Session Specification

* **Session ID**: `phase38-session08-voice-broker`
* **Phase**: 38 - Claude OS v2.8.1 Semantic Port
* **Status**: Not Started
* **Created**: 2026-06-30

***

## 1. Session Overview

This session adds the local OpenAI Realtime voice token broker that must exist before the Hermes Intelligence portal can ship. It is scheduled next because deterministic project analysis shows Phase 38 Sessions 01 through 07 complete, no active session, and Session 08 as the first unfinished candidate.

The work adapts upstream `voice-lab/server.ts` into an AI OS broker-only implementation. The broker must mint short-lived Realtime session credentials from server-side environment configuration, expose safe health metadata, honor the approved `OPENAI_BASE_URL` policy, and reject bad origin, bad token, missing key, non-local Host, and non-loopback launch requests with controlled errors.

The session deliberately does not ship the Intelligence portal, standalone voice demo, sample TTS endpoint, browser-stored provider keys, or a second Hermes chat backend. It prepares the voice transport and launch target that Session 09 will use from the existing Hermes chat surface.

***

## 2. Objectives

1. Add a loopback-only `voice-lab/server.ts` broker that mints Realtime session tokens with provider keys read from environment only.
2. Add guarded `/__start_voice` Vite middleware that starts the broker with env-only provider configuration and no key material in argv or browser state.
3. Add runnable package and launch targets only after the broker command exists and is covered by focused security tests.
4. Update current-state voice and Intelligence docs so they describe the shipped broker boundary without claiming Session 09 portal behavior.

***

## 3. Prerequisites

### Required Sessions

* [x] `phase38-session05-runtime-bridge-hardening` - Provides loopback, Host-header, token, body-size, and bridge hardening patterns for local control-plane routes.
* [x] `phase38-session06-policy-docs-and-catalogs` - Records env-only voice provider policy, `OPENAI_BASE_URL` allowlist requirements, and the prior decision not to add a broken voice launch target.
* [x] `phase38-session07-dream-engine-product-integration` - Completes the preceding Phase 38 product integration dependency so voice work is now the first executable candidate.

### Required Tools Or Knowledge

* Bun 1.3.14, Vitest, TypeScript script typechecking, Vite middleware conventions, `rg`, and the existing local control-plane guard helpers.
* Upstream voice files under `/home/aiwithapex/projects/claudeos/claude-os-v2.8.1/voice-lab/`.
* Current AI OS bridge patterns in `vite.config.ts`, `scripts/lib/local-control-plane-guard.ts`, and `scripts/lib/hermes-admin-bridge.ts`.

### Environment Requirements

* Repository root is `/home/aiwithapex/projects/aios`.
* `OPENAI_API_KEY` and optional `OPENAI_BASE_URL` must be read from ignored local environment only.
* Provider keys, bearer headers, raw provider responses, prompts, transcripts, token-shaped strings, local usernames, and home-directory paths must stay out of browser state, committed fixtures, docs, logs, and tests.

***

## 4. Scope

### In Scope (MVP)

* Local operators can start the broker with `bun run voice` - create `voice-lab/server.ts`, `voice-lab/.env.example`, the `voice` package script, and a runnable `.claude/launch.json` target.
* Browser code can request a Realtime session credential from the local broker when configured - implement `/api/health` and `/api/session` with token-gated session minting and safe health fields.
* Local product code can start the broker from Vite - add `/__start_voice` middleware that is loopback, Host-header, and same-run-token gated, starts one broker process, polls health, and passes provider configuration by environment only.
* OpenAI Realtime is the default voice path - use `https://api.openai.com` when `OPENAI_BASE_URL` is absent and allow compatible loopback/proxy endpoints only when they pass the Phase 38 allowlist.
* Session 09 can build on a truthful contract - update voice and Intelligence docs to say the broker exists while the portal and visualizers remain Session 09-owned.

### Out Of Scope (Deferred)

* Copying upstream `voice-lab/index.html` - Reason: AI OS does not ship a disconnected voice demo, and Session 09 owns the real portal surface.
* Shipping upstream `/api/sample` TTS behavior - Reason: Session 08 is scoped to token broker behavior only.
* Accepting provider keys from request bodies or browser storage - Reason: Phase 38 policy requires environment-only provider configuration.
* Wiring voice `ask_hermes` tool calls to `/__hermes_chat` - Reason: Session 09 owns the portal tool-call bridge and real chat loop.
* Adding remote non-OpenAI provider bases - Reason: `OPENAI_BASE_URL` must not redirect provider keys to arbitrary hosts.

***

## 5. Technical Approach

### Architecture

Create a small broker helper under `scripts/lib/voice-broker.ts` for request parsing, origin and Host validation, base URL allowlisting, provider request construction, safe response shaping, and error redaction. `voice-lab/server.ts` should be a thin Bun server that binds `127.0.0.1`, uses that helper, serves `/api/health`, and mints `/api/session` credentials without accepting long-lived keys from the browser.
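
The Host and base URL checks described above could be sketched as pure functions like the ones below. This is a minimal sketch under stated assumptions: the function names `isLoopbackHost` and `resolveProviderBase` and the `invalid_base_url` code are hypothetical, and the allowlist shown here (HTTPS `api.openai.com` plus loopback hosts) only approximates the Phase 38 allowlist, whose exact rules are not reproduced in this spec.

```typescript
// Hypothetical helpers; the real Phase 38 allowlist may differ.
const DEFAULT_PROVIDER_BASE = "https://api.openai.com";
const LOOPBACK_HOSTNAMES = new Set(["127.0.0.1", "localhost", "[::1]"]);

/** True when a Host header value names a loopback host (port ignored). */
function isLoopbackHost(hostHeader: string | undefined): boolean {
  if (!hostHeader) return false;
  try {
    // Parsing through URL strips the port and keeps IPv6 brackets.
    return LOOPBACK_HOSTNAMES.has(new URL(`http://${hostHeader}`).hostname);
  } catch {
    return false;
  }
}

type ProviderBaseResult =
  | { ok: true; base: string }
  | { ok: false; code: "invalid_base_url" };

/** Resolve OPENAI_BASE_URL to an allowed base, or a stable error code. */
function resolveProviderBase(raw: string | undefined): ProviderBaseResult {
  if (!raw || raw.trim() === "") return { ok: true, base: DEFAULT_PROVIDER_BASE };
  let url: URL;
  try {
    url = new URL(raw.trim());
  } catch {
    return { ok: false, code: "invalid_base_url" };
  }
  // Embedded credentials are never acceptable in a provider base.
  if (url.username || url.password) return { ok: false, code: "invalid_base_url" };
  const isOpenAI = url.protocol === "https:" && url.hostname === "api.openai.com";
  const isLoopback =
    (url.protocol === "http:" || url.protocol === "https:") &&
    LOOPBACK_HOSTNAMES.has(url.hostname);
  if (!isOpenAI && !isLoopback) return { ok: false, code: "invalid_base_url" };
  // Drop query, fragment, and trailing slashes from the accepted base.
  return { ok: true, base: url.origin + url.pathname.replace(/\/+$/, "") };
}
```

Matching on the parsed hostname rather than on a string prefix means a lookalike such as `api.openai.com.evil.example` is rejected instead of being treated as an OpenAI endpoint.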

Create a Vite bridge helper under `scripts/lib/voice-launch-bridge.ts` that follows existing `register*Bridge` patterns. `vite.config.ts` should register `/__start_voice` with the shared loopback and Host-header guard, same-run token, safe JSON responses, one-process lifecycle handling, and health polling against `http://127.0.0.1:8099/api/health`. The helper should pass `OPENAI_API_KEY`, optional `OPENAI_BASE_URL`, `PORT`, and the same-run voice token by environment only, never argv or browser state.
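
To illustrate the env-only rule for the launch bridge, the sketch below builds a spawn plan whose argv carries no secret material. The names `planBrokerSpawn`, `BrokerSpawnPlan`, and the `VOICE_BROKER_TOKEN` variable are hypothetical, and the `bun run voice-lab/server.ts` command line is an assumption; the spec only requires that key, base URL, port, and token travel through the child environment.

```typescript
// Hypothetical shapes; the actual bridge helper may be structured differently.
interface BrokerSpawnPlan {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface ProviderEnvInput {
  OPENAI_API_KEY?: string;
  OPENAI_BASE_URL?: string;
}

type SpawnPlanResult =
  | { ok: true; plan: BrokerSpawnPlan }
  | { ok: false; code: "missing_key" };

function planBrokerSpawn(
  providerEnv: ProviderEnvInput,
  sameRunToken: string,
  port = 8099,
): SpawnPlanResult {
  const key = providerEnv.OPENAI_API_KEY?.trim();
  // Refuse to plan a launch when no provider key is configured.
  if (!key) return { ok: false, code: "missing_key" };

  const env: Record<string, string> = {
    OPENAI_API_KEY: key,
    PORT: String(port),
    VOICE_BROKER_TOKEN: sameRunToken, // assumed variable name
  };
  if (providerEnv.OPENAI_BASE_URL) {
    env.OPENAI_BASE_URL = providerEnv.OPENAI_BASE_URL;
  }

  // argv names only the script; key, token, and base URL stay in env.
  return {
    ok: true,
    plan: { command: "bun", args: ["run", "voice-lab/server.ts"], env },
  };
}
```

Keeping the plan as plain data lets a test assert on `args` and `env` without starting a process; a caller could then hand `plan.command`, `plan.args`, and `plan.env` to a process spawner such as Node's `child_process.spawn`.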

Keep docs current-state oriented. `docs/local-voice-setup.md` should move from "not shipped" to "broker shipped, portal still pending" only after the command and tests exist. `docs/intelligence-view.md` should continue to state that the portal, visualizers, and live voice loop are Session 09-owned.

### Design Patterns

* Local control-plane registration: reuse existing Vite bridge helper shape instead of placing all launch logic inline.
* Pure endpoint helpers: test request, rejection, provider, and response behavior without needing real provider credentials.
* Environment-only secrets: provider keys and base URL come from local env and are passed to child processes through env only.
* Browser-safe projection: health responses expose readiness, keyed boolean, sanitized base URL, and recovery codes only.
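
The browser-safe projection pattern above might look like the following sketch. Only `ok` is named in this spec; the `keyed`, `base`, and `recovery` field names, the `missing_key` code, and the choice to reduce the base URL to its origin are assumptions made for illustration.

```typescript
// Hypothetical health shape: presence and readiness only, never secret values.
interface BrokerHealthView {
  ok: boolean;
  keyed: boolean;
  base: string;
  recovery: string | null;
}

/** Reduce a base URL to its origin so userinfo, path, and query never leak. */
function sanitizeBaseForHealth(raw: string | undefined): string {
  if (!raw) return "https://api.openai.com";
  try {
    return new URL(raw).origin;
  } catch {
    return "invalid";
  }
}

function projectHealth(env: {
  OPENAI_API_KEY?: string;
  OPENAI_BASE_URL?: string;
}): BrokerHealthView {
  const keyed = Boolean(env.OPENAI_API_KEY && env.OPENAI_API_KEY.trim());
  return {
    ok: true, // the broker process is up; readiness to mint is carried by `keyed`
    keyed,
    base: sanitizeBaseForHealth(env.OPENAI_BASE_URL),
    recovery: keyed ? null : "missing_key",
  };
}
```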

***

## 6. Deliverables

### Files To Create

| File                                                                        | Purpose                                                                                         | Est. Lines |
| --------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- | ---------- |
| `scripts/lib/voice-broker.ts`                                               | Broker request helpers, base allowlist, origin and Host checks, Realtime request building.      | \~240      |
| `scripts/lib/voice-launch-bridge.ts`                                        | Register `/__start_voice`, manage broker process lifecycle, health polling, and safe responses. | \~220      |
| `voice-lab/server.ts`                                                       | Loopback Bun broker with `/api/health` and `/api/session` only.                                 | \~160      |
| `voice-lab/.env.example`                                                    | Short placeholder env example for `OPENAI_API_KEY`, `OPENAI_BASE_URL`, and `PORT`.              | \~25       |
| `scripts/lib/__tests__/voice-broker.test.ts`                                | Tests for health, token, origin, Host, missing key, base allowlist, and provider failures.      | \~220      |
| `scripts/lib/__tests__/voice-launch-bridge.test.ts`                         | Tests for `/__start_voice` guards, env-only spawn, idempotent start, and health timeout.        | \~220      |
| `.spec_system/specs/phase38-session08-voice-broker/implementation-notes.md` | Evidence log for upstream skips, broker smoke, security checks, and configured-token proof.     | \~140      |

### Files To Modify

| File                                                      | Changes                                                                                          | Est. Lines |
| --------------------------------------------------------- | ------------------------------------------------------------------------------------------------ | ---------- |
| `package.json`                                            | Add `voice` script that runs `voice-lab/server.ts`.                                              | \~5        |
| `tsconfig.scripts.json`                                   | Include `voice-lab/**/*.ts` in script typechecking.                                              | \~5        |
| `vite.config.ts`                                          | Import and register the voice launch bridge using existing token, env reader, and guard helpers. | \~45       |
| `.claude/launch.json`                                     | Add a runnable `voice-lab` launch target for port 8099.                                          | \~10       |
| `docs/local-voice-setup.md`                               | Update current-state broker setup, env-only provider policy, and verification checklist.         | \~80       |
| `docs/intelligence-view.md`                               | Note that broker transport exists while portal/visualizers remain Session 09-owned.              | \~25       |
| `scripts/lib/__tests__/local-control-plane-guard.test.ts` | Add `/__start_voice` to representative privileged endpoint guard coverage.                       | \~5        |

***

## 7. Success Criteria

### Functional Requirements

* [ ] `bun run voice` starts a broker whose loopback health endpoint returns `ok: true`, keyed status, and the sanitized base URL.
* [ ] `POST /api/session` mints a Realtime session credential when configured with a valid key and same-run token.
* [ ] Missing key, bad token, hostile origin, hostile Host header, invalid base URL, provider failure, and wrong method return controlled errors.
* [ ] `POST /__start_voice` starts the broker through Vite without exposing keys in argv, browser state, logs, or committed files.
* [ ] `OPENAI_BASE_URL` defaults to `https://api.openai.com` and accepts only approved OpenAI or loopback-compatible endpoints.
* [ ] Upstream `voice-lab/index.html` and `/api/sample` are recorded as skipped, with Session 09 named as the real product surface.

### Testing Requirements

* [ ] Unit tests cover broker health, request parsing, token checks, origin and Host rejection, base allowlist, missing key, provider success, provider failure, and safe error mapping.
* [ ] Bridge tests cover `/__start_voice` method, loopback, Host-header, token, env-only spawn, already-running, health timeout, and no-key states.
* [ ] Guard tests include `/__start_voice` as a representative privileged Vite endpoint.
* [ ] Runtime verification covers `bun run voice`, health polling, and a configured token-mint path when provider credentials are available.

### Non-Functional Requirements

* [ ] All files are ASCII-encoded with Unix LF line endings.
* [ ] Code follows project conventions and existing bridge patterns.
* [ ] No provider key, bearer token, raw provider response, prompt, transcript, local username, or private path is exposed to browser state, docs, logs, fixtures, or tests.
* [ ] Public demo behavior remains unchanged and does not call local voice bridge routes.

### Quality Gates

* [ ] `bun run test -- scripts/lib/__tests__/voice-broker.test.ts scripts/lib/__tests__/voice-launch-bridge.test.ts scripts/lib/__tests__/local-control-plane-guard.test.ts`
* [ ] `bun run typecheck:scripts`
* [ ] `bun run lint`
* [ ] ASCII/LF and secret-pattern sweeps pass for session deliverables.

***

## 8. Implementation Notes

### Working Assumptions

* `voice-lab/server.ts` should be included in `typecheck:scripts`: the current `tsconfig.scripts.json` includes only `scripts/**/*.ts`, so adding `voice-lab/**/*.ts` is the smallest way to typecheck the new broker command.
* The broker token should be the Vite same-run token when launched from `/__start_voice`: existing AI OS local writes use `X-Claude-OS-Token`, and Session 08 requires bad-token rejection without creating a separate browser token distribution path.
* `voice-lab/.env.example` should use short placeholder values: repo guidance requires placeholders shorter than real key patterns, and Phase 06 docs already use `OPENAI_API_KEY=key`.
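
The same-run token assumption could be enforced with a check along these lines. This is a sketch only: `hasValidSameRunToken` and `equalWithoutEarlyExit` are hypothetical names, and it assumes the broker receives the token in the `X-Claude-OS-Token` header with header names already lowercased, as Node does for incoming requests.

```typescript
/** Compare two strings without returning at the first mismatch. */
function equalWithoutEarlyExit(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i += 1) {
    // Out-of-range positions yield NaN, which is treated as 0 here.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

function hasValidSameRunToken(
  headers: Record<string, string | undefined>,
  expected: string,
): boolean {
  const presented = headers["x-claude-os-token"];
  // A missing header or an unset expected token is always a rejection.
  if (!presented || !expected) return false;
  return equalWithoutEarlyExit(presented, expected);
}
```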

### Conflict Resolutions

* Upstream `voice-lab/server.ts` accepts a browser-supplied `key` and exposes `/api/sample`; the Phase 38 Session 08 stub and `docs/local-voice-setup.md` require a token-only broker whose provider keys come from the environment. The AI OS plan follows the Phase 38 policy and skips both browser-supplied keys and sample TTS.
* Upstream `/__start_voice` reads `key` and `base` from the request body; AI OS policy says provider keys and base URL must not be browser-configured. The AI OS route will read provider env from the server process and pass it to the child environment only.
* Upstream patch line references for the Vite route are stale relative to the current upstream file, but targeted search found `/__start_voice` around the current upstream Vite voice block. Planning can proceed because the session stub defines the required AI OS route behavior.

### Key Considerations

* The security record has no open findings, so this session should preserve that posture by treating voice as new local control-plane work.
* The broker is a prerequisite for Session 09, not the portal itself.
* Documentation must not claim spoken Hermes answers or visualizers until Session 09 wires the portal to real `/__hermes_chat` events.

### Potential Challenges

* Real Realtime token proof needs credentials: use focused provider-mock tests for automation and record an exact configured-run result when credentials are present.
* Long-running broker lifecycle can leak processes: keep `/__start_voice` idempotent, poll health, and kill or replace stale or failed children.
* Provider errors can leak payloads: sanitize response text and expose stable error codes rather than raw upstream bodies.
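
For the provider-error challenge above, one possible shape is to map upstream failures to stable codes and to scrub any text that is logged. The code names, the HTTP statuses returned to the caller, and the two secret patterns below are illustrative assumptions, not the project's actual redaction rules.

```typescript
type BrokerErrorCode = "provider_unauthorized" | "provider_timeout" | "provider_error";

interface SafeFailure {
  status: number;
  body: { ok: false; code: BrokerErrorCode };
}

/** Map an upstream status (or a timeout) to a stable, payload-free error. */
function mapProviderFailure(upstream: number | "timeout"): SafeFailure {
  if (upstream === "timeout") {
    return { status: 504, body: { ok: false, code: "provider_timeout" } };
  }
  if (upstream === 401 || upstream === 403) {
    return { status: 502, body: { ok: false, code: "provider_unauthorized" } };
  }
  // Everything else collapses to one code; the raw body is never forwarded.
  return { status: 502, body: { ok: false, code: "provider_error" } };
}

/** Replace bearer headers and key-shaped strings before text reaches a log. */
function redactSecretShapedText(text: string): string {
  return text
    .replace(/Bearer\s+[A-Za-z0-9._~+\/-]+=*/gi, "Bearer [redacted]")
    .replace(/\bsk-[A-Za-z0-9_-]{8,}/g, "[redacted]");
}
```

Returning only a code keeps the browser contract stable even when the provider changes its error bodies.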

### Relevant Considerations

* \[P04] **Hermes bridge guardrails must stay intact**: voice launch must keep loopback, token, Host-header, and disabled-by-default admin boundaries.
* \[P00] **Stack conventions**: use Bun, Vite 8, TypeScript, and Vitest without adding runtime dependencies.
* \[P21] **Token-shaped strings stay out of browser state and logs**: apply the same redaction posture to provider keys and Realtime credentials.
* \[P02] **Do not expose secret env values through browser state**: expose presence/readiness only, not values.

### Behavioral Quality Focus

Checklist active: Yes

Top behavioral risks for this session:

* Broker and launch endpoints are external input handlers and must use schema-validated input, local guards, and explicit error mapping.
* `/__start_voice` is a state-mutating process action and must prevent duplicate starts while launch is in flight.
* OpenAI Realtime token minting is an external system call and must handle timeout, provider failure, invalid credentials, and sanitized recovery paths.
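
The duplicate-start risk above can be addressed with a single-flight wrapper, sketched below. `createSingleFlightLauncher` and `LaunchOutcome` are hypothetical names; the real bridge would also need to check whether a healthy broker is already running before launching again.

```typescript
type LaunchOutcome = { ok: boolean; code: string };

/**
 * Wrap a launch function so concurrent callers share one in-flight launch
 * instead of each spawning a broker process.
 */
function createSingleFlightLauncher(launch: () => Promise<LaunchOutcome>) {
  let inFlight: Promise<LaunchOutcome> | null = null;
  return function start(): Promise<LaunchOutcome> {
    if (inFlight) return inFlight;
    // Clear the slot once the launch settles so a later retry is possible.
    inFlight = launch().finally(() => {
      inFlight = null;
    });
    return inFlight;
  };
}
```

Two requests that arrive while the first launch is still pending receive the same promise, so the launch function runs once.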

***

## 9. Testing Strategy

### Unit Tests

* Test broker helpers for origin, Host, token, base allowlist, health payload, request shape, missing key, provider success, provider failure, and redaction.
* Test launch bridge helpers for method rejection, non-loopback rejection, hostile Host rejection, bad token, env-only spawn, already-running health, spawn failure, and timeout.

### Integration Tests

* Add the voice route to representative local control-plane guard coverage.
* Run targeted Vitest files for broker, launch bridge, and control-plane guard behavior.

### Runtime Verification

* Run `bun run voice` and verify `http://127.0.0.1:8099/api/health` returns a controlled health payload.
* When `OPENAI_API_KEY` is configured, verify a real browser or local fetch can mint a Realtime session token through the broker using the same-run token.
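
The health polling step can be sketched with an injected fetch function so it is testable without a running broker. `waitForBrokerHealth` and `HealthFetch` are hypothetical names, and the attempt count and delay are caller-supplied assumptions rather than values taken from this spec.

```typescript
type HealthFetch = (url: string) => Promise<{ ok: boolean }>;

/** Poll a health URL until it responds OK or the attempts run out. */
async function waitForBrokerHealth(
  fetchFn: HealthFetch,
  url: string,
  attempts: number,
  delayMs: number,
): Promise<boolean> {
  for (let i = 0; i < attempts; i += 1) {
    try {
      const res = await fetchFn(url);
      if (res.ok) return true;
    } catch {
      // Connection errors are expected while the broker is starting; retry.
    }
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

In a runtime with a global `fetch`, a caller could pass `(u) => fetch(u)` with `http://127.0.0.1:8099/api/health` as the URL; a `false` result corresponds to the health timeout case.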

### Edge Cases

* Missing `OPENAI_API_KEY`.
* Invalid or arbitrary remote `OPENAI_BASE_URL`.
* Hostile `Origin` and hostile `Host` headers.
* Wrong method and malformed JSON body.
* Provider timeout, provider 401, provider malformed JSON, and provider body containing secret-shaped text.
* Existing broker already running, stale child process, and health timeout.

***

## 10. Dependencies

### Other Sessions

* Depends on: `phase38-session05-runtime-bridge-hardening`, `phase38-session06-policy-docs-and-catalogs`, `phase38-session07-dream-engine-product-integration`
* Depended on by: `phase38-session09-intelligence-portal`, `phase38-session10-hunk-reconciliation-and-release-gate`

***

## Next Steps

Run the `implement` workflow step to begin implementation.

