# Local Voice Setup

This document records the current AI OS voice policy after Phase 40 Session 16. It covers the local broker plus the Hermes Intelligence portal that consumes it.

## Current Status

AI OS ships a local loopback voice broker command:

```bash
bun run voice
```

The broker listens on `127.0.0.1:8099` by default and exposes only:

* `GET /api/health` for safe readiness metadata.
* `POST /api/session` for Realtime session credentials, gated by the same-run token.

AI OS also includes a `voice-lab` target in `.claude/launch.json` and a Vite `POST /__start_voice` route that can start the broker from the local dev server when `OPENAI_API_KEY` is configured in the ignored local environment. `/__start_voice` accepts an empty request body or `{}`. Any browser body with fields is treated as an attempted provider configuration and rejected.
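
The body rule for `/__start_voice` can be sketched as a small predicate. This is a minimal illustration, not the route's actual code: the helper name `isAcceptableStartVoiceBody` is hypothetical, and treating whitespace-only bodies as empty and rejecting non-object JSON are assumptions that go beyond what is stated above.

```typescript
// Hypothetical helper (not from the AI OS codebase): decides whether a raw
// POST /__start_voice body is acceptable under the rule described above.
function isAcceptableStartVoiceBody(rawBody: string): boolean {
  const trimmed = rawBody.trim();
  if (trimmed === "") return true; // empty request body (assumption: whitespace counts as empty)

  let parsed: unknown;
  try {
    parsed = JSON.parse(trimmed);
  } catch {
    return false; // assumption: bodies that are not valid JSON are rejected
  }

  // Only a plain object with zero fields passes; arrays, null, and scalars do not.
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    return false;
  }
  return Object.keys(parsed).length === 0;
}
```

With this sketch, `""` and `"{}"` pass, while any object that carries a field is refused before it can be read as provider configuration.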

The Hermes Intelligence portal is available from `/agents/hermes` through the Chat tab launcher or direct `?intel=1` entry state. It can start the broker, request a Realtime session with the same-run token, handle Realtime `ask_hermes` tool calls, and route those calls through the existing `/__hermes_chat` SSE bridge.

The local product access default is full local write/edit readiness. The current `/__hermes_chat` dependency still follows the Phase 40 Hermes write gate until the all-access migration replaces manual admin opt-in. Do not describe live voice as fully delivered unless the local path reaches real provider-backed execution, visible transcript/results, recovery states, and tests.

Current no-reprompt parity means AI OS can respawn the broker from the ignored local environment when the dev server already has `OPENAI_API_KEY`. It does not mean the browser stores or resubmits an OpenAI key.

AI OS still does not ship the upstream standalone voice demo or `/api/sample` TTS path. The product surface is the Hermes Intelligence portal.

## Provider Policy

Voice provider configuration is environment-only:

* Put real provider keys in `.env.local`, shell environment, or another ignored local secret store.
* Use `voice-lab/.env.example` only as a short-placeholder template.
* Do not store provider keys in browser localStorage, sessionStorage, query strings, committed fixtures, docs, or request bodies.
* `POST /__start_voice` reads `OPENAI_API_KEY` and optional `OPENAI_BASE_URL` from the server environment and passes them to the broker child process by environment only.
* Browser code sends `{}` to `/__start_voice`; it never sends provider keys, provider base URLs, or provider config.
* Browser code sends only `voice` and `mode` to `POST /api/session`, plus the same-run token in the `X-Claude-OS-Token` header.
* Browser code may receive only short-lived, scoped voice session credentials from `POST /api/session`.
* Logs and errors must not expose provider keys, bearer tokens, local usernames, home-directory paths, prompts, transcripts, raw provider responses, or secret-shaped strings.
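
As an illustration of the logging rule in the last item, the sketch below scrubs a message before it reaches a log or an error response. The patterns and the `redactForLog` name are assumptions chosen for the example, not the project's actual redaction rules. A pattern-based pass like this cannot recognize prompts, transcripts, or raw provider responses, so those should never be handed to the logger in the first place.

```typescript
// Illustrative patterns only (assumptions): bearer tokens, `sk-` style keys,
// and the user segment of common home-directory paths.
const REDACTION_RULES: Array<[RegExp, string]> = [
  [/Bearer\s+[A-Za-z0-9._~+\/-]+=*/g, "Bearer [redacted]"],
  [/\bsk-[A-Za-z0-9_-]{8,}/g, "[redacted-key]"],
  [/(\/(?:Users|home)\/)[^\/\s"']+/g, "$1[user]"],
];

// Hypothetical helper: applies every rule in order and returns the scrubbed text.
function redactForLog(message: string): string {
  return REDACTION_RULES.reduce(
    (text, [pattern, replacement]) => text.replace(pattern, replacement),
    message,
  );
}
```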

Use placeholder values shorter than real key patterns in docs and tests, for example `OPENAI_API_KEY=key`.

## Broker Token

`POST /api/session` requires the `X-Claude-OS-Token` header to match the broker token.

When Vite starts the broker through `POST /__start_voice`, it passes the current same-run dev token to the child as `AI_OS_VOICE_TOKEN`. When running `bun run voice` directly for local smoke tests, set `AI_OS_VOICE_TOKEN` to a short local value and send the same value in `X-Claude-OS-Token`.
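
The environment handoff can be sketched as a pure function that selects what the broker child receives. This is a minimal sketch under stated assumptions: `buildBrokerEnv` is a hypothetical name, and a real spawn would normally also forward unrelated process variables such as `PATH`, which this example leaves out.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: picks the provider settings from the server environment
// and adds the same-run token. Nothing is read from a browser request.
function buildBrokerEnv(serverEnv: Env, sameRunToken: string): Record<string, string> {
  const apiKey = serverEnv["OPENAI_API_KEY"];
  if (!apiKey) {
    // The message names the variable but never echoes a value.
    throw new Error("OPENAI_API_KEY is not configured in the server environment");
  }
  const childEnv: Record<string, string> = {
    OPENAI_API_KEY: apiKey,
    AI_OS_VOICE_TOKEN: sameRunToken,
  };
  const baseUrl = serverEnv["OPENAI_BASE_URL"];
  if (baseUrl) {
    childEnv["OPENAI_BASE_URL"] = baseUrl; // optional override, forwarded as-is
  }
  return childEnv;
}
```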

The same-run token is a local bridge gate, not a provider credential. Do not save it in docs, fixtures, browser storage, or reusable scripts.
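
Both sides of the gate can be sketched as follows. The function names are hypothetical and the comparison strategy is an assumption; the source states only that the header must match the broker token. The browser side keeps the token in memory and sends it in the header, and the broker side compares without returning early on the first differing character.

```typescript
const TOKEN_HEADER = "X-Claude-OS-Token";

// Browser side (hypothetical helper): builds the fetch arguments for POST /api/session.
// The body carries only `voice` and `mode`; the token travels in the header.
function buildSessionRequest(voice: string, mode: string, sameRunToken: string) {
  return {
    url: "http://127.0.0.1:8099/api/session", // default broker address
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        [TOKEN_HEADER]: sameRunToken,
      },
      body: JSON.stringify({ voice, mode }),
    },
  };
}

// Broker side (hypothetical helper): true only when the presented header equals
// the broker token. The length check leaks only the token length.
function tokenMatches(presented: string | null, expected: string): boolean {
  if (presented === null || expected === "" || presented.length !== expected.length) {
    return false;
  }
  let diff = 0;
  for (let i = 0; i < expected.length; i++) {
    diff |= presented.charCodeAt(i) ^ expected.charCodeAt(i);
  }
  return diff === 0;
}
```

A caller would pass `url` and `init` to `fetch` and hold the token in a variable for the lifetime of the page, never in browser storage.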

## `OPENAI_BASE_URL` Policy

`OPENAI_BASE_URL` defaults to:

* `https://api.openai.com`

Optional overrides are accepted only when they are compatible loopback bases:

* `http://127.0.0.1:<port>`
* `http://localhost:<port>`
* `http://[::1]:<port>`
* HTTPS variants of those loopback hosts.

Remote non-OpenAI hosts are rejected. Userinfo, query strings, fragments, and arbitrary remote domains are not allowed because a provider key must never be redirected to an untrusted host.
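
The policy above can be sketched as a validator built on the standard `URL` parser. This is an illustration under stated assumptions rather than the broker's actual check: the `resolveOpenAiBaseUrl` name is hypothetical, an explicit `https://api.openai.com` override is assumed to be acceptable, and the sketch does not require a port on loopback hosts.

```typescript
const DEFAULT_BASE_URL = "https://api.openai.com";
// WHATWG URL reports IPv6 hostnames with their brackets, hence "[::1]".
const LOOPBACK_HOSTS = new Set(["127.0.0.1", "localhost", "[::1]"]);

// Hypothetical validator: returns the base URL to use, or throws on a disallowed
// override. Error messages never echo the rejected value.
function resolveOpenAiBaseUrl(override: string | undefined): string {
  if (override === undefined || override.trim() === "") return DEFAULT_BASE_URL;
  const raw = override.trim();

  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    throw new Error("OPENAI_BASE_URL is not a valid URL");
  }

  if (url.username !== "" || url.password !== "" || raw.includes("@")) {
    throw new Error("OPENAI_BASE_URL must not contain userinfo");
  }
  if (raw.includes("?") || raw.includes("#")) {
    throw new Error("OPENAI_BASE_URL must not contain a query string or fragment");
  }

  const isOpenAi = url.protocol === "https:" && url.hostname === "api.openai.com";
  const isLoopback =
    (url.protocol === "http:" || url.protocol === "https:") &&
    LOOPBACK_HOSTS.has(url.hostname);
  if (!isOpenAi && !isLoopback) {
    throw new Error("OPENAI_BASE_URL must be the OpenAI default or a loopback host");
  }
  return raw;
}
```

Matching on the parsed hostname rather than a string prefix means a lookalike such as `http://localhost.example.com` is rejected.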

The current repository does not expose a UI for setting `OPENAI_BASE_URL` and does not persist it in browser storage.

## Verification

Local health smoke:

```bash
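# Terminal 1: start the broker (it keeps running in the foreground)
bun run voice
# Terminal 2: query the health endpoint
curl http://127.0.0.1:8099/api/health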
```

Configured session mint smoke, when provider credentials are available:

```bash
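# Terminal 1: start the broker with a short local token
AI_OS_VOICE_TOKEN=voice-token bun run voice
# Terminal 2: request a session, sending the same token in the header
curl -X POST http://127.0.0.1:8099/api/session \
  -H 'Content-Type: application/json' \
  -H 'X-Claude-OS-Token: voice-token' \
  --data '{"voice":"marin","mode":"companion"}'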
```

The response may include a short-lived Realtime credential. Do not copy that credential into docs, fixtures, logs, or screenshots.

Portal smoke path:

```
/agents/hermes?intel=1
```

In local live mode, the Start voice control calls `POST /__start_voice`, then `POST http://127.0.0.1:8099/api/session`, and then routes `ask_hermes` tool calls to `/__hermes_chat`. If provider credentials are unavailable, record only the controlled recovery state. Mocked broker or provider tests prove the boundary behavior, not a real live provider result. Do not claim a real spoken provider result without a safe local credential-backed run.

## Local Speech Engines

Local speech engines are allowed only behind an OpenAI-compatible loopback base URL. The local engine must implement the Realtime session credential contract used by the broker before AI OS can advertise it as a working voice path.

## Upstream Skips

The upstream browser key persistence flow is intentionally not ported. AI OS preserves the no-unnecessary-reprompt outcome through environment-backed broker respawn and same-run tokens.

The upstream standalone `voice-lab/index.html` demo, `/api/sample` TTS path, obsolete `src/components/agent-core-3d.tsx`, and internal `INTEGRATION.md` merge notes are not AI OS product surfaces.

