# Session Specification

* **Session ID**: `phase36-session03-enemy-and-boss-sfx-pack`
* **Phase**: 36 - AI Rogue Audio Asset Finishing
* **Status**: Not Started
* **Created**: 2026-06-28

***

## 1. Session Overview

This session generates, optimizes, documents, and wires the first enemy-family and boss identity SFX pack for AI Rogue. It builds directly on Session 02 metadata by turning enemy kind, actor kind, target kind, and audio intent into distinct runtime cues for fast enemies, thieves, corruption enemies, sentries, and the Kernel Sentinel boss.

This session comes next because Phase 36 Sessions 01 and 02 are complete: the current audio baseline has browser-path evidence, and enemy-related runtime events now carry typed metadata without changing deterministic simulation state. Session 03 is the first step in the phase that should commit new enemy and boss SFX files and prove that those files route through real AI Rogue encounters.

The product surface is normal AI Rogue gameplay. This session should not add debug UI, remote game-content loading, hosted writes, collectors, analytics, or new simulation rules. New media stays limited to committed local Ogg Opus SFX, with provenance stored beside the files and fallback behavior handled in the audio runtime.

***

## 2. Objectives

1. Generate and commit compact enemy-family and boss SFX assets for tactical recognition during combat.
2. Extend typed cue IDs, audio file mapping, and metadata dispatch so enemy and boss events choose family-specific cues with safe generic fallbacks.
3. Preserve autoplay unlock, mute, volume preferences, deterministic variant behavior, silent fallback, and no-remote-loading boundaries.
4. Update SFX provenance and durable AI Rogue audio documentation, then verify focused tests, asset-size policy, and browser playback paths.

***

## 3. Prerequisites

### Required Sessions

* [x] `phase36-session01-current-audio-balance-audit` - Provides browser-path evidence for the current 45-cue baseline and identifies enemy/boss identity as the next high-value audio gap.
* [x] `phase36-session02-enemy-audio-metadata` - Provides typed metadata for enemy attack, hit, defeat, sentry telegraph/fire, and boss routing.
* [x] `phase35-session10-final-release-gate` - Provides the current AI Rogue Production Go posture, privacy boundaries, asset checks, and release gate constraints.

### Required Tools Or Knowledge

* Bun 1.3.14 project scripts.
* `ffmpeg` and `ffprobe` for Ogg Opus conversion and inspection.
* `ELEVENLABS_API_KEY` available only in local environment or `.env.local`.
* Existing SFX generation flow in `scripts/generate-ai-rogue-audio-sfx.ts`.
* Existing audio dispatch and metadata contracts in `src/extensions/ai-rogue/runtime/audio.ts` and `src/extensions/ai-rogue/runtime/types-simulation.ts`.

### Environment Requirements

* Run from the repository root.
* Generate raw masters only under `tmp/elevenlabs/ai-rogue-sfx/`.
* Commit only optimized runtime `.ogg` files under `src/assets/ai-rogue/audio/sfx/`.
* Keep generated session artifacts ASCII-only with Unix LF line endings.

***

## 4. Scope

### In Scope (MVP)

* AI Rogue players can hear distinct compact cues for small fast enemies, packet-thief grab/escape identity, corruption enemy crackle identity, sentry charge/fire, and Kernel Sentinel charge/fire/hit/shutdown.
* Runtime SFX files are generated as mono 48 kHz Ogg Opus assets and kept under the 200 KB non-music asset cap.
* `src/assets/ai-rogue/audio/sfx/provenance.json` records prompt, request, render, and source metadata for every new committed SFX file.
* Audio dispatch uses Session 02 metadata for family-specific cue selection while retaining explicit cue precedence, existing generic fallbacks, mute, volume, autoplay unlock, and silent no-op behavior.
* Focused tests prove cue mapping, metadata dispatch, stable fallback behavior, and unchanged combat behavior around sentry and boss events.

### Out Of Scope (Deferred)

* Theme music, theme ambience, or `snapshot.theme` routing - Reason: `phase36-session04-theme-audio-routing-contract` and Session 05 own that work.
* Adaptive stingers, second music bus, ducking, or transient music gain automation - Reason: Sessions 06 and 07 own adaptive-audio engine and asset decisions.
* Visual asset finishing or atlas work - Reason: Phase 37 owns visual asset finishing.
* Remote game-content loading or runtime downloads - Reason: Phase 36 requires committed local assets only.
* Weakening media policy to accept oversized SFX - Reason: generated SFX must stay within the existing 200 KB non-music asset cap.
* Changing simulation decisions, RNG, replay behavior, save data, or gameplay rules for audio selection - Reason: audio remains presentation-only.

***

## 5. Technical Approach

### Architecture

Extend the existing SFX pipeline rather than adding a new media workflow. Add the enemy/boss prompt entries to `scripts/generate-ai-rogue-audio-sfx.ts`, use its `tmp/elevenlabs/ai-rogue-sfx/raw/` scratch path, transcode through the existing `ffmpeg` loudnorm/Ogg Opus path, and merge the generated report into `src/assets/ai-rogue/audio/sfx/provenance.json`.
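
The transcode step described above can be sketched as an argument builder for `ffmpeg`. This is a minimal illustration, not the contents of `scripts/generate-ai-rogue-audio-sfx.ts`: the helper name `buildOpusTranscodeArgs`, the options shape, the 48 kbps default bitrate, and the use of `loudnorm` with its default settings are all assumptions. Only the mono, 48 kHz, Ogg Opus targets come from this spec.

```typescript
// Hypothetical options shape; the real generator may track different fields.
interface TranscodeOptions {
  inputPath: string; // raw master, e.g. under tmp/elevenlabs/ai-rogue-sfx/raw/
  outputPath: string; // optimized runtime .ogg file
  bitrateKbps?: number; // assumed default below, not a value from the spec
}

// Builds the ffmpeg argument list for one mono 48 kHz Ogg Opus transcode.
function buildOpusTranscodeArgs(options: TranscodeOptions): string[] {
  const bitrate = options.bitrateKbps ?? 48;
  return [
    "-y", // overwrite an existing output file
    "-i", options.inputPath,
    "-af", "loudnorm", // loudness normalization filter with default settings
    "-ac", "1", // mono
    "-ar", "48000", // 48 kHz sample rate
    "-c:a", "libopus", // Opus codec; the .ogg extension selects the Ogg container
    "-b:a", `${bitrate}k`,
    options.outputPath,
  ];
}
```

The resulting array could be passed to whatever process-spawning helper the script already uses. Keeping the arguments in one pure function makes the mono and sample-rate requirements easy to assert in a test.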

Add new cue IDs to `AiRogueSimulationAudioCueId` and map them in `SFX_FILE_BASENAMES`. Keep `dispatchEventSound()` priority intact: explicit `event.audioCues` still win, metadata-derived cues route next, and legacy message/type fallback remains only for metadata-free events. Because `resolveEnemyAttack()` currently emits explicit generic `enemy_melee` cues, update that event construction to emit family-specific explicit cues when the enemy kind is known and to retain shield cues when shields absorb or break.
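
The precedence order above can be sketched as a small pure function. This is an illustration under stated assumptions rather than the real `dispatchEventSound()`: the `SimEvent` shape, the `audioMetadata` field name, and the two resolver callbacks are hypothetical. Only the ordering (explicit cues, then metadata, then legacy fallback for metadata-free events) comes from this spec.

```typescript
type CueId = string;

// Hypothetical, reduced metadata shape for illustration only.
interface AudioMetadata {
  enemyKind?: string;
  intent?: string;
}

// Hypothetical event shape; `audioMetadata` is an assumed field name.
interface SimEvent {
  type: string;
  audioCues?: CueId[];
  audioMetadata?: AudioMetadata;
}

// Returns the cues to play for one event, choosing exactly one source.
function selectCues(
  event: SimEvent,
  fromMetadata: (metadata: AudioMetadata) => CueId[],
  fromLegacy: (event: SimEvent) => CueId[],
): CueId[] {
  // 1. Explicit cues always win.
  if (event.audioCues && event.audioCues.length > 0) return event.audioCues;
  // 2. Metadata-bearing events route through metadata only, so they can
  //    never also trigger the legacy fallback (no double-trigger).
  if (event.audioMetadata) return fromMetadata(event.audioMetadata);
  // 3. Legacy message/type fallback is reserved for metadata-free events.
  return fromLegacy(event);
}
```

In this sketch the metadata resolver is responsible for returning a generic cue, or an empty list, when the enemy kind is unknown. That keeps the legacy path strictly limited to events that carry no metadata at all.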

Group enemy metadata conservatively. Small fast enemies are `signal-gnat`, `ping-mosquito`, `index-skink`, and `cache-wraith`; thief identity is `packet-thief`; corruption identity is `corrupt-newt`, `venom-daemon`, `burst-toad`, and `insight-beetle`; sentry identity is `firewall-sentry`; boss identity is `kernel-sentinel`. `errant-process` keeps existing generic enemy melee, hit, and defeated fallback behavior.
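
The grouping above can be written as an explicit lookup table. The enemy kind strings are taken from this section. The `EnemyFamily` type, the `ENEMY_FAMILY_BY_KIND` map, and the `enemyFamilyFor` helper are hypothetical names used only for illustration.

```typescript
// Hypothetical family labels; "generic" stands for the existing baseline cues.
type EnemyFamily = "fast" | "thief" | "corruption" | "sentry" | "boss" | "generic";

// One explicit entry per enemy kind, so each mapping can be tested by kind.
const ENEMY_FAMILY_BY_KIND = new Map<string, EnemyFamily>([
  ["signal-gnat", "fast"],
  ["ping-mosquito", "fast"],
  ["index-skink", "fast"],
  ["cache-wraith", "fast"],
  ["packet-thief", "thief"],
  ["corrupt-newt", "corruption"],
  ["venom-daemon", "corruption"],
  ["burst-toad", "corruption"],
  ["insight-beetle", "corruption"],
  ["firewall-sentry", "sentry"],
  ["kernel-sentinel", "boss"],
  ["errant-process", "generic"],
]);

// Unknown or missing kinds fall back to the generic family.
function enemyFamilyFor(kind: string | undefined): EnemyFamily {
  if (kind === undefined) return "generic";
  return ENEMY_FAMILY_BY_KIND.get(kind) ?? "generic";
}
```

Listing every kind explicitly, instead of matching on name patterns, keeps the mapping conservative: a newly added enemy kind stays on generic cues until someone deliberately assigns it a family.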

### Design Patterns

* Local asset pipeline: Generate scratch masters locally and commit only optimized runtime assets with provenance.
* Additive cue contract: New cue IDs extend the union without removing generic fallbacks used by older events and tests.
* Presentation-only dispatch: Metadata guides sound selection but never feeds back into simulation state or RNG.
* Explicit fallback boundaries: Missing assets, failed fetches, failed decodes, unknown metadata, and Web Audio unavailability remain silent-safe.
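
The silent-safe boundary in the last pattern can be illustrated with a small wrapper. This is a sketch under assumptions: `playCueSafely`, the URL map, and the injected `play` callback are hypothetical and stand in for whatever the audio runtime actually does when it fetches, decodes, and plays a buffer.

```typescript
// Attempts to play a cue and reports whether playback was started.
// It never throws: a missing URL or a failing playback path is a silent no-op.
function playCueSafely(
  cueId: string,
  urlByCue: ReadonlyMap<string, string>,
  play: (url: string) => void,
): boolean {
  const url = urlByCue.get(cueId);
  // Missing asset mapping: stay silent instead of throwing in gameplay.
  if (url === undefined) return false;
  try {
    play(url);
    return true;
  } catch {
    // Playback failures (for example Web Audio being unavailable) are swallowed.
    return false;
  }
}
```

Returning a boolean instead of throwing lets tests assert the fallback path directly while gameplay code can ignore the result.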

***

## 6. Deliverables

### Files To Create

| File                                                                                   | Purpose                                                             | Est. Lines |
| -------------------------------------------------------------------------------------- | ------------------------------------------------------------------- | ---------- |
| `src/assets/ai-rogue/audio/sfx/46_enemy_fast_attack.ogg`                               | Small fast enemy attack tick runtime cue.                           | binary     |
| `src/assets/ai-rogue/audio/sfx/47_enemy_fast_defeated.ogg`                             | Small fast enemy defeated runtime cue.                              | binary     |
| `src/assets/ai-rogue/audio/sfx/48_enemy_thief_grab.ogg`                                | Packet Thief grab or strike runtime cue.                            | binary     |
| `src/assets/ai-rogue/audio/sfx/49_enemy_thief_escape.ogg`                              | Packet Thief defeated or escape runtime cue.                        | binary     |
| `src/assets/ai-rogue/audio/sfx/50_enemy_corruption_attack.ogg`                         | Corruption-family attack crackle runtime cue.                       | binary     |
| `src/assets/ai-rogue/audio/sfx/51_enemy_corruption_defeated.ogg`                       | Corruption-family defeated collapse runtime cue.                    | binary     |
| `src/assets/ai-rogue/audio/sfx/52_enemy_sentry_charge.ogg`                             | Firewall Sentry telegraph or charge runtime cue.                    | binary     |
| `src/assets/ai-rogue/audio/sfx/53_enemy_sentry_fire.ogg`                               | Firewall Sentry line-fire runtime cue.                              | binary     |
| `src/assets/ai-rogue/audio/sfx/54_boss_charge.ogg`                                     | Kernel Sentinel charge or telegraph runtime cue.                    | binary     |
| `src/assets/ai-rogue/audio/sfx/55_boss_fire.ogg`                                       | Kernel Sentinel fire runtime cue.                                   | binary     |
| `src/assets/ai-rogue/audio/sfx/56_boss_hit.ogg`                                        | Kernel Sentinel hit reaction runtime cue.                           | binary     |
| `src/assets/ai-rogue/audio/sfx/57_boss_shutdown.ogg`                                   | Kernel Sentinel defeated shutdown runtime cue.                      | binary     |
| `.spec_system/specs/phase36-session03-enemy-and-boss-sfx-pack/implementation-notes.md` | Record generation, routing, test, asset-size, and browser evidence. | \~180      |

### Files To Modify

| File                                                                    | Changes                                                                                                            | Est. Lines |
| ----------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ | ---------- |
| `scripts/generate-ai-rogue-audio-sfx.ts`                                | Add cue definitions and prompts for the enemy/boss SFX pack.                                                       | \~170      |
| `src/extensions/ai-rogue/runtime/types-simulation.ts`                   | Add new enemy-family and boss cue IDs.                                                                             | \~20       |
| `src/extensions/ai-rogue/runtime/audio.ts`                              | Add SFX basename mappings, metadata cue selection, and volume tuning for new cues.                                 | \~140      |
| `src/extensions/ai-rogue/runtime/combat.ts`                             | Emit family-specific explicit attack cues where explicit generic cues would otherwise bypass metadata dispatch.    | \~80       |
| `src/extensions/ai-rogue/runtime/__tests__/audio.test.ts`               | Cover metadata-derived family/boss cue dispatch, explicit precedence, missing-asset fallback, and stable variants. | \~160      |
| `src/extensions/ai-rogue/runtime/__tests__/combat.test.ts`              | Cover sentry, boss, fast, thief, and corruption cue emission through combat helpers.                               | \~120      |
| `src/assets/ai-rogue/audio/sfx/provenance.json`                         | Add prompt, request, render, and raw-response metadata for files 46-57.                                            | \~260      |
| `docs/extensions/ai-rogue/game-feel.md`                                 | Update current SFX count, enemy/boss identity behavior, and remaining audio caveats.                               | \~40       |
| `docs/extensions/ai-rogue/implementation-baseline.md`                   | Update durable asset inventory from 45 SFX to the new committed count and routing summary.                         | \~20       |
| `.spec_system/specs/phase36-session03-enemy-and-boss-sfx-pack/tasks.md` | Track implementation progress during the session.                                                                  | \~40       |

***

## 7. Success Criteria

### Functional Requirements

* [ ] New enemy-family and boss cues are committed as optimized Ogg Opus assets under `src/assets/ai-rogue/audio/sfx/`.
* [ ] Small fast, thief, corruption, sentry, and boss runtime events route to distinct attack, telegraph, hit, fire, defeated, or shutdown cues where metadata identifies the family.
* [ ] Generic enemy cues remain available for `errant-process`, old metadata-free events, unknown metadata, or missing new assets.
* [ ] New cues respect Web Audio unlock, mute, master/music/SFX volume, lazy decode, cached buffers, deterministic variant seeds, and silent fallback.
* [ ] The product outcome is proven end to end: new enemy and boss cues are fetched and played during real AI Rogue encounter paths in browser verification.

### Testing Requirements

* [ ] Focused audio tests written and passing.
* [ ] Focused combat tests written and passing.
* [ ] Asset-size validation passes.
* [ ] Runtime browser verification scenarios completed.

### Non-Functional Requirements

* [ ] Every new SFX file is under 200 KB.
* [ ] Provenance covers every committed new file.
* [ ] Raw MP3 API responses remain in gitignored scratch space.
* [ ] No remote game-content loading, hosted writes, collectors, analytics, public-demo bridge calls, or new third-party runtime dependency is added.
* [ ] Audio selection does not affect simulation decisions, RNG consumption, replay behavior, save contracts, or browser-local state.
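
The size and provenance requirements above lend themselves to a simple audit function. This is an illustrative sketch, not the logic of `scripts/check-asset-sizes.sh`: `auditSfxPack` and its types are hypothetical, and reading "200 KB" as 200 * 1024 bytes is an assumption that the real check may not share.

```typescript
// Hypothetical description of one committed SFX file.
interface SfxFileInfo {
  name: string;
  bytes: number;
}

// Assumption: the 200 KB cap is interpreted as 200 * 1024 bytes.
const MAX_SFX_BYTES = 200 * 1024;

interface PackAudit {
  oversized: string[]; // files at or above the cap
  missingProvenance: string[]; // files with no provenance entry
}

// Reports every file that breaks the size cap or lacks provenance.
function auditSfxPack(files: SfxFileInfo[], provenanceNames: string[]): PackAudit {
  const covered = new Set(provenanceNames);
  return {
    oversized: files.filter((f) => f.bytes >= MAX_SFX_BYTES).map((f) => f.name),
    missingProvenance: files.filter((f) => !covered.has(f.name)).map((f) => f.name),
  };
}
```

An empty result in both arrays would correspond to the two checklist items passing for the files supplied.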

### Quality Gates

* [ ] All files are ASCII-encoded.
* [ ] All files use Unix LF line endings.
* [ ] Code follows project conventions.
* [ ] Primary user-facing surfaces contain product-facing copy only.

***

## 8. Implementation Notes

### Working Assumptions

* The existing ElevenLabs SFX generator is the correct media pipeline for this session. Evidence: `scripts/generate-ai-rogue-audio-sfx.ts` already writes raw MP3 responses to `tmp/elevenlabs/ai-rogue-sfx/raw/`, transcodes mono 48 kHz Ogg Opus assets with `ffmpeg`, and records render metadata that matches `docs/media-policy.md`. Planning can proceed because no new toolchain or remote runtime path is needed.
* Session 02 metadata is sufficient for this session without changing simulation state. Evidence: `AiRogueSimulationAudioMetadata` now carries `actorKind`, `targetKind`, `enemyKind`, `targetEnemyKind`, and `intent`, and focused Session 02 tests passed. Planning can proceed by mapping those fields to new cue IDs in the presentation layer.
* `errant-process` should keep the existing generic enemy cues in this session. Evidence: the Phase 36 stub names small fast enemies, thief, corruption enemies, sentry, and boss as the identity opportunities. Keeping the default enemy on the baseline cues limits asset count and leaves generic fallback behavior easy to verify.
* The new pack should add 12 SFX files numbered 46-57. Evidence: the existing committed SFX files are numbered 01-45 and provenance indexes match that inventory. Continuing the sequence keeps asset discovery and docs legible.

### Key Considerations

* `resolveEnemyAttack()` currently emits explicit `enemy_melee` or shield cues; explicit cue precedence means Session 03 must update those explicit attack cues instead of relying only on metadata fallback.
* Asset generation can fail because of missing credentials or provider errors. Preserve all planned code and provenance structure, and record any generation blocker in implementation notes if it occurs during `implement`.
* Browser instrumentation can prove fetch/decode/playback paths, but final acoustic quality still needs human listening in Session 08.
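
The first consideration can be sketched as a cue-selection helper for attack events. This is not the real `resolveEnemyAttack()`: the helper name, the `ShieldOutcome` type, and every cue ID except `enemy_melee` are hypothetical. The sketch also assumes a shield cue replaces the attack cue instead of playing alongside it.

```typescript
type AttackFamily = "fast" | "thief" | "corruption" | "sentry" | "boss" | "generic";
type ShieldOutcome = "none" | "absorbed" | "broken";

// Chooses the explicit cues for one enemy attack event.
function attackCuesFor(family: AttackFamily, shield: ShieldOutcome): string[] {
  // Shield cues are preserved and are never masked by family attack cues.
  if (shield === "absorbed") return ["shield_absorb"]; // hypothetical cue ID
  if (shield === "broken") return ["shield_break"]; // hypothetical cue ID
  switch (family) {
    case "fast":
      return ["enemy_fast_attack"];
    case "thief":
      return ["enemy_thief_grab"];
    case "corruption":
      return ["enemy_corruption_attack"];
    case "sentry":
      return ["enemy_sentry_fire"];
    case "boss":
      return ["boss_fire"];
    default:
      // Generic enemies such as errant-process keep the baseline cue.
      return ["enemy_melee"];
  }
}
```

Because the result is emitted as explicit cues, it takes precedence over metadata dispatch, which is why the family decision has to happen at event construction time.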

### Potential Challenges

* Generated cues may be too loud, quiet, long, or fatiguing: keep prompts short, inspect render metadata, run asset-size checks, and document browser review findings.
* Family mapping can accidentally overfit broad enemy groups: keep the mapping explicit and tested by enemy kind.
* New cue IDs can break exhaustive TypeScript records: update the cue union, SFX basename map, tests, and provenance together.
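
The exhaustive-record risk in the last item can be turned into a compile-time guard. In the sketch below, the basenames match the files listed in the deliverables table, while the cue ID spellings, the `NewCueId` type, and the `NEW_SFX_FILE_BASENAMES` constant are assumptions; the real `AiRogueSimulationAudioCueId` union and `SFX_FILE_BASENAMES` map may be shaped differently.

```typescript
// Hypothetical cue IDs for the 12 planned files (46-57).
type NewCueId =
  | "enemy_fast_attack"
  | "enemy_fast_defeated"
  | "enemy_thief_grab"
  | "enemy_thief_escape"
  | "enemy_corruption_attack"
  | "enemy_corruption_defeated"
  | "enemy_sentry_charge"
  | "enemy_sentry_fire"
  | "boss_charge"
  | "boss_fire"
  | "boss_hit"
  | "boss_shutdown";

// Record<NewCueId, string> forces one entry per union member:
// adding a cue ID without a basename fails type checking.
const NEW_SFX_FILE_BASENAMES: Record<NewCueId, string> = {
  enemy_fast_attack: "46_enemy_fast_attack",
  enemy_fast_defeated: "47_enemy_fast_defeated",
  enemy_thief_grab: "48_enemy_thief_grab",
  enemy_thief_escape: "49_enemy_thief_escape",
  enemy_corruption_attack: "50_enemy_corruption_attack",
  enemy_corruption_defeated: "51_enemy_corruption_defeated",
  enemy_sentry_charge: "52_enemy_sentry_charge",
  enemy_sentry_fire: "53_enemy_sentry_fire",
  boss_charge: "54_boss_charge",
  boss_fire: "55_boss_fire",
  boss_hit: "56_boss_hit",
  boss_shutdown: "57_boss_shutdown",
};

// Resolves the committed file name for a new cue ID.
function newSfxFileName(cueId: NewCueId): string {
  return `${NEW_SFX_FILE_BASENAMES[cueId]}.ogg`;
}
```

With this pattern, `bun run typecheck` would flag a cue ID that was added to the union but never mapped to a file, which covers the drift this item warns about.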

### Relevant Considerations

* \[P34-P35] **AI Rogue is production default-enabled**: Preserve Production Go posture and the explicit `none` disable path.
* \[P31-P35] **Public-demo and AI Rogue gates stay bundled**: Keep asset-size, no-remote, privacy, no-bridge, browser, Pages, and playthrough checks bundled for release validation.
* \[P30/P32/P34-P35] **Route-lazy runtime ownership scales**: Keep audio runtime behavior behind the Play route/local facade and narrow runtime boundary.
* \[P30/P34-P35] **Visibility gates catch real issues**: Pair focused audio tests with browser checks, build/budget, private-runtime, no-remote, Pages, playthrough, and D3 scans before widening behavior.
* \[P30/P32/P34-P35] **Do not widen AI Rogue capabilities without review**: Audio finishing must not add collectors, WebGPU-only requirements, workers, remote loading, hosted writes, analytics, or expanded content without fresh review.

### Behavioral Quality Focus

Checklist active: Yes

Top behavioral risks for this session:

* Asset-load and decode failures must remain silent-safe and must not throw in gameplay.
* Metadata-bearing events must not trigger both a family-specific cue and a legacy fallback cue.
* High-frequency combat cues must stay short, compact, and low-fatigue in repeated encounters.

***

## 9. Testing Strategy

### Unit Tests

* Extend `src/extensions/ai-rogue/runtime/__tests__/audio.test.ts` to verify metadata routing for fast, thief, corruption, sentry, and boss cue IDs, explicit cue precedence, missing/unknown metadata no-op behavior, and stable variant URL selection.
* Extend `src/extensions/ai-rogue/runtime/__tests__/combat.test.ts` to verify family-specific explicit cue emission where combat helpers have concrete enemy kinds.

### Integration Tests

* Run focused AI Rogue tests: `bun run test -- src/extensions/ai-rogue/runtime/__tests__/audio.test.ts src/extensions/ai-rogue/runtime/__tests__/combat.test.ts src/extensions/ai-rogue/runtime/__tests__/protocols.test.ts`
* Run `bun run typecheck` after cue union, mapping, and docs-facing contract updates.

### Runtime Verification

* Run `bash scripts/check-asset-sizes.sh` after committing generated Ogg files.
* Use browser verification to start AI Rogue, unlock audio, trigger or simulate enemy fast/thief/corruption/sentry/boss encounter paths, and confirm the new SFX URLs are fetched without page errors.
* Verify that mute, SFX volume, and the Web Audio unavailable fallback still behave safely after the new cues are mapped.

### Edge Cases

* Missing new asset URL for a cue ID should not throw.
* Metadata for unknown or generic enemy kind should fall back to existing generic cue behavior or no-op safely.
* Shield absorb/break events should preserve shield cues rather than masking them with family attack cues.
* Boss telegraph/fire/hit/shutdown should not consume simulation RNG or affect replay-equivalent run outcomes.
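
One way to keep variant selection deterministic without touching simulation RNG is to derive the variant index from a seed with a pure function. The sketch below is an assumption about how this could look, not the runtime's actual variant logic: `pickVariantIndex` is a hypothetical helper, and the integer mixing steps are a generic hash finalizer chosen for illustration.

```typescript
// Maps a seed to a variant index in [0, variantCount) with no shared state.
// It reads no RNG and writes nothing, so it cannot change simulation outcomes.
function pickVariantIndex(seed: number, variantCount: number): number {
  if (!Number.isInteger(variantCount) || variantCount <= 0) return 0;
  // Generic 32-bit integer mixing; the constants are illustrative.
  let h = (seed | 0) ^ 0x9e3779b9;
  h = Math.imul(h ^ (h >>> 16), 0x85ebca6b);
  h = Math.imul(h ^ (h >>> 13), 0xc2b2ae35);
  h ^= h >>> 16;
  // Convert to an unsigned value before taking the remainder.
  return (h >>> 0) % variantCount;
}
```

Because the function depends only on its arguments, replaying the same run with the same seeds selects the same variants, and a test can assert that calling it leaves the simulation RNG state unchanged.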

***

## 10. Dependencies

### Other Sessions

* Depends on: `phase36-session01-current-audio-balance-audit`, `phase36-session02-enemy-audio-metadata`
* Depended on by: `phase36-session08-final-audio-validation-and-docs`

***

## Next Steps

Run the `implement` workflow step to begin implementation.

