
# ElevenLabs Sound Effects API

**The API key for this project is stored in `.env.local` as `ELEVENLABS_API_KEY`.**

Last checked against official ElevenLabs docs: 2026-06-23.

This document describes the ElevenLabs Text to Sound Effects API as a local asset-generation engine for AI OS and extension work. Keep API keys in server-side scripts only; never expose `ELEVENLABS_API_KEY` through a `VITE_` variable or browser code.

## Official References

* [Sound effects overview](https://elevenlabs.io/docs/overview/capabilities/sound-effects)
* [Create sound effect API reference](https://elevenlabs.io/docs/api-reference/text-to-sound-effects/convert)
* [API authentication](https://elevenlabs.io/docs/api-reference/authentication)
* [API pricing](https://elevenlabs.io/pricing/api)

## Engine Summary

ElevenLabs Sound Effects converts text descriptions into generated audio effects for use cases such as game one-shots, Foley, ambience, UI feedback, trailers, and video post-production.

The API engine is request-response:

1. Send a JSON prompt and generation options to the sound-generation endpoint.
2. Receive audio bytes in the selected output format.
3. Save, trim, normalize, review, and commit only final compressed assets that comply with `docs/media-policy.md`.

For AI OS, treat the service as an offline or script-side generation dependency. Generated assets may be committed after review, but the API key and raw private working files must remain local.

## Endpoint

```
POST https://api.elevenlabs.io/v1/sound-generation
```

Required headers:

```
xi-api-key: $ELEVENLABS_API_KEY
Content-Type: application/json
```

Optional query parameters:

| Name            | Type   | Notes                                                                  |
| --------------- | ------ | ---------------------------------------------------------------------- |
| `output_format` | string | Audio format string such as `mp3_44100_128`; omit for the API default. |

`output_format` values follow the pattern `codec_sample_rate_bitrate`; for example, `mp3_44100_128` is MP3 at 44.1 kHz and 128 kbps. The official API reference lists the current enum values and plan restrictions. Use the default MP3 format unless a runtime specifically needs PCM, WAV, or a telephony format.
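
Because the codec is the first segment of the format string, a script can derive the saved file's extension from it. The following is a minimal sketch: the helper name and the codec-to-extension table are assumptions for illustration (in particular, `.pcm` for raw PCM is only a convention), and the table is deliberately incomplete so that unknown formats fail loudly.

```ts
// Hypothetical lookup table; extend it only after checking the enum values
// in the official API reference.
const EXTENSION_BY_CODEC: Record<string, string> = {
  mp3: "mp3",
  wav: "wav",
  pcm: "pcm", // assumption: raw PCM saved with a .pcm extension
};

// Hypothetical helper: returns the file extension for an output_format string.
function extensionForOutputFormat(outputFormat: string): string {
  // The codec is the first underscore-separated segment,
  // for example "mp3" in "mp3_44100_128".
  const codec = outputFormat.split("_")[0] ?? "";
  const extension = EXTENSION_BY_CODEC[codec];
  if (extension === undefined) {
    throw new Error(`No known file extension for output_format "${outputFormat}"`);
  }
  return extension;
}
```

Throwing on an unmapped codec keeps a script from silently writing, say, telephony audio into a file named `.mp3`.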

## Request Body

| Field              | Required | Default                   | Notes                                                                                          |
| ------------------ | -------- | ------------------------- | ---------------------------------------------------------------------------------------------- |
| `text`             | Yes      | N/A                       | Natural-language description of the requested sound effect.                                    |
| `duration_seconds` | No       | `null`                    | API reference range is 0.5 to 30 seconds. If omitted or `null`, ElevenLabs chooses a duration. |
| `prompt_influence` | No       | `0.3`                     | Range is 0 to 1. Higher values follow the prompt more strictly and reduce variation.           |
| `loop`             | No       | `false`                   | Requests a smooth loop. Available for `eleven_text_to_sound_v2`.                               |
| `model_id`         | No       | `eleven_text_to_sound_v2` | Sound-generation model ID. Keep explicit in scripts for reproducibility.                       |

The overview page and API reference currently disagree on the lower bound for explicit duration. The API reference is the integration contract, so scripts should enforce 0.5 to 30 seconds unless the API reference changes.
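
A script can enforce these ranges before spending credits on a request that would fail validation. This is a sketch only: the field names and ranges come from the table above, while the `SoundEffectRequest` type and `validateSoundEffectRequest` function are hypothetical names, not part of any ElevenLabs SDK.

```ts
// Mirrors the request body table; optional fields may be left out.
interface SoundEffectRequest {
  text: string;
  duration_seconds?: number | null;
  prompt_influence?: number;
  loop?: boolean;
  model_id?: string;
}

// Hypothetical validator: returns a list of problems, empty when the body looks valid.
function validateSoundEffectRequest(request: SoundEffectRequest): string[] {
  const problems: string[] = [];

  if (request.text.trim().length === 0) {
    problems.push("text must not be empty");
  }

  // null or undefined means "let ElevenLabs choose the duration".
  const duration = request.duration_seconds;
  if (duration !== undefined && duration !== null) {
    // Written as a negated range check so NaN is rejected too.
    if (!(duration >= 0.5 && duration <= 30)) {
      problems.push("duration_seconds must be between 0.5 and 30");
    }
  }

  const influence = request.prompt_influence;
  if (influence !== undefined && !(influence >= 0 && influence <= 1)) {
    problems.push("prompt_influence must be between 0 and 1");
  }

  return problems;
}
```

Returning every problem at once, rather than throwing on the first, makes batch scripts easier to debug when several prompts in a pack are misconfigured.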

## Response

The endpoint returns generated audio bytes. Save the response with a file extension that matches the selected `output_format`.

The API reference documents a `character-cost` response header. The general API introduction also recommends reading response headers such as `request-id` and `x-trace-id` when debugging generation cost or support issues.
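
The headers named above can be collected into one record for logging. The sketch below assumes that `character-cost` carries a numeric value, which this document does not confirm; the type and function names are hypothetical. It accepts anything with a `get` method, so a fetch `Response`'s `headers` object can be passed directly.

```ts
// Minimal structural type satisfied by a fetch Response's `headers` object.
interface HeaderReader {
  get(name: string): string | null;
}

interface GenerationDiagnostics {
  characterCost: number | null; // null when the header is missing or not numeric
  requestId: string | null;
  traceId: string | null;
}

// Hypothetical helper: pulls the debugging headers out of a response.
function readGenerationDiagnostics(headers: HeaderReader): GenerationDiagnostics {
  const rawCost = headers.get("character-cost");
  // Assumption: the header value is a plain number. Anything else becomes null.
  const parsedCost =
    rawCost === null || rawCost.trim() === "" ? NaN : Number(rawCost);

  return {
    characterCost: Number.isFinite(parsedCost) ? parsedCost : null,
    requestId: headers.get("request-id"),
    traceId: headers.get("x-trace-id"),
  };
}
```

In a generation script this would be called as `readGenerationDiagnostics(response.headers)` right after the request completes, before the body is consumed or discarded.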

## cURL Example

```bash
curl -X POST "https://api.elevenlabs.io/v1/sound-generation?output_format=mp3_44100_128" \
  -H "xi-api-key: ${ELEVENLABS_API_KEY}" \
  -H "Content-Type: application/json" \
  --output strike.mp3 \
  -d '{
    "text": "Short metallic sword strike, bright transient, dry mix, no voice, no music, about one second",
    "duration_seconds": 1.0,
    "prompt_influence": 0.45,
    "model_id": "eleven_text_to_sound_v2"
  }'
```

## Bun Fetch Example

```ts
const apiKey = process.env.ELEVENLABS_API_KEY;

if (!apiKey) {
  throw new Error("Missing ELEVENLABS_API_KEY");
}

const response = await fetch(
  "https://api.elevenlabs.io/v1/sound-generation?output_format=mp3_44100_128",
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "xi-api-key": apiKey,
    },
    body: JSON.stringify({
      text: "Retro UI confirm click with a small glassy sparkle, no voice",
      duration_seconds: 0.7,
      prompt_influence: 0.4,
      model_id: "eleven_text_to_sound_v2",
    }),
  },
);

if (!response.ok) {
  throw new Error(
    `ElevenLabs sound generation failed: ${response.status} ${await response.text()}`,
  );
}

const audio = new Uint8Array(await response.arrayBuffer());
await Bun.write("confirm-click.mp3", audio);
```

## Prompting Guidance

Good prompts describe the sound as an audio asset, not just as an event. Include:

* Source: material, object, creature, interface, weather, weapon, or machine.
* Action: impact, scrape, spark, pulse, whoosh, rumble, loop, rise, or decay.
* Space: dry, close mic, cave reverb, outdoor distance, muffled, or filtered.
* Mix intent: no voice, no music, no melody, short tail, mono-compatible, or background ambience.
* Duration intent: one-shot, under one second, two-second pickup, or 30-second loop.

For cohesive game packs, keep shared vocabulary across prompts. Example: `dry cyberpunk arcade mix`, `short tail`, `no voice`, and `no music` can appear in every UI one-shot prompt so the generated set feels related.
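
One way to keep that vocabulary consistent is to assemble prompts from named parts instead of writing each string by hand. The sketch below is illustrative only: `SoundPromptParts` and `buildSoundPrompt` are hypothetical names, and the comma-separated layout is a stylistic choice rather than anything the API requires.

```ts
// The parts follow the checklist above: source, action, space, mix intent, duration intent.
interface SoundPromptParts {
  source: string;
  action: string;
  space?: string;
  mixIntent?: string[];
  durationIntent?: string;
}

// Hypothetical builder: joins the parts and a shared pack vocabulary into one prompt.
function buildSoundPrompt(
  parts: SoundPromptParts,
  sharedVocabulary: string[] = [],
): string {
  const segments: Array<string | undefined> = [
    `${parts.source} ${parts.action}`,
    parts.space,
    ...sharedVocabulary,
    ...(parts.mixIntent ?? []),
    parts.durationIntent,
  ];

  const kept: string[] = [];
  const seen: string[] = [];
  for (const segment of segments) {
    const trimmed = segment === undefined ? "" : segment.trim();
    const key = trimmed.toLowerCase();
    // Skip missing segments and case-insensitive duplicates such as a repeated "no voice".
    if (trimmed === "" || seen.indexOf(key) !== -1) continue;
    seen.push(key);
    kept.push(trimmed);
  }
  return kept.join(", ");
}

// Shared vocabulary taken from the example above.
const uiPackVocabulary = ["dry cyberpunk arcade mix", "short tail", "no voice", "no music"];

const confirmPrompt = buildSoundPrompt(
  { source: "Retro UI", action: "confirm click", mixIntent: ["no voice"], durationIntent: "under one second" },
  uiPackVocabulary,
);
// confirmPrompt is:
// "Retro UI confirm click, dry cyberpunk arcade mix, short tail, no voice, no music, under one second"
```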

## Parameter Tuning

* Start with `prompt_influence: 0.3` to `0.45`.
* Increase `prompt_influence` when the model adds unwanted elements or misses a specific material/action.
* Decrease `prompt_influence` when generating variants from the same prompt.
* Set `duration_seconds` for one-shots and gameplay cues that need timing discipline.
* Omit `duration_seconds` for exploratory ideation.
* Use `loop: true` for ambience and background texture, then verify the loop in the runtime mixer because generated loops can still need trimming.
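
These rules of thumb can be captured as presets so that every script starts from the same values. The sketch below is one possible encoding: the intent names, the `tuningFor` function, and the specific numbers (`0.45` for one-shots, `0.2` for variants) are illustrative assumptions within the ranges discussed above, not recommendations from ElevenLabs.

```ts
type GenerationIntent = "one_shot" | "variants" | "exploration" | "ambience_loop";

interface TuningOptions {
  prompt_influence: number;
  duration_seconds?: number;
  loop?: boolean;
}

// Hypothetical preset function: returns starting options for a generation intent.
function tuningFor(intent: GenerationIntent, durationSeconds?: number): TuningOptions {
  // 0.3 is the API default and the low end of the suggested starting range.
  const options: TuningOptions = { prompt_influence: 0.3 };

  switch (intent) {
    case "one_shot":
      // Gameplay cues need timing discipline, so an explicit duration is required here.
      if (durationSeconds === undefined) {
        throw new Error("one_shot requires an explicit duration");
      }
      options.prompt_influence = 0.45;
      options.duration_seconds = durationSeconds;
      break;
    case "variants":
      // Lower influence leaves room for variation between takes (illustrative value).
      options.prompt_influence = 0.2;
      if (durationSeconds !== undefined) options.duration_seconds = durationSeconds;
      break;
    case "exploration":
      // Duration is deliberately omitted so the service chooses one.
      break;
    case "ambience_loop":
      options.loop = true;
      if (durationSeconds !== undefined) options.duration_seconds = durationSeconds;
      break;
  }
  return options;
}
```

The returned object can be spread into the request body next to `text` and `model_id`; adjust `prompt_influence` upward from the preset if the model adds unwanted elements.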

## Local Asset Workflow

1. Generate into a local scratch directory that is not committed.
2. Review multiple takes before selecting a final asset.
3. Trim silence, remove clicks, normalize loudness, and export the committed format.
4. Keep final committed files within the asset budget in `docs/media-policy.md`.
5. Record provenance next to any committed asset set: prompt, model ID, duration, output format, generation date, and manual edits.
6. Do not commit the ElevenLabs API key, raw API responses that exceed asset policy, or private prompt material.
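
The provenance record from step 5 could be kept as a small JSON file beside the asset set. The shape below is a sketch: the field list follows step 5, but the `AssetProvenance` type, the `file` field, and the function names are hypothetical. Per step 6, only record prompts that are safe to commit.

```ts
interface AssetProvenance {
  file: string; // hypothetical field: name of the committed asset
  prompt: string;
  model_id: string;
  duration_seconds: number | null; // null when the duration was left to the service
  output_format: string;
  generated_on: string; // calendar date, YYYY-MM-DD
  manual_edits: string[];
}

// Hypothetical helper: fills in the generation date from a Date value.
function createProvenance(
  details: Omit<AssetProvenance, "generated_on">,
  generatedAt: Date,
): AssetProvenance {
  // toISOString() is always UTC, so the date does not depend on the local time zone.
  return { ...details, generated_on: generatedAt.toISOString().slice(0, 10) };
}

// Serializes with stable indentation so diffs of the committed file stay readable.
function provenanceToJson(record: AssetProvenance): string {
  return JSON.stringify(record, null, 2) + "\n";
}
```

A generation script would call `createProvenance(details, new Date())` once a take is selected and write the JSON output next to the final file.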

## Error Handling

* `422` means the request body or query parameters failed validation.
* `403` can occur when an Enterprise IP whitelist rejects the request.
* Authentication failures usually mean the key is missing, disabled, restricted to scopes that exclude this endpoint, or pasted into the wrong header.
* Treat rate-limit and transient server failures as retryable with backoff.
* Log `request-id` and `x-trace-id` response headers when available, but do not log the API key or full private prompts.
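
The status handling above can be sketched as a small classifier plus a backoff schedule. Several details here are assumptions rather than documented behavior: `429` is treated as the rate-limit status and `401` as the authentication-failure status (the usual HTTP conventions), any `5xx` is treated as transient, and the delay values are arbitrary starting points. All names are hypothetical.

```ts
type FailureAction = "retry" | "fix_request" | "check_access" | "fail";

// Hypothetical classifier: maps an HTTP status to what the script should do next.
function classifyStatus(status: number): FailureAction {
  if (status === 422) return "fix_request"; // body or query failed validation
  if (status === 401 || status === 403) return "check_access"; // assumption: 401 = auth failure
  if (status === 429) return "retry"; // assumption: 429 = rate limit
  if (status >= 500 && status <= 599) return "retry"; // transient server failure
  return "fail";
}

// Exponential backoff with a cap; attempt 0 is the first retry.
function backoffDelayMs(attempt: number, baseMs: number = 500, maxMs: number = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}
```

With the defaults, the first four retries wait 500, 1000, 2000, and 4000 ms, and every later retry waits 8000 ms. A caller would loop while `classifyStatus(response.status) === "retry"`, sleep for `backoffDelayMs(attempt)`, and give up after a fixed number of attempts.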

## Security Notes

* Read `ELEVENLABS_API_KEY` from `.env.local`; `.env.local.example` should contain only a placeholder value, never a real key.
* Do not create a client-exposed `VITE_ELEVENLABS_API_KEY`.
* Avoid sending secrets, private user data, or unreleased product copy in sound prompts.
* Prefer original, descriptive prompts. Do not ask for a third-party song, branded sonic logo, or recognizable copyrighted audio signature.

## Billing Notes

Pricing and credit rules can change. The official overview notes that explicit duration affects credit usage, and the pricing page lists Sound Effects in the API pricing table. Do not hard-code cost assumptions into scripts; read current pricing before bulk generation and monitor usage from the ElevenLabs dashboard.


