AI Prompt for Music & Audio Generation
An ElevenLabs TTS prompt for a cheerful Indian-English female voice in her 20s, delivering a whispered, conspiratorial 3-second performance for a YouTube background bed.
More prompts for Music & Audio Generation.
An ElevenLabs Sound Effects prompt for an original 5-second track in the ambient drone genre with an angry and aggressive mood and an ethereal female choir.
An ElevenLabs Sound Effects prompt for an original 3-minute-30-second track in the deep house genre with a melancholic and introspective mood and a male baritone lead.
An ElevenLabs Sound Effects prompt for an original 5-second track in the indie folk genre with a dreamy and nostalgic mood and rap verses.
A layered audio production prompt for a 30-second meditative and calm jazz fusion ambient bed designed to sit under laid-back Australian male surfer narration for a podcast intro.
A layered audio production prompt for a 3-minute-30-second triumphant hero swell trap ambient bed designed to sit under gritty British male (50s) narration for a podcast intro.
An ElevenLabs Sound Effects prompt for an original 5-second track in the lo-fi hip-hop genre with a dreamy and nostalgic mood and poetic spoken word.
You are writing a text-to-speech script optimized for ElevenLabs (v2 / v2.5 / Multilingual v2 models). The output will be run through ElevenLabs to produce a voice performance.

## Brief
- **Voice profile:** cheerful Indian-English female, 20s
- **Emotion:** whispered conspiratorial
- **Duration target:** 3 seconds
- **Use case:** YouTube background bed

## ElevenLabs Prompt Engineering Rules

### 1. Use punctuation as a conductor
ElevenLabs reads punctuation almost literally:
- **Period** = long beat
- **Comma** = short beat
- **Ellipsis (...)** = trailing, thoughtful pause (use sparingly; overuse causes drift)
- **Em-dash (—)** = sharp mid-sentence pivot
- **Question mark** = rising intonation (works on v2 and later)
- **Exclamation** = raised volume, but v2 is subtle, so pair it with word choice

### 2. Emotion tags (v2.5+ "emotion control")
ElevenLabs does NOT honor inline stage directions like `[laughing]` in the text itself; it will read them aloud. Instead:
- Adjust the **Stability slider** (lower = more emotional variance).
- Adjust the **Style slider** (higher = more expressive, more drift risk).
- Write the *emotion into the words themselves*. A whispered conspiratorial delivery comes from word choice and sentence rhythm, not bracket tags.

### 3. Phonetic hints for hard words
If the script contains a proper noun or technical term the model gets wrong:
- Spell it phonetically in parentheses the first time: "SQLite (sequel-lite)".
- Or break it with hyphens: "Anthropic" → "An-thropic".

### 4. Pacing markers
Use line breaks to insert real pauses. Two blank lines between sentences create a dramatic beat. This is the cleanest control.

### 5. SSML (only on API)
If the user is using the ElevenLabs API (not the web app):
- `<break time="1s"/>` inserts a precise pause
- `<emphasis level="strong">word</emphasis>` can work but is inconsistent
- SSML is stripped in the web UI, so do not include it unless the script is API-only

## Voice Matching
For a **cheerful Indian-English female, 20s**, recommend:
- **Model:** [Eleven v2.5 for English, Multilingual v2 for non-English accents]
- **Stability:** [suggest a value 0-100 — lower = more emotional]
- **Similarity boost:** [suggest 70-90 for cloned voices]
- **Style exaggeration:** [0-30 for natural; 40-70 for dramatic whispered conspiratorial]
- **Speaker boost:** on

## Output Format

**A. Production Script**
A 3-second script in plain text. No bracket tags. Use punctuation and line breaks as pacing.

**B. Voice Settings Panel**
```
Voice: cheerful Indian-English female, 20s match (suggest 2 ElevenLabs library voices that fit, OR describe for cloning)
Model: [v2.5 / Multilingual v2]
Stability: [0-100]
Similarity boost: [0-100]
Style: [0-100]
Speaker boost: ON
```

**C. Delivery Notes**
3-5 bullet points the user should keep in mind when reviewing the output (common artifacts: mispronounced proper nouns, over-emphasis on long sentences, uncanny breaths on whispered conspiratorial delivery).

## Strict Rules
- Do not put stage directions inside the script body.
- Do not use all caps for emphasis; ElevenLabs may unpredictably read it as shouting.
- Do not use more than two ellipses per paragraph.
- For a YouTube background bed, keep sentences under 25 words for clean delivery.
- Write the script at a reading level appropriate to the cheerful Indian-English female, 20s (a gritty 50s British male ≠ cheerful 20s Indian-English female in vocabulary).

Now produce the full output for a whispered conspiratorial cheerful Indian-English female, 20s delivering a 3-second YouTube background bed.
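The Strict Rules in the prompt are mechanical enough to check before a script is sent to ElevenLabs. The following is a minimal sketch, not part of the original prompt: `check_script` and its messages are hypothetical names, paragraphs are assumed to be separated by blank lines, and "all caps" is approximated as any standalone word of two or more capital letters (so acronyms are flagged too).

```python
import re

MAX_WORDS_PER_SENTENCE = 25      # "keep sentences under 25 words"
MAX_ELLIPSES_PER_PARAGRAPH = 2   # "no more than two ellipses per paragraph"


def check_script(script: str) -> list[str]:
    """Return a list of Strict Rules violations found in a TTS script."""
    problems = []

    # Rule: no stage directions (bracket tags) inside the script body.
    if re.search(r"\[[^\]]*\]", script):
        problems.append("bracket tag found")

    # Rule: no all-caps emphasis. Approximation: any standalone word made of
    # two or more capital letters, which also flags acronyms.
    if re.search(r"\b[A-Z]{2,}\b", script):
        problems.append("all-caps word found")

    # Rule: at most two ellipses per paragraph (paragraphs split on blank lines).
    for index, paragraph in enumerate(re.split(r"\n\s*\n", script), start=1):
        count = paragraph.count("...") + paragraph.count("…")
        if count > MAX_ELLIPSES_PER_PARAGRAPH:
            problems.append(f"paragraph {index}: {count} ellipses")

    # Rule: sentences must stay under 25 words.
    for sentence in re.split(r"[.!?…]+", script):
        word_count = len(sentence.split())
        if word_count >= MAX_WORDS_PER_SENTENCE:
            problems.append(f"sentence with {word_count} words")

    return problems


if __name__ == "__main__":
    print(check_script("Psst. Come closer, I have a secret."))
    print(check_script("[laughing] You will NOT believe this."))
```

An empty list means the script passed these four checks; the reading-level rule is a judgment call and is left to manual review.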
Replace the bracketed placeholders with your own values before running the prompt:
- [suggest a value 0-100 — lower = more emotional]: the Stability value you want recommended or used.
- [suggest 70-90 for cloned voices]: the Similarity boost value.
- [v2.5 / Multilingual v2]: the model you plan to use.
- [0-100]: the numeric value for each slider in the Voice Settings Panel.

Note that `[laughing]` appears in the prompt only as an example of an inline stage direction to avoid; it is not a placeholder.
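Once the settings placeholders hold numbers, API users need them as a request body rather than as slider positions. The sketch below converts 0-100 panel values into 0.0-1.0 floats. It rests on assumptions to verify against the current ElevenLabs API reference: that the text-to-speech request body accepts `text`, `model_id`, and a `voice_settings` object with `stability`, `similarity_boost`, `style`, and `use_speaker_boost`. The helper name `build_tts_request` and the example values are hypothetical, and no network call is made.

```python
import json


def build_tts_request(text: str, model_id: str, stability: int,
                      similarity_boost: int, style: int,
                      speaker_boost: bool = True) -> dict:
    """Build a request body from 0-100 panel values (assumed field names)."""
    sliders = {
        "stability": stability,
        "similarity_boost": similarity_boost,
        "style": style,
    }
    for name, value in sliders.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")

    # Assumption: the API expects each slider as a float between 0.0 and 1.0.
    voice_settings = {name: value / 100 for name, value in sliders.items()}
    voice_settings["use_speaker_boost"] = speaker_boost

    return {"text": text, "model_id": model_id, "voice_settings": voice_settings}


if __name__ == "__main__":
    # Illustrative values only: style 50 sits in the prompt's 40-70 "dramatic"
    # range and similarity 80 in its 70-90 range; stability 35 is a guess.
    body = build_tts_request(
        text="Psst. Come closer, I have a secret.",
        model_id="eleven_multilingual_v2",
        stability=35,
        similarity_boost=80,
        style=50,
    )
    print(json.dumps(body, indent=2))
```

Validating the range before conversion catches a common slip: pasting an already-normalized value such as 0.35 still passes, but a typo like 350 fails loudly instead of producing a setting the service may reject.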