
Voice Cloning

Create custom AI voices trained on your own audio samples for a branded, personalized voice agent experience.

Hunter Hodnett, CPTO at Chipp

Create a custom AI voice that sounds like you (or anyone you have permission to clone). Voice cloning uses ElevenLabs Instant Voice Cloning to synthesize a voice from a short audio sample. There is no training phase, so the voice is ready in seconds.

⚠️ Voice cloning requires a Studio plan or higher.

How It Works

  1. Record or upload a short audio sample (1-2 minutes)
  2. ElevenLabs analyzes vocal characteristics (pitch, timbre, prosody, accent)
  3. A custom voice is generated instantly
  4. Select it as your voice agent’s voice

Cloned voices are shared across all apps in your organization.

Creating a Custom Voice

1. Open Voice Settings

Go to your app’s Build page and open the Voice card. Voice mode must be enabled.

2. Add Custom Voice

Find the Custom Voices section and click Add Custom Voice.

3. Record or Upload Audio

Choose one method:

Browser recording (recommended):

  • Click Record and speak naturally for 1-2 minutes
  • Re-record if needed

File upload:

  • Upload a pre-recorded audio file
  • Supported formats: WAV, MP3, M4A, OGG, WebM, FLAC
  • Maximum file size: 50MB
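
If you prepare recordings in bulk, it can help to check a file against the upload limits before opening the upload dialog. The sketch below is not part of the product; `check_upload` is a hypothetical helper, and it assumes that "50MB" means 50 × 1024 × 1024 bytes and that the format is judged by file extension alone.

```python
from pathlib import Path

# Limits copied from the upload rules above. Treating "50MB" as 50 MiB
# and checking only the file extension are assumptions of this sketch.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".ogg", ".webm", ".flac"}
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024


def check_upload(path, size_bytes=None):
    """Return a list of problems; an empty list means the file looks uploadable.

    `size_bytes` can be passed explicitly; otherwise the size is read from disk.
    """
    file_path = Path(path)
    problems = []

    if file_path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {file_path.suffix or '(none)'}")

    if size_bytes is None:
        size_bytes = file_path.stat().st_size
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")

    return problems
```

For example, `check_upload("sample.wav", 1_000_000)` returns an empty list, while a 60 MiB MP3 or an `.aiff` file each produce one problem message.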

4. Name and Create

Give your voice a memorable name (e.g., “CEO Rachel” or “Support Persona”). Click Create Voice. The voice is available immediately.

5. Select the Voice

In the voice selection dropdown, your custom voices appear at the top. Select the one you want to use with your voice agent.

Recording Tips for Best Quality

The quality of your clone depends directly on the quality of your audio sample.

Duration

  • Optimal: 1-2 minutes
  • Too short (<30 seconds): May lack vocal variety
  • Too long (>5 minutes): Can introduce instability
  • The AI captures voice characteristics best from concise, focused samples
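
The duration guidance above can be checked locally before you upload. The following is a minimal sketch using Python's standard `wave` module, so it only reads WAV files. The function names are hypothetical, and the "usable" label for lengths that fall between the stated thresholds is an assumption, since the guide does not name that range.

```python
import wave


def wav_duration_seconds(source):
    """Duration in seconds of a WAV file, given a path or a binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def classify_duration(seconds):
    """Map a sample length onto the duration guidance in this guide."""
    if seconds < 30:
        return "too short"   # may lack vocal variety
    if seconds > 300:
        return "too long"    # can introduce instability
    if 60 <= seconds <= 120:
        return "optimal"     # the recommended 1-2 minutes
    return "usable"          # assumption: between thresholds, not called out in the guide
```

A typical call would be `classify_duration(wav_duration_seconds("sample.wav"))`, where `sample.wav` stands in for your own recording.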

Environment

  • Record in a quiet room with soft furnishings (curtains, carpets reduce echo)
  • Turn off fans, air conditioning, and notifications
  • Close windows to block outside noise
  • The AI replicates everything it hears — background noise becomes part of the voice

Microphone Technique

  • Position the microphone about 20 cm away (roughly two fists' distance)
  • Speak slightly off-axis to reduce plosive sounds (hard P’s and B’s)
  • Use a pop filter if available
  • Avoid breathing directly into the mic

Audio Quality

  • Peak levels: between -6 dB and -3 dB, so the loudest parts don't clip
  • Avoid clipping/distortion at all costs — the AI can’t recover from it
  • Standard sample rate (44.1kHz or 48kHz) works well
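
To see how a recording measures up against these targets, you can compute its peak level yourself. The sketch below assumes the dB figures above are relative to digital full scale (dBFS) and that the file is 16-bit PCM WAV. The helper names and the -0.1 dB clipping heuristic are illustrative choices, not values from the product.

```python
import array
import math
import sys
import wave

FULL_SCALE_16BIT = 32768  # magnitude of the largest 16-bit sample value


def peak_dbfs(samples):
    """Peak level in dB relative to full scale for 16-bit integer samples."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # pure silence has no finite level
    return 20 * math.log10(peak / FULL_SCALE_16BIT)


def read_16bit_wav(path):
    """Return (sample_rate, samples) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samples = array.array("h", wav.readframes(wav.getnframes()))
        if sys.byteorder == "big":
            samples.byteswap()  # WAV sample data is little-endian
        return wav.getframerate(), samples


def assess(sample_rate, samples):
    """Compare a recording with the audio-quality targets in this guide."""
    peak = peak_dbfs(samples)
    return {
        "peak_dbfs": peak,
        "peak_in_target": -6.0 <= peak <= -3.0,
        "possibly_clipped": peak >= -0.1,  # heuristic: peak sits at full scale
        "standard_rate": sample_rate in (44100, 48000),
    }
```

You might run it as `assess(*read_16bit_wav("sample.wav"))`. A peak at exactly half of full scale comes out at about -6.02 dB, which is just below the target range.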

Delivery

  • Maintain consistent tone and energy throughout
  • Don’t switch between animated and subdued delivery
  • Read with natural pacing — not robotic, not theatrical
  • Use your own writing or scripts for natural rhythm

Managing Custom Voices

  • Delete a voice: Remove it from your organization’s voice library at any time. This also removes it from ElevenLabs.
  • Multiple voices: Create as many custom voices as your plan allows
  • Cross-app usage: All apps in your organization can use any custom voice
  • Audio privacy: Your recording is stored encrypted and never shared publicly

Troubleshooting

Voice doesn’t sound right?

  • Re-record with better audio quality (less background noise, no clipping)
  • Try a longer sample (aim for 1-2 minutes of natural speech)
  • Ensure consistent tone throughout the recording

Voice not appearing in selection?

  • Check that the creation completed successfully
  • Refresh the voice settings page
  • Verify you’re on a Studio plan or higher

Upload failing?

  • Check file size (max 50MB)
  • Verify the file format (WAV, MP3, M4A, OGG, WebM, FLAC)
  • Try a different format or re-export the audio

ℹ️ For more about voice agent configuration, see the Voice Agents guide.