Voice Cloning | Chipp Docs

Create a custom AI voice that sounds like you (or anyone you have permission to clone). Voice cloning uses ElevenLabs Instant Voice Cloning to synthesize voices from short audio samples — no training phase, ready in seconds.

⚠️ Voice cloning requires a Studio plan or higher.

How It Works

Record or upload a short audio sample (1-2 minutes)
ElevenLabs analyzes vocal characteristics (pitch, timbre, prosody, accent)
A custom voice is generated instantly
Select it as your voice agent’s voice

Cloned voices are shared across all apps in your organization.

Creating a Custom Voice

1. Open Voice Settings

Go to your app’s Build page and open the Voice card. Voice mode must be enabled.

2. Add Custom Voice

Find the Custom Voices section and click Add Custom Voice.

3. Record or Upload Audio

Choose one method:

Browser recording (recommended):

Click Record and speak naturally for 1-2 minutes
Re-record if needed

File upload:

Upload a pre-recorded audio file
Supported formats: WAV, MP3, M4A, OGG, WebM, FLAC
Maximum file size: 50MB

4. Name and Create

Give your voice a memorable name (e.g., “CEO Rachel” or “Support Persona”). Click Create Voice. The voice is available immediately.

5. Select the Voice

In the voice selection dropdown, your custom voices appear at the top. Select one to use it with your voice agent.

Recording Tips for Best Quality

The quality of your clone depends heavily on the quality of your audio sample.

Duration

Optimal: 1-2 minutes
Too short (<30 seconds): May lack vocal variety
Too long (>5 minutes): Can introduce instability
The AI captures voice characteristics best from concise, focused samples
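
To check a recording against these duration guidelines before uploading, something like the following could work. It is a minimal sketch that assumes an uncompressed WAV file and uses Python's standard `wave` module. The function names, and the "acceptable" label for lengths that fall between the stated thresholds, are illustrative assumptions and not part of Chipp or ElevenLabs.

```python
import wave

# Thresholds taken from the guidance above, in seconds.
TOO_SHORT_SECONDS = 30
OPTIMAL_RANGE_SECONDS = (60, 120)
TOO_LONG_SECONDS = 5 * 60


def wav_duration_seconds(path: str) -> float:
    """Return the length of a WAV file in seconds (frames / frame rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def rate_duration(seconds: float) -> str:
    """Classify a sample length against the documented guidance."""
    low, high = OPTIMAL_RANGE_SECONDS
    if seconds < TOO_SHORT_SECONDS:
        return "too short"
    if seconds > TOO_LONG_SECONDS:
        return "too long"
    if low <= seconds <= high:
        return "optimal"
    # Assumption: lengths between the thresholds are usable but not ideal.
    return "acceptable"
```

For example, `rate_duration(wav_duration_seconds("sample.wav"))` would classify a local file named `sample.wav` (a placeholder name). Compressed formats such as MP3 or M4A would need a different library to read.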

Environment

Record in a quiet room with soft furnishings (curtains, carpets reduce echo)
Turn off fans, air conditioning, and notifications
Close windows to block outside noise
The AI replicates everything it hears — background noise becomes part of the voice

Microphone Technique

Position the microphone about 20 cm from your mouth (roughly two fist-widths)
Speak slightly off-axis to reduce plosive sounds (hard P’s and B’s)
Use a pop filter if available
Avoid breathing directly into the mic

Audio Quality

Peak levels: between -6 dB and -3 dB, leaving headroom so loud passages don’t clip
Avoid clipping/distortion at all costs — the AI can’t recover from it
Standard sample rate (44.1kHz or 48kHz) works well
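
The peak level and sample rate of a recording can be measured before uploading. The sketch below is one possible way to do that. It assumes a 16-bit PCM WAV file, and it assumes the dB figures above are relative to digital full scale (dBFS), which the guidance does not state explicitly. The names `inspect_wav` and `review` are hypothetical helpers, not part of any Chipp or ElevenLabs tooling.

```python
import array
import math
import sys
import wave

TARGET_PEAK_DB = (-6.0, -3.0)            # from the guidance above
STANDARD_SAMPLE_RATES = (44100, 48000)   # 44.1 kHz and 48 kHz


def inspect_wav(path: str) -> tuple[float, int]:
    """Return (peak level in dB relative to full scale, sample rate in Hz)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        sample_rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h", frames)    # signed 16-bit samples, all channels
    if sys.byteorder == "big":
        samples.byteswap()                # WAV sample data is little-endian
    peak = max((abs(s) for s in samples), default=0)
    # 32768 is full scale for 16-bit audio; silence has no finite dB value.
    peak_db = 20 * math.log10(peak / 32768) if peak else float("-inf")
    return peak_db, sample_rate


def review(peak_db: float, sample_rate: int) -> list[str]:
    """Return notes for values outside the recommended ranges."""
    notes = []
    low, high = TARGET_PEAK_DB
    if peak_db > high:
        notes.append("peak above -3 dB: lower the input gain to avoid clipping")
    elif peak_db < low:
        notes.append("peak below -6 dB: recording is quieter than recommended")
    if sample_rate not in STANDARD_SAMPLE_RATES:
        notes.append(f"sample rate {sample_rate} Hz is not 44.1 kHz or 48 kHz")
    return notes
```

An empty list from `review(*inspect_wav("sample.wav"))` suggests the file (a placeholder name here) falls within the recommended ranges. A peak at or very near 0 dB usually means the recording has already clipped.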

Delivery

Maintain consistent tone and energy throughout
Don’t switch between animated and subdued delivery
Read with natural pacing — not robotic, not theatrical
Use your own writing or scripts for natural rhythm

Managing Custom Voices

Delete a voice: Remove it from your organization’s voice library at any time. This also removes it from ElevenLabs.
Multiple voices: Create as many custom voices as your plan allows
Cross-app usage: All apps in your organization can use any custom voice
Audio privacy: Your recording is stored encrypted and never shared publicly

Troubleshooting

Voice doesn’t sound right?

Re-record with better audio quality (less background noise, no clipping)
If the sample was short, record a longer one (aim for 1-2 minutes of natural speech)
Ensure consistent tone throughout the recording

Voice not appearing in selection?

Check that the creation completed successfully
Refresh the voice settings page
Verify you’re on a Studio plan or higher

Upload failing?

Check file size (max 50MB)
Verify the file format (WAV, MP3, M4A, OGG, WebM, FLAC)
Try a different format or re-export the audio
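
The size and format checks above can also be run locally before uploading. The following is a minimal sketch under two assumptions: that "50MB" means 50 × 1024 × 1024 bytes, and that the file extension reflects the actual format. `check_upload` and `check_file` are hypothetical helper names, and the real upload validation may differ.

```python
from pathlib import Path

# Formats and size limit as listed in this guide.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".ogg", ".webm", ".flac"}
MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumption: "50MB" read as 50 MiB


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'no extension'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems


def check_file(path: str) -> list[str]:
    """Run the same checks against a file on disk."""
    file = Path(path)
    return check_upload(file.name, file.stat().st_size)
```

This only inspects the file name and size. A file with a supported extension but a damaged or mislabeled encoding would still pass, which is where re-exporting the audio can help.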

ℹ️ For more about voice agent configuration, see the Voice Agents guide.