POST /v1/generate_speech
cURL
curl --request POST \
  --url https://api.contextlm.ai/v1/generate_speech \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: <api-key>' \
  --data '
{
  "text": "<string>",
  "voice_id": "<string>",
  "pitch": 0,
  "speaking_rate": 1,
  "output_format": "LINEAR16"
}
'
{
  "audiobytes": "<string>"
}

Authorizations

X-API-Key
string
header
required

Body

application/json

Speech synthesis parameters

text
string
required

The text input to be converted to speech

voice_id
string
required

The ID of the voice to use for speech synthesis

pitch
number
default:0

Speaking pitch in semitones, in the range [-20.0, 20.0]. A value of 20 raises the pitch 20 semitones above the voice's original pitch; -20 lowers it by 20 semitones.

Required range: -20 <= x <= 20
speaking_rate
number
default:1

Speaking rate (speed), in the range [0.25, 4.0]. 1.0 is the normal native speed of the selected voice, 2.0 is twice as fast, and 0.5 is half as fast. Values below 0.25 or above 4.0 return an error.

Required range: 0.25 <= x <= 4
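
Because out-of-range values are rejected, it may be worth checking pitch and speaking_rate on the client before sending a request. The following is a minimal sketch, not part of the API: the helper name `validate_prosody` and the constants are hypothetical, and only the numeric ranges come from the parameter descriptions above.

```python
# Documented ranges for the two optional prosody parameters.
PITCH_RANGE = (-20.0, 20.0)          # semitones
SPEAKING_RATE_RANGE = (0.25, 4.0)    # multiple of the voice's native speed


def validate_prosody(pitch: float = 0.0, speaking_rate: float = 1.0) -> None:
    """Hypothetical client-side check; raises ValueError when a value is out of range."""
    low, high = PITCH_RANGE
    if not low <= pitch <= high:
        raise ValueError(f"pitch must be in [{low}, {high}], got {pitch}")
    low, high = SPEAKING_RATE_RANGE
    if not low <= speaking_rate <= high:
        raise ValueError(f"speaking_rate must be in [{low}, {high}], got {speaking_rate}")
```

Calling `validate_prosody(pitch=2, speaking_rate=1.25)` returns silently, while `validate_prosody(pitch=25)` raises `ValueError` without a network round trip.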
output_format
enum<string>
default:LINEAR16

The format of the returned audio byte stream. LINEAR16 (a.k.a. WAV) gives the best audio quality.

Available options:
LINEAR16,
MP3,
OGG_OPUS,
MULAW
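
As an illustration of how the body parameters fit together, here is a minimal sketch that assembles the JSON request body and rejects an output_format outside the four listed options. The function name `build_body` is hypothetical; the field names, defaults, and enum values are taken from the parameter list above.

```python
import json

# The four values listed under "Available options".
OUTPUT_FORMATS = ("LINEAR16", "MP3", "OGG_OPUS", "MULAW")


def build_body(text: str, voice_id: str, pitch: float = 0,
               speaking_rate: float = 1, output_format: str = "LINEAR16") -> str:
    """Hypothetical helper: return the request body as a JSON string."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {OUTPUT_FORMATS}")
    return json.dumps({
        "text": text,                    # required
        "voice_id": voice_id,            # required
        "pitch": pitch,                  # optional, default 0
        "speaking_rate": speaking_rate,  # optional, default 1
        "output_format": output_format,  # optional, default LINEAR16
    })
```

For example, `build_body("Hello", "<voice-id>", output_format="MP3")` produces a body equivalent to the one in the cURL sample, with MP3 selected instead of the default.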

Response

Successful response

audiobytes
file
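
A rough sketch of calling the endpoint from Python's standard library and saving the audio follows. The URL, headers, and the audiobytes field come from this page. The assumption that audiobytes arrives as a base64-encoded string is not confirmed here (the sample response shows a string, while the schema says file), so check it against a real response. The function names are hypothetical.

```python
import base64
import json
import urllib.request

API_URL = "https://api.contextlm.ai/v1/generate_speech"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build the POST request with the documented headers."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def decode_audiobytes(payload: dict) -> bytes:
    """ASSUMPTION: 'audiobytes' is base64 text; adjust if the API returns raw bytes."""
    return base64.b64decode(payload["audiobytes"])


def synthesize(api_key: str, body: dict, out_path: str) -> None:
    """Send the request and write the decoded audio to out_path."""
    # urlopen raises urllib.error.HTTPError on non-2xx responses.
    with urllib.request.urlopen(build_request(api_key, body)) as resp:
        payload = json.load(resp)
    with open(out_path, "wb") as f:
        f.write(decode_audiobytes(payload))
```

A call such as `synthesize("<api-key>", {"text": "Hello", "voice_id": "<voice-id>"}, "hello.wav")` would write the default LINEAR16 output to disk, provided the base64 assumption holds.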