Before you start creating SSML content, it helps to be familiar with the elements available in Conversational AI Cloud's SSML Editor.
Speak Tag
<speak>
This is the root element of the response. Place the text you want spoken between the opening and closing speak tags. This tag is required in every answer; without it, validation fails.
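As a minimal sketch, a complete answer wrapped in the root element could look like this. The wording of the answer is only an illustration.

```xml
<!-- Illustrative answer text; only the <speak> wrapper comes from the description above -->
<speak>
  Welcome! How can I help you today?
</speak>
```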
Break Tag
<break>
Represents a pause in the speech. Set the length of the pause with the strength or time attributes.
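The two ways of setting a pause can be sketched as follows. The values shown (500ms and strong) follow the conventions documented by Google Cloud and Amazon Alexa and are assumed to be accepted by the editor as well.

```xml
<speak>
  Please hold on.
  <!-- time: an explicit duration (assumed format: milliseconds or seconds) -->
  <break time="500ms"/>
  I am looking that up for you.
  <!-- strength: a relative pause length instead of a fixed duration -->
  <break strength="strong"/>
  Here is what I found.
</speak>
```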
Sub Tag
<sub>
Pronounce the specified word or phrase as a different word or phrase, which you provide in the alias attribute. For example, you can have the abbreviation of a chemical element read out as its full name.
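A short sketch of the alias attribute in use, where abbreviated chemical elements are spoken as full words. The sentence is illustrative.

```xml
<speak>
  <!-- "Na" is read as "sodium" and "Cl" as "chlorine" -->
  Table salt is made of <sub alias="sodium">Na</sub>
  and <sub alias="chlorine">Cl</sub>.
</speak>
```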
P Tag
<p>
This element represents a paragraph and provides extra-strong breaks before and after the tag.
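As a sketch, two paragraphs in one answer could be marked up like this. The text is illustrative.

```xml
<speak>
  <!-- Each paragraph is surrounded by an extra-strong break -->
  <p>Your order has been shipped.</p>
  <p>You will receive a tracking code by email.</p>
</speak>
```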
Sentence Tag
<s>
This element provides strong breaks before and after the tag. It works the same as ending a sentence with a period or specifying a pause with the <break> element.
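A minimal sketch of sentence markup. In standard SSML (W3C, Google Cloud, Amazon Alexa) the sentence element is written as s; this example assumes the editor produces that form.

```xml
<speak>
  <!-- Assumption: the sentence element is spelled <s>, as in standard SSML -->
  <s>Your appointment is confirmed</s>
  <s>We will see you on Monday</s>
</speak>
```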
Say-as Tag
<say-as>
Use this element to tell the speech engine how to interpret the enclosed text. The interpretation you select gives the engine additional context, for example whether a number should be read as a whole or digit by digit.
Interpret as:
Characters – Spell out each letter
Cardinal – Interpret the value as a cardinal number
Ordinal – Interpret the value as an ordinal number
Digits – Spell each digit separately
Fraction – Interpret the value as a fraction
Unit – Interpret the value as a measurement
Date – Interpret the value as a date
Time – Interpret the value as a duration in minutes and seconds (for example, 1'30")
Telephone – Interpret the value as a 7-digit or 10-digit telephone number. This can also handle extensions.
Address – Interpret a value as part of a street address
Interjection – Use this to speak the text in a more expressive voice, for instance ‘Great!’ or ‘Perfect!’. This is language-specific.
Pause – Use this to add a pause
Expletive – Bleep out the text, as if it were censored
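A few of the interpretations listed above can be sketched like this. The lowercase attribute values (characters, ordinal, digits, telephone) follow the Google Cloud and Amazon Alexa conventions and are assumed to match what the editor generates. The code, PIN, and phone number are made up.

```xml
<speak>
  <!-- characters: spelled out as A, B, 1, 2 -->
  Your code is <say-as interpret-as="characters">AB12</say-as>.
  <!-- ordinal: read as "third" -->
  You are <say-as interpret-as="ordinal">3</say-as> in the queue.
  <!-- digits: read as four, zero, seven, one -->
  Your PIN is <say-as interpret-as="digits">4071</say-as>.
  <!-- telephone: a fictional 10-digit number -->
  Call us at <say-as interpret-as="telephone">2025550123</say-as>.
</speak>
```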
Prosody Tag
<prosody>
Customize the rate, pitch, and volume of the text.
Rate - Select how slow or fast you want the speaking rate to be. Choose between:
x-slow
slow
medium
fast
x-fast
Pitch - Select how low or high you want the pitch to be. Choose between:
x-low
low
medium
high
x-high
Volume - Select how soft or loud you want the volume to be. Choose between:
silent
x-soft
soft
medium
loud
x-loud
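The three settings can be combined on one element or used separately. A minimal sketch, using values from the lists above with illustrative text:

```xml
<speak>
  <!-- All three attributes on one element -->
  <prosody rate="slow" pitch="low" volume="soft">
    This sentence is spoken slowly, at a low pitch, and softly.
  </prosody>
  <!-- Attributes that are left out keep their default setting -->
  <prosody rate="fast" volume="loud">
    This one is faster and louder.
  </prosody>
</speak>
```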
Audio Tag
<audio>
Add an MP3 audio file to your answer. Use this to embed short, prerecorded audio within your response.
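As a sketch, an audio clip followed by spoken text could look like this. The URL is a placeholder, and the src attribute name follows standard SSML; check the hosting and file requirements of your own setup before relying on it.

```xml
<speak>
  <!-- Placeholder URL: replace with the location of your own MP3 file -->
  <audio src="https://example.com/audio/welcome-chime.mp3"/>
  Welcome back!
</speak>
```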
Emphasis Tag
<emphasis>
Control the emphasis level for the text. This changes the rate and volume of the speech: more emphasis is spoken louder and slower, less emphasis is spoken quieter and faster.
Choose between:
Moderate - Increase the volume and slow down the speaking rate, but not as much as when set to strong. This is the default if no level is provided.
Strong - Increase the volume and slow down the speaking rate so the speech is louder and slower.
Reduced - Decrease the volume and speed up the speaking rate. The speech is softer and faster.
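The levels above can be sketched as follows. The lowercase values for the level attribute follow standard SSML and are assumed to match what the editor generates; the text is illustrative.

```xml
<speak>
  <!-- strong: louder and slower -->
  Your flight leaves at <emphasis level="strong">six</emphasis> in the morning.
  <!-- reduced: softer and faster -->
  <emphasis level="reduced">Terms and conditions apply.</emphasis>
</speak>
```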
Learn more about SSML elements on Google Cloud or Amazon Alexa.