Customize pronunciations using Amazon Polly

[Amazon Polly](https://aws.amazon.com/polly/) breathes life into text by converting it into lifelike speech. This lets developers and businesses build applications that converse in real time and offer a more interactive experience. Text-to-speech (TTS) in Amazon Polly supports a variety of [languages](https://docs.aws.amazon.com/polly/latest/dg/SupportedLanguage.html) and locales, so you can choose the language and voice that fit your audience, guided by factors such as geographic location and language locale.

Amazon Polly uses deep learning technologies to synthesize speech from text in real time in several output formats, such as MP3, Ogg Vorbis, JSON, or PCM, across standard and [neural](https://docs.aws.amazon.com/polly/latest/dg/NTTS-main.html#ntts-engine) engines. Support for the Speech Synthesis Markup Language ([SSML](https://docs.aws.amazon.com/polly/latest/dg/ssml.html)) adds further control over the generated speech: you can adjust speech rate and volume, add pauses, emphasize certain words or phrases, and more.

Businesses continue to expand across geographic locations, and they keep looking for ways to make end-user engagement more personal. For instance, you may need certain words pronounced in the style of a particular region. Your business may also need certain words and phrases pronounced differently depending on their intended meaning. You can achieve this with the [SSML tags](https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html) provided by Amazon Polly.

This post aims to help you customize pronunciation when you serve a truly global customer base.

#### **Modify pronunciation using phonemes**

A phoneme can be considered the smallest unit of speech. The ```<phoneme>``` SSML tag in Amazon Polly customizes pronunciation at the phoneme level using either the IPA (International Phonetic Alphabet) or X-SAMPA (Extended Speech Assessment Methods Phonetic Alphabet). X-SAMPA is a representation of IPA in ASCII encoding. Phoneme tags are fully supported in both the standard and neural TTS engines. For example, the word “lead” can be pronounced as the present tense verb, or it can refer to the chemical element lead. We walk through this example later in this post.

##### **International Phonetic Alphabet**

The IPA is used to represent sounds across different languages. For a list of phonemes Amazon Polly supports, refer to [Phoneme and Viseme Tables for Supported Languages](https://docs.aws.amazon.com/polly/latest/dg/ref-phoneme-tables-shell.html).

When a word has more than one reading, Amazon Polly picks one of them by default. Take the word “lead,” which is pronounced differently as the chemical element and as the verb. When we provide “lead” as input without any customizing SSML tags, Amazon Polly speaks it as the present tense form of the verb:

```
<speak>
The default pronunciation by Amazon Polly for L E A D is <break time="300ms"/> lead,
which is the present tense form.
</speak>
```

To get the pronunciation of the chemical element lead (which sounds the same as “led,” the past tense of the verb), we can use phonemes along with IPA or X-SAMPA. IPA is generally used to customize the pronunciation of a word in a given language using phonemes:

```
<speak>
This is the pronunciation using the
<say-as interpret-as="characters">IPA</say-as> attribute
in the <say-as interpret-as="characters">SSML</say-as> tag.
The verb form for L E A D is <break time="150ms"/> lead.
The chemical element <break time="150ms"/><phoneme alphabet="ipa" ph="lɛd">lead</phoneme>
<break time="300ms"/>also has an identical spelling.
</speak>
```

#### **Modify pronunciation by specifying parts of speech**

Staying with the “lead” example, we can also distinguish the chemical element from the verb by specifying the part of speech with the [<w>](https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html#w-tag) SSML tag.

The ```<w>``` tag customizes pronunciation through its ```role``` attribute. You can mark a word as a verb (present simple or past tense), noun, adjective, preposition, or determiner, or select a non-default sense of the word. See the following example:

```
<speak>
The word<p> <say-as interpret-as="characters">lead</say-as></p>
may be interpreted as either the present simple form <w role="amazon:VB">lead</w>,
or the chemical element <w role="amazon:SENSE_1">lead</w>.
</speak>
```

Additionally, you can use the [sub](https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html#sub-tag) tag to indicate the pronunciation of acronyms and abbreviations:

```
<speak>
Polly is an <sub alias="Amazon Web Services">AWS</sub>
offering providing a text-to-speech service.
</speak>
```

#### **Extended Speech Assessment Methods Phonetic Alphabet**

The [X-SAMPA](https://en.wikipedia.org/wiki/X-SAMPA) transcription scheme extends and unifies the various language-specific SAMPA phoneme sets.

The following snippet shows how you can use X-SAMPA to pronounce the two variations of the word “lead”:

```
<speak>
This is the pronunciation using the X-SAMPA attribute,
in the verb form <break time="1s"/> lead.
The chemical element <break time="1s"/>
<phoneme alphabet='x-sampa' ph='lEd'>lead</phoneme> <break time="0.5s"/>
also has an identical spelling.
</speak>
```

The stress mark in IPA is the character ˈ. An [apostrophe](https://unicodemap.org/details/0x0027/index.html) is often typed in its place, which might give a different output than expected. In X-SAMPA, the stress mark is the [double quotation mark](https://unicodemap.org/details/0x0022/index.html), so the ```ph``` attribute value must be enclosed in single quotation marks, and the ```alphabet``` attribute must name the phonetic alphabet in use. The following example uses IPA:

```
<speak>
You say, <phoneme alphabet="ipa" ph="pɪˈkɑːn">pecan</phoneme>.
</speak>
```

In the preceding example, the character ˈ marks the stressed syllable. The same stress in X-SAMPA is written with a double quotation mark inside a single-quoted attribute value:

```
<speak>
You say, <phoneme alphabet='x-sampa' ph='pI"kA:n'>pecan</phoneme>.
</speak>
```

#### **Modify pronunciations using other SSML tags**

You can use the ```<say-as>``` tag to control how certain kinds of text are read. Through its ```interpret-as``` attribute, it can spell out text character by character; read text as digits, fractions, units, dates, times, addresses, telephone numbers, cardinals, or ordinals; and censor the text enclosed within the tag. For more information, refer to [Controlling How Special Types of Words Are Spoken](https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html#say-as-tag). The following sections show examples of these ```interpret-as``` values.

#### **Date**

By default, Amazon Polly decides on its own how to read a date-like string. To have a date read in a particular order, such as month-day-year or day-month-year, use the ```date``` value together with the ```format``` attribute.

Without the ```date``` value, Amazon Polly applies its default reading to the date in the following input:

```
<speak>
The default pronunciation when using date is 01-11-1996
</speak>
```

If you want dates spoken in a specific format, ```interpret-as="date"``` in the ```<say-as>``` tag customizes the pronunciation:

```
<speak>
We will see the examples of different date formats using the date SSML tag.
The following date is written in the day-month-year format.
<say-as interpret-as="date" format="dmy">01-11-1995</say-as><break time="500ms"/>
The following date is written in the month-day-year format.
<say-as interpret-as="date" format="mdy">09-24-1995</say-as>
</speak>
```

#### **Cardinal**

The ```cardinal``` value reads a number in its cardinal form. For example, 124456 is pronounced “one hundred twenty four thousand four hundred fifty six”:

```
<speak>
The following number is pronounced in its cardinal form.
<say-as interpret-as="cardinal">124456</say-as>
</speak>
```

#### **Ordinal**

The ```ordinal``` value reads a number in its ordinal form. Without it, the number is pronounced as a plain number:

```
<speak>
The following number is pronounced in its default form,
without the use of any SSML attribute in the say as tag - 1242
</speak>
```

If we want to pronounce 1242 as “one thousand two hundred forty second,” we can use the ```ordinal``` value:

```
<speak>
The following number is pronounced in its ordinal form.
<say-as interpret-as="ordinal">1242</say-as>
</speak>
```

#### **Digits**

The ```digits``` value reads a number one digit at a time. For example, “1242” is pronounced as “one two four two”:

```
<speak>
The following number is pronounced as individual digits.
<say-as interpret-as="digits">1242</say-as>
</speak>
```

#### **Fraction**

The ```fraction``` value reads the enclosed text as a fraction:

```
<speak>
The following are examples of pronunciations when
<prosody volume="loud"> fraction</prosody>
is used as an attribute in the say-as tag.
<break time="500ms"/>Seven one by two is pronounced as
<say-as interpret-as="fraction">7 ½ </say-as>
whereas three by twenty is pronounced as <say-as interpret-as="fraction">3/20</say-as>
</speak>
```

#### **Time**

The ```time``` value reads the enclosed text as a duration in minutes and seconds:

```
<speak>
Polly also supports customizing pronunciation in terms of minutes and seconds.
For example, <say-as interpret-as="time">2'42"</say-as>
</speak>
```

#### **Expletive**

The ```expletive``` value censors the text enclosed within the tags by replacing it with a beep:

```
<speak>
The value that is going to be censored is
<say-as interpret-as="expletive">this is not good</say-as>
You should have heard the beep sound.
</speak>
```

#### **Telephone**

The ```telephone``` value reads the enclosed text as a telephone number instead of as standalone digits or as a cardinal number:

```
<speak>
The telephone number is
<say-as interpret-as="telephone">1800 3000 9009</say-as>
</speak>
```

#### **Address**

The ```address``` value reads the enclosed text as a postal address:

```
<speak>
The address is<break time="1s"/>
<say-as interpret-as="address">440 Terry Avenue North, Seattle
WA 98109 USA</say-as>
</speak>
```

#### **Lexicons**

We’ve looked at some of the SSML tags readily available in Amazon Polly. Other use cases might require a higher degree of control over pronunciation, and lexicons help meet that requirement. You can use lexicons when certain words need a pronunciation that is uncommon in the language being spoken.

Another use case for lexicons is numeronyms, which are abbreviations formed with the help of numbers. For example, Y2K is pronounced as “year 2000.” You can use lexicons to customize these pronunciations.

Amazon Polly supports lexicon files in .pls and .xml formats. For more information, see [Managing Lexicons](https://docs.aws.amazon.com/polly/latest/dg/managing-lexicons.html).

#### **Conclusion**

Amazon Polly SSML tags can help you customize pronunciation in a variety of ways. We hope that this post gives you a head start into the world of speech synthesis and helps your applications provide more lifelike human interactions.

#### **About the Authors**

![image.png](https://dev-media.amazoncloud.cn/0d66e529dc544645b686c77fa7b49095_image.png)

**Abilashkumar P C** is a Cloud Support Engineer at AWS. He works with customers providing technical troubleshooting guidance, helping them run their workloads at scale. Outside of work, he loves driving, following cricket, and reading.

![image.png](https://dev-media.amazoncloud.cn/f14c3a8bafcc495cbdd4b112fb2f7e62_image.png)

**Abhishek Soni** is a Partner Solutions Architect at AWS. He works with customers to provide technical guidance for the best outcome of workloads on AWS.
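#### **Appendix: building the SSML from code**

The post shows SSML as hand-written text. When the same markup is generated by a program, the quoting rule for the X-SAMPA stress mark is easy to get wrong. The following Python sketch is one possible way to build a `<phoneme>` tag with safely quoted attributes and to prepare a request for the Polly `SynthesizeSpeech` operation. It is not part of the original post: the helper names (`phoneme`, `speak`, `build_request`, `synthesize_to_file`) are hypothetical, the voice `Joanna` and the MP3 output format are assumptions, and the final call assumes `boto3` is installed and AWS credentials are configured.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme(word: str, ph: str, alphabet: str = "ipa") -> str:
    """Wrap `word` in a <phoneme> tag (hypothetical helper)."""
    if alphabet not in ("ipa", "x-sampa"):
        raise ValueError("alphabet must be 'ipa' or 'x-sampa'")
    # quoteattr() switches to single quotes when the value contains a
    # double quote, which is the X-SAMPA stress-mark case from the post.
    return (
        f"<phoneme alphabet={quoteattr(alphabet)} ph={quoteattr(ph)}>"
        f"{escape(word)}</phoneme>"
    )


def speak(*parts: str) -> str:
    """Join already-built SSML fragments inside a <speak> root element."""
    return "<speak>" + " ".join(parts) + "</speak>"


def build_request(ssml: str, voice_id: str = "Joanna", engine: str = "neural") -> dict:
    """Keyword arguments for synthesize_speech; voice and engine are assumptions."""
    return {
        "Engine": engine,
        "OutputFormat": "mp3",
        "Text": ssml,
        "TextType": "ssml",  # tells Polly that Text contains SSML, not plain text
        "VoiceId": voice_id,
    }


def synthesize_to_file(ssml: str, path: str) -> None:
    """Send the SSML to Amazon Polly and save the audio (needs AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without it

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_request(ssml))
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


if __name__ == "__main__":
    ssml = speak("You say,", phoneme("pecan", 'pI"kA:n', alphabet="x-sampa") + ".")
    print(ssml)
    # Uncomment to call the service:
    # synthesize_to_file(ssml, "pecan.mp3")
```

Keeping request construction separate from the network call makes the SSML easy to inspect or unit test before any audio is synthesized.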