US Patent 10319365 Text-to-speech processing with emphasized output audio

Systems and methods for generating output audio with emphasized portions are described. Spoken audio is obtained and undergoes speech processing (e.g., ASR and optionally NLU) to produce text. The system may determine that the resulting text includes a portion that should be emphasized (e.g., an interjection), based on one or more of the following: knowledge of the application running on the device that captured the spoken audio, prosodic analysis, or linguistic analysis. The portion of text to be emphasized may be tagged (e.g., using a Speech Synthesis Markup Language (SSML) tag). TTS processing is then performed on the tagged text to create output audio that includes an emphasized portion corresponding to the tagged portion of the text.
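
The tagging step described in the abstract can be illustrated with a minimal sketch. It assumes the text has already been produced by ASR, and it stands in for the linguistic analysis with a simple word-list lookup; the patent does not specify that approach. The function name `tag_emphasis`, the `INTERJECTIONS` list, and the choice of `level="strong"` are all hypothetical and used only for illustration. The `<emphasis>` element and its `level` attribute come from the W3C SSML specification.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical word list for illustration; the patent does not define one.
INTERJECTIONS = ("wow", "oops", "yay", "hooray", "ouch")


def tag_emphasis(text, interjections=INTERJECTIONS, level="strong"):
    """Wrap each known interjection in an SSML <emphasis> tag.

    This is a stand-in for the linguistic analysis mentioned in the
    abstract: it only matches whole words from a fixed list.
    """
    # Build a case-insensitive, whole-word pattern from the word list.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in interjections) + r")\b",
        re.IGNORECASE,
    )
    # Escape XML special characters so the result is well-formed SSML.
    escaped = escape(text)
    tagged = pattern.sub(
        lambda m: f'<emphasis level="{level}">{m.group(0)}</emphasis>',
        escaped,
    )
    return f"<speak>{tagged}</speak>"


if __name__ == "__main__":
    print(tag_emphasis("Wow, that was fast"))
```

The resulting SSML string would then be handed to a TTS engine that honors `<emphasis>`; which engine is used, and how strongly it renders the emphasis, is outside the scope of this sketch.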


