
US Patent 11295721 Generating expressive speech audio from text data


Is a
Patent

Patent attributes

Patent Jurisdiction
United States Patent and Trademark Office
Patent Number
11295721
Patent Inventor Names
Jervis Pinto
Dhaval Shah
Zahra Shakeri
Siddharth Gururani
Navid Aghdaie
Mohsen Sardari
Kilol Gupta
Kazi Zaman
Date of Patent
April 5, 2022
Patent Application Number
16840070
Date Filed
April 3, 2020
Patent Citations
US Patent 10431188 Organization of personalized content
US Patent 10692484 Text-to-speech (TTS) processing
US Patent 10699695 Text-to-speech (TTS) processing
US Patent 10706837 Text-to-speech (TTS) processing
US Patent 10741169 Text-to-speech (TTS) processing
US Patent 10911596 Voice user interface for wired communications system
US Patent 10902841 Personalized custom synthetic speech
US Patent 11069335 Speech synthesis using one or more recurrent neural networks
...
Patent Citations Received
US Patent 11617952 Emotion based music style change using deep learning
Patent Primary Examiner
Khai N. Nguyen
CPC Code
G10L 13/027
G10L 13/00
G06N 3/08
G06N 3/0445
A63F 2300/6018
A63F 13/63
A63F 13/60

A system for use in video game development to generate expressive speech audio comprises a user interface configured to receive user-input text data and a user selection of a speech style. The system includes a machine-learned synthesizer comprising a text encoder, a speech style encoder and a decoder. The machine-learned synthesizer is configured to generate one or more text encodings derived from the user-input text data, using the text encoder of the machine-learned synthesizer; generate a speech style encoding by processing a set of speech style features associated with the selected speech style using the speech style encoder of the machine-learned synthesizer; combine the one or more text encodings and the speech style encoding to generate one or more combined encodings; and decode the one or more combined encodings with the decoder of the machine-learned synthesizer to generate predicted acoustic features. The system includes one or more modules configured to process the predicted acoustic features, the one or more modules comprising a machine-learned vocoder configured to generate a waveform of the expressive speech audio.
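The data flow described in the abstract (text encoder, speech style encoder, combination, decoder, vocoder) can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation: every function body is a toy stand-in for a machine-learned component, and the names, dimensions, and the `STYLE_FEATURES` table are hypothetical. Combining encodings by concatenation is likewise an assumption; the abstract does not say how they are combined.

```python
import math
from typing import Dict, List

Vector = List[float]

# All sizes below are illustrative assumptions, not values from the patent.
TEXT_DIM = 4           # size of each text encoding
N_ACOUSTIC = 3         # acoustic features per predicted frame
SAMPLES_PER_FRAME = 8  # waveform samples emitted per frame

# Hypothetical speech style features keyed by the user-selected style.
STYLE_FEATURES: Dict[str, Vector] = {
    "neutral": [0.0, 0.1, 0.0],
    "angry": [0.9, 0.8, 0.4],
}


def text_encoder(text: str) -> List[Vector]:
    """Toy stand-in: one fixed-size encoding per input character."""
    return [[math.sin(ord(ch) * (i + 1)) for i in range(TEXT_DIM)] for ch in text]


def style_encoder(style_features: Vector) -> Vector:
    """Toy stand-in: summarise the style features as a 2-element encoding."""
    mean = sum(style_features) / len(style_features)
    spread = max(style_features) - min(style_features)
    return [math.tanh(mean), math.tanh(spread)]


def combine(text_encodings: List[Vector], style_encoding: Vector) -> List[Vector]:
    """Assumed combination rule: append the style encoding to each text encoding."""
    return [enc + style_encoding for enc in text_encodings]


def decoder(combined: List[Vector]) -> List[Vector]:
    """Toy stand-in: predict one frame of acoustic features per combined encoding."""
    return [[sum(enc[j::N_ACOUSTIC]) for j in range(N_ACOUSTIC)] for enc in combined]


def vocoder(frames: List[Vector]) -> Vector:
    """Toy stand-in: turn each acoustic frame into a short sinusoidal burst."""
    waveform: Vector = []
    for frame in frames:
        amplitude = math.tanh(frame[0])
        cycles = 1.0 + abs(frame[1])
        for n in range(SAMPLES_PER_FRAME):
            waveform.append(amplitude * math.sin(2 * math.pi * cycles * n / SAMPLES_PER_FRAME))
    return waveform


def synthesize(text: str, style: str) -> Vector:
    """Run the full pipeline: encode, combine, decode, then vocode."""
    text_encodings = text_encoder(text)
    style_encoding = style_encoder(STYLE_FEATURES[style])
    combined = combine(text_encodings, style_encoding)
    acoustic_features = decoder(combined)
    return vocoder(acoustic_features)


if __name__ == "__main__":
    samples = synthesize("Hi!", "angry")
    print(len(samples), "samples")
```

In this sketch the same text produces a different waveform for each selected style because the style encoding is appended to every text encoding before decoding. A real system would replace each stand-in with a trained network.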

