US Patent 11335333 Speech recognition with sequence-to-sequence models

A method includes obtaining audio data for a long-form utterance and segmenting the audio data for the long-form utterance into a plurality of overlapping segments. The method also includes, for each overlapping segment of the plurality of overlapping segments: providing features indicative of acoustic characteristics of the long-form utterance represented by the corresponding overlapping segment as input to an encoder neural network; processing an output of the encoder neural network using an attender neural network to generate a context vector; and generating word elements using the context vector and a decoder neural network. The method also includes generating a transcription for the long-form utterance by merging the word elements from the plurality of overlapping segments and providing the transcription as an output of the automated speech recognition system.

Timeline

No Timeline data yet.

Further Resources

Title

Author

Link

Type

Date

No Further Resources data yet.

US Patent 11335333 Speech recognition with sequence-to-sequence models

Contents

Patent attributes

Timeline

Further Resources

References

Find more entities like US Patent 11335333 Speech recognition with sequence-to-sequence models