US Patent 11978435 Long-context end-to-end speech recognition system

Patent 11978435 was granted by the United States Patent and Trademark Office in May 2024 and assigned to Mitsubishi Electric Research Laboratories.

Article

Patent abstract

This invention relates generally to speech processing and more particularly to end-to-end automatic speech recognition (ASR) that utilizes long contextual information. Some embodiments of the invention provide a system and a method for end-to-end ASR suitable for recognizing long audio recordings such as lectures and conversational speech. This disclosure includes a Transformer-based ASR system that utilizes contextual information, wherein the Transformer accepts multiple utterances at the same time and predicts the transcript for the last utterance. This is repeated in a sliding-window fashion with one-utterance shifts to recognize the entire recording. In addition, when the long audio recording includes multiple speakers, some embodiments of the present invention may use acoustic and/or text features obtained only from previous utterances spoken by the same speaker as the last utterance.
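
The sliding-window procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `Utterance`, `build_windows`, `transcribe_recording`, and `recognize_last`, as well as the `context_size` parameter, are hypothetical and introduced here only for illustration. The model itself is represented by a caller-supplied function that receives a window of utterances and returns a transcript for the last one.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Utterance:
    speaker: str  # speaker label (assumed to be known for each utterance)
    audio: str    # placeholder for acoustic features


def build_windows(
    utterances: Sequence[Utterance],
    context_size: int,
    same_speaker_only: bool = False,
) -> List[List[int]]:
    """Return, for each utterance i, the indices fed to the model.

    The last index in each window is i itself (the utterance to transcribe);
    the preceding indices are up to `context_size` earlier utterances.
    """
    windows = []
    for i, target in enumerate(utterances):
        # Earlier utterances, optionally restricted to the target's speaker.
        candidates = [
            j for j in range(i)
            if not same_speaker_only or utterances[j].speaker == target.speaker
        ]
        # Guard against context_size == 0, since candidates[-0:] is the whole list.
        context = candidates[-context_size:] if context_size > 0 else []
        windows.append(context + [i])
    return windows


def transcribe_recording(
    utterances: Sequence[Utterance],
    recognize_last: Callable[[List[Utterance]], str],
    context_size: int = 2,
    same_speaker_only: bool = False,
) -> List[str]:
    """Slide over the recording one utterance at a time.

    `recognize_last` is a stand-in for the model: it sees the whole window
    but returns a transcript only for the final utterance in it.
    """
    transcripts = []
    for window in build_windows(utterances, context_size, same_speaker_only):
        segment = [utterances[j] for j in window]
        transcripts.append(recognize_last(segment))
    return transcripts
```

With four utterances alternating between two speakers and `context_size=2`, the windows shift by one utterance per step; setting `same_speaker_only=True` keeps only earlier utterances from the same speaker as the one being transcribed.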

Infobox

Is a