US Patent 11676625 Unified endpointer using multitask and multidomain learning

A method for training an endpointer model includes obtaining short-form speech utterances and long-form speech utterances. The method also includes providing a short-form speech utterance as input to a shared neural network, the shared neural network configured to learn shared hidden representations suitable for both voice activity detection (VAD) and end-of-query (EOQ) detection. The method also includes generating, using a VAD classifier, a sequence of predicted VAD labels and determining a VAD loss by comparing the sequence of predicted VAD labels to a corresponding sequence of reference VAD labels. The method also includes generating, using an EOQ classifier, a sequence of predicted EOQ labels and determining an EOQ loss by comparing the sequence of predicted EOQ labels to a corresponding sequence of reference EOQ labels. The method also includes training, using a cross-entropy criterion, the endpointer model based on the VAD loss and the EOQ loss.
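The loss computation described in the abstract can be sketched as follows. This is a minimal illustration only, not the patented implementation: the class `ToyEndpointer`, the single tanh layer standing in for the shared neural network, the binary label sets for both tasks, and the weighted sum used to combine the two losses are all assumptions made for the example. The abstract states only that training is based on both losses under a cross-entropy criterion. The gradient update step is omitted.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Apply an affine map: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sequence_cross_entropy(pred_probs, ref_labels):
    """Mean negative log-likelihood of the reference labels over all frames."""
    nll = -sum(math.log(p[y]) for p, y in zip(pred_probs, ref_labels))
    return nll / len(ref_labels)


class ToyEndpointer:
    """Hypothetical model: one shared layer feeding two classifier heads."""

    def __init__(self, feat_dim, hidden_dim, n_vad=2, n_eoq=2, seed=0):
        # n_vad and n_eoq are assumed class counts, not taken from the patent.
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_shared, self.b_shared = init(hidden_dim, feat_dim), [0.0] * hidden_dim
        self.w_vad, self.b_vad = init(n_vad, hidden_dim), [0.0] * n_vad
        self.w_eoq, self.b_eoq = init(n_eoq, hidden_dim), [0.0] * n_eoq

    def forward(self, frames):
        """Return per-frame VAD and EOQ probabilities from shared features."""
        vad_probs, eoq_probs = [], []
        for x in frames:
            # Shared hidden representation used by both tasks.
            h = [math.tanh(v) for v in linear(self.w_shared, self.b_shared, x)]
            vad_probs.append(softmax(linear(self.w_vad, self.b_vad, h)))
            eoq_probs.append(softmax(linear(self.w_eoq, self.b_eoq, h)))
        return vad_probs, eoq_probs


def multitask_loss(model, frames, ref_vad, ref_eoq, vad_weight=1.0, eoq_weight=1.0):
    """Combine the two cross-entropy losses (the weighted sum is an assumption)."""
    vad_probs, eoq_probs = model.forward(frames)
    vad_loss = sequence_cross_entropy(vad_probs, ref_vad)
    eoq_loss = sequence_cross_entropy(eoq_probs, ref_eoq)
    return vad_weight * vad_loss + eoq_weight * eoq_loss, vad_loss, eoq_loss


# Three made-up acoustic frames with made-up reference labels.
frames = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4], [0.5, 0.1, -0.2]]
ref_vad = [1, 1, 0]  # assumed coding: 1 = speech, 0 = non-speech
ref_eoq = [0, 0, 1]  # assumed coding: 1 = query has ended
model = ToyEndpointer(feat_dim=3, hidden_dim=4)
total, vad_loss, eoq_loss = multitask_loss(model, frames, ref_vad, ref_eoq)
print(f"VAD loss {vad_loss:.4f}, EOQ loss {eoq_loss:.4f}, total {total:.4f}")
```

Because both heads read the same hidden vector `h`, a gradient step on the combined loss would update the shared layer with signal from both tasks, which is the point of the multitask setup.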

