US Patent 11551668 Generating representations of speech signals using self-supervised learning

In one embodiment, a method includes generating audio segments from a speech signal, generating latent representations that respectively correspond to the audio segments, the latent representations comprising a first subset and a second subset, generating quantized representations that respectively correspond to the latent representations, masking the second subset of the latent representations, using a machine-learning model to process the first subset of the latent representations and the masked second subset of the latent representations to generate contextualized representations that respectively correspond to the latent representations, pre-training the machine-learning model based on comparisons between (1) a subset of the contextualized representations that respectively correspond to the masked second subset of the latent representations and (2) a subset of the quantized representations that respectively correspond to the masked second subset of the latent representations, and training the pre-trained machine-learning model to perform a speech analysis task.

Timeline

No Timeline data yet.

Further Resources

Title

Author

Link

Type

Date

No Further Resources data yet.

US Patent 11551668 Generating representations of speech signals using self-supervised learning

Contents

Patent attributes

Timeline

Further Resources

References

Find more entities like US Patent 11551668 Generating representations of speech signals using self-supervised learning