US Patent 11854562 High-quality non-parallel many-to-many voice conversion

A method (and structure and computer product) to permit zero-shot voice conversion with non-parallel data includes receiving source speaker speech data as input data into a content encoder of a style transfer autoencoder system, the content encoder providing a source speaker disentanglement of the source speaker speech data by reducing speaker style information of the input source speech data while retaining content information and receiving target speaker input speech as input data into a target speaker encoder. The output of the content encoder and the target speaker encoder are combined in a decoder of the style transfer autoencoder, and the output of the decoder provides the content information of the input source speech data in a style of the target speaker speech information.

Timeline

No Timeline data yet.

Further Resources

Title

Author

Link

Type

Date

No Further Resources data yet.

US Patent 11854562 High-quality non-parallel many-to-many voice conversion

Contents

Patent attributes

Timeline

Further Resources

References

Find more entities like US Patent 11854562 High-quality non-parallel many-to-many voice conversion