US Patent 10249314 Voice conversion system and method with variance and spectrum compensation

A voice conversion system for generating realistic, natural-sounding target speech is disclosed. The voice conversion system preferably comprises a neural network for converting the source speech data to estimated target speech data; a global variance correction module; a modulation spectrum correction module; and a waveform generator. The global variance correction module is configured to scale and shift (or normalize and de-normalize) the estimated target speech based on (i) a mean and standard deviation of the source speech data, and further based on (ii) a mean and standard deviation of the estimated target speech data. The modulation spectrum correction module is configured to apply a plurality of filters to the estimated target speech data after it has been scaled and shifted by the global variance correction module. Each filter is designed to correct the trajectory representing the curve of one MCEP coefficient over time. Collectively, the plurality of filters are designed to correct the trajectories of each of the MCEP coefficients in the target voice data being generated from the source speech data. Once the MCEP coefficients are corrected, they are then provided to a waveform generator configured to generate the target voice signal that can then be played to the user via a speaker.
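The two correction stages described above can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The function names, the `(frames, coefficients)` array layout, the `eps` guard, and the use of plain FIR taps are all assumptions made for illustration. The abstract says the scale and shift use source and estimated-target statistics; here the de-normalization statistics are passed in as generic `ref_mean` and `ref_std` arguments.

```python
import numpy as np


def global_variance_correction(est, ref_mean, ref_std, eps=1e-8):
    """Scale and shift estimated MCEP features (hypothetical helper).

    est:      array of shape (frames, coeffs), the network's estimated output
    ref_mean: per-coefficient mean used for de-normalization (assumed input)
    ref_std:  per-coefficient standard deviation used for de-normalization
    """
    est_mean = est.mean(axis=0)
    est_std = est.std(axis=0)
    # Normalize with the estimate's own statistics ...
    normalized = (est - est_mean) / (est_std + eps)
    # ... then de-normalize with the reference statistics.
    return normalized * ref_std + ref_mean


def filter_trajectories(mcep, filters):
    """Apply one filter per MCEP coefficient along the time axis.

    mcep:    array of shape (frames, coeffs)
    filters: sequence of 1-D FIR tap arrays, one per coefficient
             (the filter design itself is not specified here)
    """
    out = np.empty_like(mcep, dtype=float)
    for k, taps in enumerate(filters):
        # mode="same" keeps the trajectory length when frames >= len(taps)
        out[:, k] = np.convolve(mcep[:, k], taps, mode="same")
    return out
```

In this sketch the output of `global_variance_correction` would be passed to `filter_trajectories`, matching the order given in the abstract: variance correction first, then per-coefficient trajectory filtering, with the result handed to a waveform generator.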
