Patent attributes
A speech synthesis method, a speech synthesis device, and an electronic apparatus are provided, which relate to a field of speech synthesis. Specific implementation solution is the following: inputting text information into an encoder of an acoustic model, to output a text feature of a current time step; splicing the text feature of the current time step with a spectral feature of a previous time step to obtain a spliced feature of the current time step, and inputting the spliced feature of the current time step into an decoder of the acoustic model to obtain a spectral feature of the current time step; and inputting the spectral feature of the current time step into a neural network vocoder, to output speech.