Other attributes
WaveNet is a deep neural network designed to generate raw audio waveforms. It generates realistic-sounding voices for Google Assistant globally.
It mimics the human voice and produces higher-quality audio that sounds more natural than the best existing text-to-speech systems, reducing the gap with human performance by over 50%.
DeepMind's WaveNet is a convolutional neural network (CNN), a type of feedforward neural network composed of layers of interconnected nodes. The CNN takes a raw signal as input and synthesizes an output one sample at a time. The trained network creates new speech-like waveforms at 16,000 samples per second. The output waveforms include realistic breaths and lip smacks.
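The idea of a convolutional network that produces a waveform sample by sample can be illustrated with a minimal sketch. This is a toy under stated assumptions, not DeepMind's implementation: the function names `causal_conv1d` and `generate`, the hand-picked kernels, the `tanh` nonlinearity, and the added noise are all hypothetical choices made for illustration. The sketch assumes each layer is a causal convolution, meaning an output sample depends only on current and earlier input samples.

```python
import numpy as np

SAMPLE_RATE = 16_000  # samples per second, as stated in the text


def causal_conv1d(signal, kernel, dilation=1):
    """Return out[t] = sum_k kernel[k] * signal[t - k * dilation].

    Samples before the start of the signal are treated as zero, so no
    output value ever depends on a later input sample.
    """
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    pad = (len(kernel) - 1) * dilation
    padded = np.concatenate([np.zeros(pad), signal])
    out = np.zeros(len(signal))
    for k, weight in enumerate(kernel):
        # kernel[k] weights the sample k * dilation steps in the past
        start = pad - k * dilation
        out += weight * padded[start:start + len(signal)]
    return out


def generate(n_samples, layers, seed=0):
    """Toy sample-by-sample generation (illustrative assumption only).

    `layers` is a list of (kernel, dilation) pairs. Each new sample is
    computed from previously generated samples, plus a little noise so
    the output does not stay at zero.
    """
    rng = np.random.default_rng(seed)
    waveform = np.zeros(n_samples)
    for t in range(n_samples):
        # Shift the history by one step so the prediction for sample t
        # only sees samples 0 .. t-1.
        hidden = np.concatenate([[0.0], waveform[:t]])
        for kernel, dilation in layers:
            hidden = np.tanh(causal_conv1d(hidden, kernel, dilation))
        waveform[t] = np.clip(hidden[-1] + 0.1 * rng.standard_normal(), -1.0, 1.0)
    return waveform


# 160 samples correspond to 10 ms of audio at 16,000 samples per second.
toy_layers = [([0.5, 0.5], 1), ([0.5, 0.5], 2)]
wave = generate(SAMPLE_RATE // 100, toy_layers)
```

At the stated rate, one second of audio requires 16,000 such generation steps. The toy loop recomputes every layer over the full history at each step, which keeps the code short but would be far too slow for real use.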
WaveNet was created by researchers at DeepMind in London in 2016. Other products that use text-to-speech (TTS) systems include Apple's Siri, Microsoft's Cortana, and Amazon Alexa, among others.