Patent attributes
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio processing using neural networks. One of the systems includes multiple neural network layers, wherein the neural network system is configured to receive time domain features of an audio sample and to process the time domain features to generate a neural network output for the audio sample, the plurality of neural network layers comprising: a frequency-transform (F-T) layer that is configured to apply a transformation defined by a set of F-T layer parameters that transforms a window of time domain features into frequency domain features; and one or more other neural network layers having respective layer parameters, wherein the one or more neural network layers are configured to process frequency domain features to generate a neural network output.