Patent attributes
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing word sequences using neural networks. One of the methods includes receiving a first sequence of words arranged according to a first order; for each word in the first sequence, beginning with a first word in the first order: determining a topic vector that is associated with the word; generating a combined input from the word and the topic vector; and processing the combined input through one or more sequence modeling layers to generate a sequence modeling output for the word; and processing one or more of the sequence modeling outputs through an output layer to generate a neural network output for the first sequence of words.
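
The method summarized above can be illustrated with a minimal sketch. It is not the claimed implementation: the class name `TopicAwareRNN`, the per-word lookup table of topic vectors, the use of concatenation to form the combined input, the single tanh recurrent layer, the softmax output layer applied to the last sequence modeling output, and all dimensions are assumptions chosen for illustration. The weights are random and untrained.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TopicAwareRNN:
    """Hypothetical model: word + topic vector -> recurrent layer -> output layer."""

    def __init__(self, vocab, embed_dim=4, topic_dim=3, hidden_dim=5,
                 num_classes=2, seed=0):
        rng = random.Random(seed)
        self.vocab = {word: i for i, word in enumerate(vocab)}
        self.hidden_dim = hidden_dim
        # Assumption: one word embedding and one topic vector per vocabulary word.
        self.embed = make_matrix(len(vocab), embed_dim, rng)
        self.topic = make_matrix(len(vocab), topic_dim, rng)
        # A single recurrent sequence modeling layer and a linear output layer.
        self.w_in = make_matrix(hidden_dim, embed_dim + topic_dim, rng)
        self.w_rec = make_matrix(hidden_dim, hidden_dim, rng)
        self.w_out = make_matrix(num_classes, hidden_dim, rng)

    def forward(self, words):
        hidden = [0.0] * self.hidden_dim
        seq_outputs = []
        # Process the words in order, beginning with the first word.
        for word in words:
            idx = self.vocab[word]
            topic_vec = self.topic[idx]             # topic vector for the word
            combined = self.embed[idx] + topic_vec  # list concatenation (assumed)
            pre = [a + b for a, b in zip(matvec(self.w_in, combined),
                                         matvec(self.w_rec, hidden))]
            hidden = [math.tanh(x) for x in pre]    # sequence modeling output
            seq_outputs.append(hidden)
        # Assumption: the output layer reads only the last sequence modeling output.
        network_output = softmax(matvec(self.w_out, seq_outputs[-1]))
        return network_output, seq_outputs


model = TopicAwareRNN(["the", "cat", "sat"])
probs, outputs = model.forward(["the", "cat", "sat"])
print(len(outputs), len(probs))
```

In this sketch the network output is a probability distribution over two hypothetical classes. An output layer that consumes several or all of the sequence modeling outputs, for example to predict a next word at every position, would fit the same description.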