Patent attributes
Systems and methods are described for natural language processing of a text sequence. The system can identify a set of text and location information for the set of text in an image. The set of text may correspond to an input sequence space. The system can project embeddings of the text into a latent space for processing. Further, the system can reproject the processed embeddings from the latent space to the input sequence space. The system may perform multiple stages of projecting the embeddings to the latent space and reprojecting the processed embeddings from the latent space to the input sequence space. The system can route the reprojected embeddings to a neural network that can identify class predictions for elements of the set of text.
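The project, process, and reproject flow described above can be illustrated with a minimal NumPy sketch. The abstract does not specify how projection is performed, so everything concrete here is an assumption for illustration only: cross-attention as the projection mechanism, self-attention as the latent-space processing, a linear layer over bounding-box coordinates as the location encoding, a single linear layer standing in for the classifying neural network, and random untrained weights. All names (cross_attention, stage, classify, w_box, w_out) and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, w_q, w_k, w_v):
    # Single-head scaled dot-product attention; the output has one row per query.
    q, k, v = queries @ w_q, keys_values @ w_k, keys_values @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def make_weights(d):
    # Random, untrained query/key/value matrices (illustrative only).
    return [rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3)]

def stage(tokens, latents, params):
    # 1. Project: latent vectors attend over the input sequence.
    z = latents + cross_attention(latents, tokens, *params["project"])
    # 2. Process: self-attention inside the (smaller) latent space.
    z = z + cross_attention(z, z, *params["process"])
    # 3. Reproject: each input position attends over the processed latents,
    #    so the output has the same length as the input sequence.
    out = tokens + cross_attention(tokens, z, *params["reproject"])
    return out, z

def classify(tokens, w_out):
    # Assumed stand-in for the classifier: one linear layer plus softmax,
    # giving a class distribution per text element.
    return softmax(tokens @ w_out, axis=-1)

# Hypothetical sizes: 6 text elements, 3 latent vectors, 4 classes, 2 stages.
n_tokens, n_latents, d_model, n_classes, n_stages = 6, 3, 16, 4, 2

# Stand-ins for text embeddings and normalized bounding boxes (x0, y0, x1, y1)
# of text found in an image; real values would come from OCR or similar.
text_emb = rng.normal(size=(n_tokens, d_model))
boxes = rng.uniform(size=(n_tokens, 4))
w_box = rng.normal(scale=0.5, size=(4, d_model))
tokens = text_emb + boxes @ w_box  # assumed: location added to text embedding

latents = rng.normal(size=(n_latents, d_model))
stages = [
    {name: make_weights(d_model) for name in ("project", "process", "reproject")}
    for _ in range(n_stages)
]
w_out = rng.normal(scale=d_model ** -0.5, size=(d_model, n_classes))

# Multiple project/reproject stages. Carrying the latents forward from one
# stage to the next is an assumption, not something the abstract states.
for params in stages:
    tokens, latents = stage(tokens, latents, params)

probs = classify(tokens, w_out)      # shape: (n_tokens, n_classes)
predicted = probs.argmax(axis=-1)    # one class index per text element
```

In this sketch the latent space holds fewer vectors than the input sequence, which is one plausible motivation for the round trip: attention cost in the latent space does not grow with sequence length, while reprojection restores one embedding per text element so that each element can receive its own class prediction.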