Patent attributes
There is provided a 2D document extractor for extracting text entities from a structured document. The 2D document extractor includes a first convolutional neural network (CNN), a second CNN, and a third recurrent neural network (RNN). A plurality of text sequences, together with structural elements indicative of the location of the text sequences in the document, is received. The first CNN encodes the text sequences and structural elements to obtain a 3D encoded image that is indicative of semantic characteristics of the text sequences and that preserves the structure of the document. The second CNN compresses the 3D encoded image to obtain a feature vector indicative of a combination of spatial characteristics and semantic characteristics of the 3D encoded image. The third RNN decodes the feature vector to extract the text entities, each text entity being associated with a text sequence.
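The data flow described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the input format (row, column, text), the function names, and the sample cells are all hypothetical. A character histogram stands in for the learned encoding of the first CNN, 2x2 average pooling stands in for the second CNN, and the RNN decoder is omitted. The sketch only shows how text sequences placed at their document locations yield a height x width x channels grid, and how that grid can be reduced to a feature vector that still reflects both position and content.

```python
from typing import List, Tuple

# Hypothetical input format: (row, col, text sequence) for each document cell.
Cell = Tuple[int, int, str]


def toy_embed(text: str, channels: int) -> List[float]:
    """Stand-in for a learned encoder: normalized character-code histogram."""
    vec = [0.0] * channels
    for ch in text:
        vec[ord(ch) % channels] += 1.0
    total = sum(vec) or 1.0  # avoid division by zero for empty text
    return [v / total for v in vec]


def encode_grid(cells: List[Cell], height: int, width: int,
                channels: int) -> List[List[List[float]]]:
    """Build a height x width x channels grid that mirrors the document layout."""
    grid = [[[0.0] * channels for _ in range(width)] for _ in range(height)]
    for row, col, text in cells:
        grid[row][col] = toy_embed(text, channels)
    return grid


def compress(grid: List[List[List[float]]], pool: int = 2) -> List[float]:
    """Stand-in for the second CNN: average-pool pool x pool blocks, then flatten.

    Pooling per block (rather than globally) keeps coarse spatial information
    in the resulting feature vector.
    """
    height, width, channels = len(grid), len(grid[0]), len(grid[0][0])
    features: List[float] = []
    for r0 in range(0, height, pool):
        for c0 in range(0, width, pool):
            for k in range(channels):
                block = [grid[r][c][k]
                         for r in range(r0, min(r0 + pool, height))
                         for c in range(c0, min(c0 + pool, width))]
                features.append(sum(block) / len(block))
    return features


# Made-up sample cells on a 4 x 4 layout, for illustration only.
cells = [(0, 0, "Invoice"), (0, 3, "No. 1042"), (3, 2, "Total"), (3, 3, "99.00")]
grid = encode_grid(cells, height=4, width=4, channels=8)
features = compress(grid)

print(len(grid), len(grid[0]), len(grid[0][0]))  # 4 4 8
print(len(features))  # 32 (2 x 2 pooled blocks, 8 channels each)
```

In a real system, both stand-ins would be trained convolutional networks, and a recurrent decoder would consume the feature vector to emit the text entities.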