US Patent 11481605 2D document extractor

There is provided a 2D document extractor for extracting text entities from a structured document. The 2D document extractor includes a first convolutional neural network (CNN), a second CNN, and a third recurrent neural network (RNN). A plurality of text sequences, and structural elements indicative of the location of the text sequences in the document, are received. The first CNN encodes the text sequences and structural elements to obtain a 3D encoded image that is indicative of semantic characteristics of the text sequences and has the structure of the document. The second CNN compresses the 3D encoded image to obtain a feature vector indicative of a combination of spatial characteristics and semantic characteristics of the 3D encoded image. The third RNN decodes the feature vector to extract the text entities, a given text entity being associated with a text sequence.
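The data flow of the first two stages can be illustrated with a small shape-level sketch. This is not the patented method and contains no neural networks: a hash-based toy embedding stands in for the first CNN's learned encoding, and row-wise average pooling stands in for the second CNN's compression. The names `TextBox`, `embed`, `encode_grid`, and `compress`, as well as the grid coordinates, are hypothetical and chosen only for illustration. The RNN decoding stage is omitted.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class TextBox:
    """A text sequence plus its (hypothetical) grid location in the document."""
    text: str
    row: int
    col: int


def embed(text, dim):
    """Toy deterministic embedding; a stand-in for learned semantic features.

    Assumes dim <= 32, the length of a SHA-256 digest in bytes.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def encode_grid(boxes, height, width, dim):
    """Build a height x width x dim "3D encoded image".

    Each text embedding is written at the cell where the text appears,
    so the tensor keeps the document's 2D layout. Empty cells stay zero.
    """
    grid = [[[0.0] * dim for _ in range(width)] for _ in range(height)]
    for box in boxes:
        grid[box.row][box.col] = embed(box.text, dim)
    return grid


def compress(grid):
    """Compress the grid into a flat feature vector of length height * dim.

    Averages each channel across the columns of every row, then concatenates
    the rows. The result mixes coarse spatial information (which row) with
    semantic information (the channel values).
    """
    vector = []
    for row in grid:
        width = len(row)
        dim = len(row[0])
        for channel in range(dim):
            vector.append(sum(cell[channel] for cell in row) / width)
    return vector


boxes = [TextBox("Invoice", 0, 0), TextBox("Total: 42.00", 1, 2)]
grid = encode_grid(boxes, height=2, width=3, dim=4)
feature_vector = compress(grid)
```

In this sketch the feature vector has `height * dim` entries, so the 2 x 3 x 4 grid above yields 8 values. A real system would learn both the encoding and the compression from data rather than use fixed functions.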

