Patent attributes
A handwritten text processing system processes a digitized document including handwritten text input to generate an output version of the digitized document that allows users to execute text processing functions on the textual content of the digitized document. Each word of the digitized data is extracted by converting the digitized document into images, binarizing the images, and segmenting the images into binary image patches. Each binary image patch is further processed to identify if the word is machine-generated or if the word is handwritten. The output version is generated by combining underlying images of the pages of the digitized document with words from the pages superimposed in a transparent font at positions that coincide with the positions of the words in the underlying images.