Systems and methods for categorizing patterns of characters in a document by utilizing machine-based learning techniques include: generating character classification training data; building a character classification model based on the character classification training data; obtaining an image that includes a pattern of characters, the characters including one or more contours; applying the character classification model to the image to classify the contours by assigning labels to them; and applying the labels to clusters of the contours.
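
By way of illustration only, the sequence of steps recited above might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the feature vectors (aspect ratio, fill ratio), the class names "digit" and "letter", the nearest-centroid model, the horizontal-gap clustering rule, and the majority vote used to label each cluster are all hypothetical choices made for the example. Contours are represented as already-extracted records, standing in for the output of a contour detection step on the obtained image.

```python
from collections import Counter


def generate_training_data():
    # Hypothetical training data: (aspect_ratio, fill_ratio) -> class label.
    return [
        ((0.50, 0.40), "digit"),
        ((0.60, 0.45), "digit"),
        ((0.55, 0.42), "digit"),
        ((0.90, 0.60), "letter"),
        ((1.00, 0.65), "letter"),
        ((0.95, 0.62), "letter"),
    ]


def build_model(training_data):
    # Assumed model: one centroid (mean feature vector) per class.
    sums = {}
    for features, label in training_data:
        total, count = sums.get(label, ((0.0,) * len(features), 0))
        total = tuple(t + f for t, f in zip(total, features))
        sums[label] = (total, count + 1)
    return {
        label: tuple(t / count for t in total)
        for label, (total, count) in sums.items()
    }


def classify(model, features):
    # Label a contour with the class whose centroid is nearest.
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(model, key=lambda label: squared_distance(model[label]))


def cluster_by_gap(contours, max_gap):
    # Assumed clustering rule: scan left to right and start a new
    # cluster whenever the horizontal gap exceeds max_gap.
    clusters = []
    for contour in sorted(contours, key=lambda c: c["x"]):
        if clusters:
            previous = clusters[-1][-1]
            gap = contour["x"] - (previous["x"] + previous["w"])
            if gap <= max_gap:
                clusters[-1].append(contour)
                continue
        clusters.append([contour])
    return clusters


def label_clusters(model, contours, max_gap=5):
    # Apply per-contour labels to each cluster by majority vote (assumption).
    results = []
    for cluster in cluster_by_gap(contours, max_gap):
        votes = Counter(classify(model, c["features"]) for c in cluster)
        cluster_label = votes.most_common(1)[0][0]
        results.append((cluster_label, [c["x"] for c in cluster]))
    return results


# Hypothetical contours, as if extracted from an obtained image.
example_contours = [
    {"x": 0, "w": 8, "features": (0.52, 0.41)},
    {"x": 10, "w": 8, "features": (0.58, 0.44)},
    {"x": 20, "w": 8, "features": (0.93, 0.61)},
    {"x": 60, "w": 10, "features": (0.97, 0.63)},
    {"x": 72, "w": 10, "features": (0.92, 0.60)},
]

model = build_model(generate_training_data())
result = label_clusters(model, example_contours)
print(result)
```

In this sketch the first three contours fall into one cluster and the last two into another, and each cluster receives the label held by the majority of its contours. A practical embodiment would likely substitute a trained classifier and an image-based contour extractor for the stand-ins used here.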