Patent attributes
An apparatus, computer-readable medium, and computer-implemented method for postal address identification, including receiving one or more sequences of tokens corresponding to candidate postal address data objects, evaluating the sequences of tokens with a statistical postal address model to identify candidate postal address data objects, computing candidate vectors corresponding to the identified candidate postal address data objects in a vector space, and determining whether the identified candidate postal address data objects correspond to a postal address by applying outlier detection methods to the candidate vectors and one or more clusters of vectors in the vector space.
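
The final step of the abstract above, outlier detection of a candidate vector against clusters, can be illustrated with a minimal Python sketch. The abstract does not specify the statistical model, how vectors are computed, or which outlier detection method is used, so everything here is an assumption for illustration: the function names (`centroid`, `cluster_limit`, `is_postal_address`), the Euclidean distance to the cluster centroid, and the cutoff of mean distance plus `k` standard deviations are hypothetical choices, not the patented method.

```python
import math
from statistics import mean, pstdev


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cluster_limit(members, k=3.0):
    """Return (centroid, distance cutoff) for one cluster.

    Assumed rule: a point belongs to the cluster if its distance to the
    centroid is at most mean + k * stddev of the members' own distances.
    """
    c = centroid(members)
    dists = [math.dist(v, c) for v in members]
    return c, mean(dists) + k * pstdev(dists)


def is_postal_address(candidate, clusters, k=3.0):
    """True if the candidate vector is an inlier of at least one cluster."""
    for members in clusters:
        c, limit = cluster_limit(members, k)
        if math.dist(candidate, c) <= limit:
            return True
    return False


# Hypothetical 2-D clusters standing in for vectors of known postal addresses.
clusters = [
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[10.0, 10.0], [11.0, 10.0], [10.0, 11.0], [11.0, 11.0]],
]

near = is_postal_address([0.6, 0.4], clusters)  # close to the first cluster
far = is_postal_address([5.0, 5.0], clusters)   # between clusters, an outlier
```

In this sketch a candidate is accepted as soon as it falls inside any one cluster's cutoff; a real system might instead use a density-based or model-based outlier test, which the abstract leaves open.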