Patent attributes
An application classifier classifies applications using their latent semantic indexing (LSI) vectors. The application classifier uses a machine-learned model generated from pairs of LSI vectors of applications in positive and negative training sets, where the positive training set includes applications within a desired category and the negative training set includes applications outside of the desired category. For a given application, the application classifier determines whether the application belongs to the desired category based on the similarity, as determined by the machine-learned model, between the LSI vector of the application and the LSI vectors of positive and negative exemplar applications. If the LSI vector of the application is similar to the LSI vector of at least one positive exemplar application and not similar to the LSI vector of any negative exemplar application, the application is determined to belong to the desired category.
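
The decision rule described above can be sketched in Python as follows. This is only an illustrative sketch, not the claimed implementation: the abstract does not specify the form of the machine-learned model, so a cosine-similarity threshold stands in for it here. The names `cosine_similarity`, `make_threshold_model`, and `belongs_to_category`, the threshold value, and the toy three-dimensional vectors are all assumptions introduced for illustration.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
# A pairwise similarity model: takes two LSI vectors, returns True if "similar".
SimilarityModel = Callable[[Vector, Vector], bool]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def make_threshold_model(threshold: float) -> SimilarityModel:
    """Hypothetical stand-in for the machine-learned model (assumption):
    two vectors count as similar when their cosine similarity meets the threshold."""
    def is_similar(a: Vector, b: Vector) -> bool:
        return cosine_similarity(a, b) >= threshold
    return is_similar


def belongs_to_category(
    app_vector: Vector,
    positive_exemplars: Sequence[Vector],
    negative_exemplars: Sequence[Vector],
    is_similar: SimilarityModel,
) -> bool:
    """Apply the rule from the text: similar to at least one positive
    exemplar AND not similar to any negative exemplar."""
    matches_positive = any(is_similar(app_vector, p) for p in positive_exemplars)
    matches_negative = any(is_similar(app_vector, n) for n in negative_exemplars)
    return matches_positive and not matches_negative


# Toy exemplar LSI vectors (made up for illustration).
positives = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
negatives = [[0.0, 1.0, 0.0]]
model = make_threshold_model(0.7)

near_positive = belongs_to_category([0.95, 0.05, 0.0], positives, negatives, model)
near_negative = belongs_to_category([0.1, 0.9, 0.0], positives, negatives, model)
near_both = belongs_to_category([0.7, 0.7, 0.0], positives, negatives, model)
```

In this sketch, an application whose vector is close to both a positive and a negative exemplar is rejected, because the rule requires dissimilarity to every negative exemplar. A learned pairwise model could replace `make_threshold_model` without changing `belongs_to_category`, since the rule depends only on the similar/not-similar decision for each pair.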