Patent attributes
A method includes performing, by a computing device, a clustering operation to group documents of a document corpus into clusters in a feature vector space. The document corpus includes one or more labeled documents and one or more unlabeled documents. Each of the one or more labeled documents is assigned to a corresponding class in classification data associated with the document corpus, and each of the one or more unlabeled documents is not assigned to any class in the classification data. The method also includes generating, by the computing device, a prompt requesting classification of a particular document of the document corpus, where the particular document is selected based on a distance between the particular document and a labeled document of the one or more labeled documents.
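The method described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not name a clustering algorithm, a distance metric, or a selection rule, so the sketch assumes a small k-means, Euclidean distance, and a policy of prompting for the unlabeled document that lies farthest from its nearest labeled document. The function names (`kmeans`, `select_document`, `build_prompt`), the toy feature vectors, and the class names are all hypothetical. In this sketch the cluster assignment is only reported in the prompt and does not drive the selection.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(vectors, k, iterations=10):
    """Tiny k-means, seeded with the first k vectors so it is deterministic."""
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: euclidean(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

def select_document(vectors, labels):
    """Assumed policy: pick the unlabeled document whose nearest
    labeled document is farthest away."""
    labeled = [i for i in range(len(vectors)) if i in labels]
    unlabeled = [i for i in range(len(vectors)) if i not in labels]

    def nearest_labeled_distance(i):
        return min(euclidean(vectors[i], vectors[j]) for j in labeled)

    return max(unlabeled, key=nearest_labeled_distance)

def build_prompt(doc_id, cluster_id, classes):
    """Format a hypothetical classification request for a reviewer."""
    options = ", ".join(sorted(classes))
    return (f"Please classify document {doc_id} (cluster {cluster_id}). "
            f"Known classes: {options}.")

# Toy corpus: six documents as 2-D feature vectors (made-up values).
vectors = [(0.0, 0.0), (0.1, 0.2), (0.3, 0.1),
           (5.0, 5.0), (5.2, 4.9), (9.0, 9.0)]
# Classification data: document index -> class; the rest are unlabeled.
labels = {0: "invoice", 3: "contract"}

clusters = kmeans(vectors, k=2)
chosen = select_document(vectors, labels)
prompt = build_prompt(chosen, clusters[chosen], set(labels.values()))
print(prompt)
```

Other selection rules fit the same wording, for example prompting for the unlabeled document closest to a labeled one in order to confirm a likely class; only the `max` in `select_document` would change to `min`.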