Patent attributes
Provided is a process including: obtaining a corpus having a plurality of document collections, each of which is associated with features; for a given document collection, computing a pertinence score for each feature; ranking the features based on the features' pertinence scores; selecting a first set of features based on a first coverage score thereof and a threshold; re-ranking the first set of features based on the features' relevance to the document collection; and selecting a second set of features from the first set of features based on a second coverage score thereof and the threshold, the second set of features being used for summarizing the document collection.
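The two-stage selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the abstract does not define how the pertinence, relevance, or coverage scores are computed. Here the coverage score is assumed to be the fraction of the collection's documents containing at least one selected feature, and the pertinence and relevance scores are assumed to be supplied as precomputed dictionaries. The names `select_by_coverage`, `summarize_collection`, `feature_docs`, `pertinence`, and `relevance` are hypothetical.

```python
from typing import Dict, List, Set


def select_by_coverage(ranked: List[str], feature_docs: Dict[str, Set[int]],
                       n_docs: int, threshold: float) -> List[str]:
    """Take features in ranked order until coverage reaches the threshold.

    Assumption: coverage = fraction of documents in the collection that
    contain at least one of the features selected so far.
    """
    covered: Set[int] = set()
    selected: List[str] = []
    for feature in ranked:
        if n_docs and len(covered) / n_docs >= threshold:
            break
        selected.append(feature)
        covered |= feature_docs[feature]
    return selected


def summarize_collection(feature_docs: Dict[str, Set[int]],
                         pertinence: Dict[str, float],
                         relevance: Dict[str, float],
                         n_docs: int, threshold: float) -> List[str]:
    """Rank by pertinence, select, re-rank by relevance, select again."""
    # Stage 1: rank all features by (assumed precomputed) pertinence score.
    ranked = sorted(feature_docs, key=lambda f: pertinence[f], reverse=True)
    first_set = select_by_coverage(ranked, feature_docs, n_docs, threshold)
    # Stage 2: re-rank the first set by relevance, reusing the same threshold.
    reranked = sorted(first_set, key=lambda f: relevance[f], reverse=True)
    return select_by_coverage(reranked, feature_docs, n_docs, threshold)


# Toy collection of 4 documents (ids 0..3); all values are made up.
feature_docs = {"a": {0}, "b": {1}, "c": {0, 1, 2}, "d": {3}}
pertinence = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}
relevance = {"a": 0.5, "b": 0.4, "c": 0.9, "d": 0.3}

result = summarize_collection(feature_docs, pertinence, relevance,
                              n_docs=4, threshold=0.75)
print(result)  # ['c']
```

In this toy run, the first pass keeps "a", "b", and "c" before coverage reaches 0.75; after re-ranking by relevance, "c" alone reaches the same threshold, so the second set is smaller than the first.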