Patent attributes
Systems and methods are provided for generating a set of classifiers. A term is identified within a document and a pre-defined threshold distance is determined. A plurality of additional terms in the document are identified, the additional terms being located within the pre-defined threshold distance of the term. A distance between the term and an additional term of the plurality of additional terms is calculated. A corresponding weight for the calculated distance is determined using a proximity weighting scheme. A score for the additional term is calculated using the calculated distance and the corresponding weight. A colocation matrix is generated, and a classifier is determined using the colocation matrix.
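
The steps summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not specify the proximity weighting scheme, the scoring formula, or how the classifier is derived from the matrix. The sketch assumes distance is measured in tokens, that the weight decays linearly with distance, that a term's score is the sum of its weights over all occurrences, and that classifier candidates are simply the highest-scoring neighbors. All function names (proximity_weight, colocation_scores, colocation_matrix, top_terms) are hypothetical.

```python
from collections import defaultdict


def proximity_weight(distance, threshold):
    """Assumed linear-decay scheme: 1.0 for an adjacent token, 1/threshold at the threshold."""
    return (threshold - distance + 1) / threshold


def colocation_scores(tokens, term, threshold):
    """Score every token found within `threshold` positions of each occurrence of `term`."""
    scores = defaultdict(float)
    for i, token in enumerate(tokens):
        if token != term:
            continue
        start = max(0, i - threshold)
        end = min(len(tokens), i + threshold + 1)
        for j in range(start, end):
            if tokens[j] == term:
                continue  # skip the term itself and its repeats
            distance = abs(j - i)  # distance in tokens (assumption)
            scores[tokens[j]] += proximity_weight(distance, threshold)
    return dict(scores)


def colocation_matrix(tokens, terms, threshold):
    """One row per term, mapping each nearby token to its accumulated score."""
    return {term: colocation_scores(tokens, term, threshold) for term in terms}


def top_terms(row, k):
    """Assumed classifier selection: the k highest-scoring neighbors, ties broken alphabetically."""
    ranked = sorted(row.items(), key=lambda item: (-item[1], item[0]))
    return [token for token, _ in ranked[:k]]


tokens = "the cat sat on the mat".split()
matrix = colocation_matrix(tokens, ["cat"], threshold=2)
candidates = top_terms(matrix["cat"], k=2)
```

With a threshold of 2, the tokens adjacent to "cat" receive the full weight and the token two positions away receives half of it. A different weighting scheme, such as exponential decay, could be substituted by replacing proximity_weight alone.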