Patent attributes
Embodiments are directed to representing documents using document keys. Documents that include one or more clauses may be provided. A clause type for each of the one or more clauses in the documents may be determined based on one or more classification models. One or more clause identifiers may be associated with the one or more clauses based on the clause type of each clause. A document key may be generated for each document based on an ordered collection of the one or more clauses included in that document, such that each clause identifier is positioned in the document key according to the location of its corresponding clause in the document. The documents may be analyzed based on evaluations of the one or more document keys corresponding to the documents. One or more reports may be generated based on one or more results of the analysis.
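The flow described above (classify each clause, map its type to an identifier, join the identifiers in clause order into a document key, then compare keys) may be illustrated by the following minimal Python sketch. It is not the claimed implementation. The keyword matcher `classify_clause` merely stands in for the one or more classification models, and the identifier table `CLAUSE_IDS`, the `-` separator, and all function names are assumptions introduced for illustration only.

```python
# Hypothetical mapping from clause type to clause identifier (assumed, not from the source).
CLAUSE_IDS = {
    "indemnification": "IND",
    "termination": "TRM",
    "confidentiality": "CNF",
    "unknown": "UNK",
}

# Hypothetical keyword stems used by the stand-in classifier below.
KEYWORDS = {
    "indemnif": "indemnification",
    "terminat": "termination",
    "confidential": "confidentiality",
}


def classify_clause(text):
    """Stand-in for a trained classification model: simple keyword matching."""
    lowered = text.lower()
    for stem, clause_type in KEYWORDS.items():
        if stem in lowered:
            return clause_type
    return "unknown"


def document_key(clauses):
    """Build a document key: one clause identifier per clause, in document order."""
    return "-".join(CLAUSE_IDS[classify_clause(clause)] for clause in clauses)


def group_by_key(documents):
    """Toy analysis step: group document names that share the same document key."""
    groups = {}
    for name, clauses in documents.items():
        groups.setdefault(document_key(clauses), []).append(name)
    return groups
```

Under these assumptions, two documents whose clauses have the same types in the same order yield the same key even when their wording differs, whereas reordering the clauses yields a different key. A report could then be built from the groups returned by `group_by_key`.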