Patent attributes
Systems and methods include extraction of a plurality of clauses from each of a plurality of electronic documents, determination, for each of the plurality of clauses and using a machine-learned algorithm, of an associated clause type, identification of one or more data privacy protection entities present within each of one or more of the plurality of clauses, determination, for each of the one or more of the plurality of clauses, of a weighted frequency for each of the one or more data privacy protection entities present within the clause based on a type of the data privacy protection entity, determination of a weighted frequency associated with each of the plurality of electronic documents based on the determined weighted frequency for each of the one or more data privacy protection entities present within clauses of the plurality of electronic documents, and storage of an identifier of each of the plurality of electronic documents in association with a respective determined weighted frequency.
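
The steps recited above can be sketched in Python as a rough illustration. Everything concrete here is an assumption and not taken from the source: the entity lexicon, the per-type weights, the period-based clause splitter, the keyword rule standing in for the machine-learned clause classifier, and the choice to aggregate by simple summation. All function and variable names are hypothetical.

```python
from __future__ import annotations

# Hypothetical weights per entity type; the source gives no actual values.
ENTITY_TYPE_WEIGHTS = {"personal_data": 3.0, "regulation": 2.0, "role": 1.0}

# Hypothetical lexicon mapping lowercase surface terms to entity types.
ENTITY_LEXICON = {
    "personal data": "personal_data",
    "gdpr": "regulation",
    "data controller": "role",
}


def extract_clauses(document_text: str) -> list[str]:
    """Split a document into clauses (assumption: one clause per sentence)."""
    return [part.strip() for part in document_text.split(".") if part.strip()]


def classify_clause(clause: str) -> str:
    """Keyword stand-in for the machine-learned clause-type classifier."""
    text = clause.lower()
    if any(term in text for term in ENTITY_LEXICON):
        return "data_protection"
    return "other"


def clause_weighted_frequencies(clause: str) -> dict[str, float]:
    """Weighted frequency of each entity in a clause: count * type weight."""
    text = clause.lower()
    weighted = {}
    for term, entity_type in ENTITY_LEXICON.items():
        count = text.count(term)  # non-overlapping occurrences
        if count:
            weighted[term] = count * ENTITY_TYPE_WEIGHTS[entity_type]
    return weighted


def document_weighted_frequency(clauses: list[str]) -> float:
    """Aggregate clause-level values (assumption: plain sum over clauses)."""
    return sum(
        (sum(clause_weighted_frequencies(c).values()) for c in clauses), 0.0
    )


def score_documents(documents: dict[str, str]) -> dict[str, float]:
    """Store each document identifier with its weighted frequency."""
    return {
        doc_id: document_weighted_frequency(extract_clauses(text))
        for doc_id, text in documents.items()
    }


if __name__ == "__main__":
    docs = {
        "doc-1": "The data controller shall process personal data under GDPR. "
                 "Personal data must be deleted on request.",
        "doc-2": "Payment is due in 30 days.",
    }
    print(score_documents(docs))  # {'doc-1': 9.0, 'doc-2': 0.0}
```

In this sketch the returned dictionary plays the role of the stored association between document identifier and weighted frequency; a real system would presumably persist it in a database or index, and might normalize by document length or combine clause scores differently.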