Patent attributes
Systems and methods are described for generating record clusters. A method comprises receiving a plurality of records from data sources and providing at least a subset of the records to a scoring model that determines scores for various pairings of the records, where the score for a given pair of records represents the probability that both records contain data elements about the same entity. The method further comprises generating a graph data structure that includes a plurality of nodes, each node representing a different one of the records. The method also comprises assigning a different unique identifier to each of the final clusters and responding to a request for data regarding a given entity by providing aggregated data elements from the records associated with the final cluster whose identifier represents the given entity.
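The flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not say how the scoring model works, how graph edges are formed, or how the final clusters are derived from the graph. The sketch therefore assumes a toy field-matching score, an edge between any pair scoring at or above a hypothetical `threshold`, and connected components as the final clusters. All function names are illustrative.

```python
import uuid
from collections import defaultdict
from itertools import combinations


def score_pair(a, b):
    """Toy stand-in for the scoring model (assumption): the fraction of
    shared fields whose values match, read as a same-entity probability."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    return sum(a[k] == b[k] for k in shared) / len(shared)


def build_graph(records, threshold=0.5):
    """One node per record; an edge joins pairs scoring >= threshold
    (the threshold rule is an assumption, not stated in the abstract)."""
    graph = {i: set() for i in range(len(records))}
    for i, j in combinations(range(len(records)), 2):
        if score_pair(records[i], records[j]) >= threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def connected_components(graph):
    """Treat each connected component as a final cluster (assumption)."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        clusters.append(sorted(component))
    return clusters


def assign_ids(clusters):
    """Give every final cluster a different unique identifier."""
    return {str(uuid.uuid4()): members for members in clusters}


def aggregate(records, members):
    """Merge the data elements of all records in one cluster."""
    merged = defaultdict(set)
    for i in members:
        for key, value in records[i].items():
            merged[key].add(value)
    return {key: sorted(values) for key, values in merged.items()}


if __name__ == "__main__":
    records = [
        {"name": "Ann Lee", "email": "ann@example.com"},
        {"name": "Ann Lee", "phone": "555-0100"},
        {"name": "Bob Roy", "email": "bob@example.com"},
    ]
    clusters_by_id = assign_ids(connected_components(build_graph(records)))
    # Respond to a request for one entity by aggregating its cluster's records.
    for cluster_id, members in clusters_by_id.items():
        print(cluster_id, aggregate(records, members))
```

Connected components are the simplest way to turn pairwise links into clusters, but they merge records transitively, so a single spurious high score can join two unrelated entities. A production system would likely refine the graph before fixing the final clusters.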