Patent 11675766 was granted and assigned to Amazon on June, 2023 by the United States Patent and Trademark Office.
A hierarchical representation of an input data set comprising similarity scores for respective entity pairs is generated iteratively. In a particular iteration, clusters are obtained from a subset of the iteration's input entity pairs which satisfy a similarity criterion, and then spanning trees are generated for at least some of the clusters. An indication of at least a representative pair of one or more of the clusters is added to the hierarchical representation in the iteration. The hierarchical representation is used to respond to clustering requests.