Patent attributes
A computer system generates a similarity-optimized hierarchy for hierarchical data to improve data access performance and content discovery. An example method includes receiving hierarchical data in an original hierarchy having a plurality of nodes and a depth of d, generating a respective embedding for each node of the plurality of nodes, and determining, for each node of the plurality of nodes, respective k-nearest neighbors based on the respective embedding. Starting with nodes at depth d in the original hierarchy, the method includes generating sibling groups, each sibling group having at least one node at depth d, identifying, for each node at depth d, a similarity-optimized parent from depth d−1, and associating each node at depth d with its respective similarity-optimized parent in a similarity-optimized hierarchy. The method also includes completing the similarity-optimized hierarchy by repeating the generating, identifying, and associating with nodes at depth d−1 until reaching the hierarchy root.
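The bottom-up procedure described above can be illustrated with a minimal Python sketch. The abstract does not state how sibling groups are formed or how a parent is scored, so the rules used here are assumptions made only for illustration: nodes at the same depth are grouped when one appears in the other's k-nearest-neighbor set, and each group is attached to the node at depth d−1 whose embedding has the highest cosine similarity to the group centroid. All function names, the toy embeddings, and the value of k are hypothetical.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def knn(embeddings, k):
    """Brute-force k-nearest neighbors for every node (by cosine similarity)."""
    result = {}
    for node, emb in embeddings.items():
        others = sorted((m for m in embeddings if m != node),
                        key=lambda m: cosine(emb, embeddings[m]), reverse=True)
        result[node] = set(others[:k])
    return result

def sibling_groups(level_nodes, neighbors):
    """ASSUMED grouping rule: connected components over kNN links within one depth."""
    seen, groups = set(), []
    for start in level_nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            n = stack.pop()
            group.append(n)
            for m in level_nodes:
                if m not in seen and (m in neighbors[n] or n in neighbors[m]):
                    seen.add(m)
                    stack.append(m)
        groups.append(group)
    return groups

def rebuild(depth_of, embeddings, k=2):
    """Return {node: similarity-optimized parent}, working from the deepest level up."""
    neighbors = knn(embeddings, k)
    by_depth = defaultdict(list)
    for node, depth in depth_of.items():
        by_depth[depth].append(node)
    new_parent = {}
    for depth in range(max(by_depth), 0, -1):
        candidates = by_depth[depth - 1]  # possible parents one level up
        for group in sibling_groups(by_depth[depth], neighbors):
            dim = len(embeddings[group[0]])
            centroid = [sum(embeddings[n][i] for n in group) / len(group)
                        for i in range(dim)]
            # ASSUMED scoring rule: parent most similar to the group centroid.
            best = max(candidates, key=lambda c: cosine(centroid, embeddings[c]))
            for n in group:
                new_parent[n] = best
    return new_parent

# Toy data (hypothetical): depth of each node in the original hierarchy
# and a 2-dimensional embedding per node.
depth_of = {"root": 0, "animals": 1, "vehicles": 1, "cat": 2, "dog": 2, "car": 2}
embeddings = {
    "root": [0.5, 0.5], "animals": [1.0, 0.0], "vehicles": [0.0, 1.0],
    "cat": [0.9, 0.1], "dog": [0.8, 0.2], "car": [0.1, 0.9],
}
new_parent = rebuild(depth_of, embeddings, k=2)
```

In this sketch only the depth of each node is taken from the original hierarchy; the original parent links are not consulted, which is what allows a node to be moved under a more similar parent. A practical system would likely replace the brute-force neighbor search with an approximate nearest-neighbor index.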