Patent attributes
A machine-learning model may be trained in advance with a supervised learning algorithm to identify whether a pair of labels provided as input is similar. A locality-sensitive hashing (LSH) forest may be generated for a set of candidate labels. When a user later identifies an input label (e.g., by search query, by interface selection, etc.), the input label may be used to query the LSH forest to identify a subset of the candidate labels. This subset may be used to generate respective pairs, each comprising the input label, one of the candidate labels of the subset, and a corresponding feature set generated for the pair. This data may be provided to the model to identify a degree to which the labels of each pair are similar. The user may be provided one or more recommendations including similar terms identified from the model's output.
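
The flow described above may be illustrated by the following minimal sketch, which is offered under stated assumptions and is not the claimed implementation. A simple banded MinHash index stands in for the LSH forest, and a placeholder scoring function stands in for the previously trained model. All names (`BandedLSHIndex`, `features`, `score_pair`, `recommend`), the character-trigram features, and the constants `NUM_HASHES` and `BANDS` are hypothetical and chosen only for illustration.

```python
import hashlib
from collections import defaultdict

NUM_HASHES = 32  # assumed MinHash signature length
BANDS = 16       # assumed band count (2 signature rows per band)


def shingles(label, n=3):
    """Return the set of character n-grams of a lower-cased, padded label."""
    text = f"  {label.lower()} "
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    digest = hashlib.md5(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(label):
    """MinHash signature: the minimum hash of the shingles for each seed."""
    grams = shingles(label)
    return tuple(min(_hash(seed, g) for g in grams) for seed in range(NUM_HASHES))


class BandedLSHIndex:
    """Simplified stand-in for an LSH forest: labels sharing a band collide."""

    def __init__(self, labels):
        self.rows = NUM_HASHES // BANDS
        self.buckets = defaultdict(set)
        for label in labels:
            for key in self._keys(minhash(label)):
                self.buckets[key].add(label)

    def _keys(self, signature):
        for band in range(BANDS):
            start = band * self.rows
            yield (band, signature[start:start + self.rows])

    def query(self, label):
        """Return the subset of candidate labels colliding with the input."""
        found = set()
        for key in self._keys(minhash(label)):
            found |= self.buckets.get(key, set())
        return found


def features(a, b):
    """Hypothetical feature set generated for a pair of labels."""
    sa, sb = shingles(a), shingles(b)
    return {
        "jaccard": len(sa & sb) / len(sa | sb),
        "length_gap": abs(len(a) - len(b)),
    }


def score_pair(feature_set):
    """Placeholder for the trained model: returns a similarity degree."""
    return feature_set["jaccard"]


def recommend(input_label, index, top_k=3):
    """Query the index, score each (input, candidate) pair, return the best."""
    candidates = index.query(input_label) - {input_label}
    scored = [(score_pair(features(input_label, c)), c) for c in candidates]
    return [label for _, label in sorted(scored, reverse=True)[:top_k]]


candidate_labels = ["machine learning", "machine-learning", "deep learning", "gardening"]
index = BandedLSHIndex(candidate_labels)
print(recommend("machine learning", index))
```

In this sketch the index only narrows the candidate set. The ranking comes entirely from `score_pair`, which in practice would be replaced by the supervised model's prediction over the pair's feature set.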