Patent attributes
Various embodiments of a method and system for determining sets of variant items are described. In some embodiments, a system is configured to generate multiple item pairs, each consisting of a particular item and another item determined to be similar to that item. Each item pair may include, for each of its two items, a respective sequence of text strings (e.g., a title). For each item pair, the system may perform a text alignment of the two sequences and determine one or more misalignments between them. The system may also assign each item pair a similarity score that depends on the misalignment(s) determined for that pair. Based on the aligned item pairs and their assigned similarity scores, the system may generate an indication specifying that the items in a set are variants of one another.
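The pipeline described above (align a pair of titles, find the misalignments, score the pair, then group items into variant sets) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the described embodiments: titles are split into whitespace-delimited tokens, the alignment uses Python's standard `difflib.SequenceMatcher`, the score is the fraction of tokens that align, the 0.7 threshold is arbitrary, and the function names `misalignments`, `similarity_score`, and `variant_sets` are hypothetical.

```python
from difflib import SequenceMatcher


def _tokens(title):
    # Assumption: a title is treated as a sequence of lowercase word tokens.
    return title.lower().split()


def misalignments(title_a, title_b):
    """Align two titles token by token and return the spans that differ."""
    a, b = _tokens(title_a), _tokens(title_b)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def similarity_score(title_a, title_b):
    """Score in [0, 1]: the share of tokens that are not misaligned."""
    total = len(_tokens(title_a)) + len(_tokens(title_b))
    if total == 0:
        return 1.0
    mismatched = sum(len(x) + len(y) for x, y in misalignments(title_a, title_b))
    return 1.0 - mismatched / total


def variant_sets(titles, pairs, threshold=0.7):
    """Merge items linked by a high-scoring pair into sets (union-find)."""
    parent = {item: item for item in titles}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        if similarity_score(titles[a], titles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for item in titles:
        groups.setdefault(find(item), set()).add(item)
    # Only sets with more than one member indicate variants.
    return [group for group in groups.values() if len(group) > 1]
```

In this sketch, two titles that differ only in a single attribute token such as a color or size produce one short misalignment and a high score, so the corresponding items end up in the same set, while a pair with no shared tokens scores 0 and is left ungrouped. Grouping by connected components is one possible design choice; it lets items A and C be reported as variants when A matches B and B matches C, even if the pair (A, C) was never generated.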