Patent attributes
In one aspect, a method for similarity sharding of datatype items is provided. The method includes parsing a datatype item into one or more tokens and extracting at least one selected token from the parsed datatype item, the at least one selected token comprising a character string of one or more characters. The method further includes standardizing the character string of the at least one selected token, extracting a first character from the standardized character string, and assigning the datatype item to a select shard of a plurality of shards via a character distribution lookup based on the extracted first character.
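By way of illustration only, the operations recited above (parse, extract a selected token, standardize, extract the first character, look up the shard) might be sketched as follows. Everything specific in this sketch is an assumption and is not taken from the disclosure: the choice of the last token as the "selected token", the accent-stripping and case-folding used for standardization, the four-shard layout, and the contents of the hypothetical `CHAR_TO_SHARD` table, which stands in for a character distribution lookup.

```python
import re
import unicodedata

# Hypothetical character distribution lookup: each shard owns a range of
# first characters. A real table would presumably be derived from observed
# first-character frequencies so that shards stay balanced; these ranges
# are made up for illustration.
SHARD_RANGES = ["abc", "defghijkl", "mnopqr", "stuvwxyz"]
CHAR_TO_SHARD = {ch: shard for shard, chars in enumerate(SHARD_RANGES) for ch in chars}
DEFAULT_SHARD = len(SHARD_RANGES) - 1  # assumed fallback for unlisted characters


def parse(item: str) -> list[str]:
    """Parse a datatype item into tokens (here: runs of word characters)."""
    return re.findall(r"\w+", item)


def select_token(tokens: list[str]) -> str:
    """Pick the selected token. Assumption: the last token (e.g. a surname)."""
    return tokens[-1]


def standardize(token: str) -> str:
    """Standardize the character string: strip accents, then case-fold."""
    decomposed = unicodedata.normalize("NFKD", token)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_marks.casefold()


def assign_shard(item: str) -> int:
    """Assign a datatype item to a shard based on its standardized first character."""
    tokens = parse(item)
    if not tokens:
        return DEFAULT_SHARD
    standardized = standardize(select_token(tokens))
    if not standardized:
        return DEFAULT_SHARD
    first_char = standardized[0]
    return CHAR_TO_SHARD.get(first_char, DEFAULT_SHARD)
```

Under these assumptions, `assign_shard("John Smith")` and `assign_shard("jane SMITH")` resolve to the same shard, because both items standardize to a selected token beginning with `s`. Items whose selected tokens begin with the same character are thus co-located, which is the sense in which the sharding is similarity-based.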