Patent attributes
First and second sequenced outputs are accessed. The sequenced outputs contain variants occurring at different carriers and at different carrier positions. Hashes are generated over a selected pattern length of positions for those carrier positions that are shared between the sequenced outputs to produce window hashes for base patterns in first and second sequences. Each sequence is based on the shared carrier positions and the respective sequenced output. The window hashes are non-unique. Window hashes that occur fewer than a ceiling number of times are selected. The selected window hashes are compared between the sequences on a starting position basis such that selected window hashes for base patterns having the same start positions in the sequenced outputs are compared. Common window hashes are identified between the sequences based on the comparing. A similarity measure is determined between the sequences based on the common window hashes.
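
The steps described above can be illustrated with a minimal Python sketch. It is not the claimed implementation, only one possible reading under stated assumptions: each sequenced output is modeled as a dictionary mapping a hypothetical (carrier, position) key to a single base, CRC32 stands in for the unspecified hash function, and the similarity measure is assumed to be the fraction of selected window hashes that match at the same start position. The function names and the default values of `k` and `ceiling` are likewise illustrative.

```python
from collections import Counter
import zlib


def shared_positions(out_a, out_b):
    """Return the sorted (carrier, position) keys present in both outputs."""
    return sorted(out_a.keys() & out_b.keys())


def window_hashes(seq, k):
    """Hash every window of length k; the list index is the start position.

    CRC32 is an assumed stand-in for the hash function, which the text
    does not name.
    """
    return [zlib.crc32(seq[i:i + k].encode()) for i in range(len(seq) - k + 1)]


def select_rare(hashes, ceiling):
    """Keep only hashes that occur fewer than `ceiling` times.

    Returns a mapping of start position -> hash for the selected windows.
    """
    counts = Counter(hashes)
    return {i: h for i, h in enumerate(hashes) if counts[h] < ceiling}


def similarity(out_a, out_b, k=3, ceiling=2):
    """Compare two sequenced outputs via rare window hashes.

    Returns (score, common_start_positions). The score definition
    (matches divided by the larger selected set) is an assumption.
    """
    keys = shared_positions(out_a, out_b)
    # Each sequence is built from the shared positions and one output.
    seq_a = "".join(out_a[key] for key in keys)
    seq_b = "".join(out_b[key] for key in keys)
    rare_a = select_rare(window_hashes(seq_a, k), ceiling)
    rare_b = select_rare(window_hashes(seq_b, k), ceiling)
    # Compare selected hashes only where the start positions agree.
    common = [i for i, h in rare_a.items() if rare_b.get(i) == h]
    total = max(len(rare_a), len(rare_b))
    score = len(common) / total if total else 0.0
    return score, common


# Hypothetical example data: carrier labels and bases are made up.
first = {("c1", 10): "A", ("c1", 20): "C", ("c1", 30): "G",
         ("c2", 5): "T", ("c2", 9): "A", ("c3", 1): "G"}
second = {("c1", 10): "A", ("c1", 20): "C", ("c1", 30): "G",
          ("c2", 5): "T", ("c2", 9): "C", ("c4", 2): "T"}

score, common = similarity(first, second, k=3, ceiling=2)
print(score, common)
```

In this example only five positions are shared, so the compared sequences are `ACGTA` and `ACGTC`; the windows starting at positions 0 and 1 match and the third does not. Comparing by start position means the cost grows with the number of selected windows rather than with all pairs of windows, which appears to be the motivation for that step.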