Aspects of the present disclosure relate to data deduplication (dedup) techniques for storage arrays. At least one input/output (IO) operation in an IO workload received by a storage array can be identified. Each IO operation can relate to a data track of the storage array. A probability of the at least one IO operation being similar to a previously stored IO operation can be determined. A dedup operation can be performed on the at least one IO operation based on the probability. The probability can be less than one hundred percent (100%).
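
As an illustration only, the probability-based dedup decision described above can be sketched as follows. The disclosure does not specify how the probability is computed, so this sketch assumes a Jaccard similarity over hashed fixed-size chunks as a stand-in estimate. The names `CHUNK_SIZE`, `DEDUP_THRESHOLD`, `chunk_hashes`, `similarity_probability`, and `TrackStore` are hypothetical and do not come from the disclosure.

```python
import hashlib

CHUNK_SIZE = 8          # hypothetical chunk size in bytes (assumption)
DEDUP_THRESHOLD = 0.75  # hypothetical cutoff, deliberately below 1.0 (assumption)


def chunk_hashes(data: bytes) -> frozenset:
    """Return the set of SHA-256 digests of fixed-size chunks of data."""
    return frozenset(
        hashlib.sha256(data[i:i + CHUNK_SIZE]).digest()
        for i in range(0, len(data), CHUNK_SIZE)
    )


def similarity_probability(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity of two chunk-hash sets, used as a stand-in estimate."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class TrackStore:
    """Toy model of a storage array that keeps one chunk-hash set per track."""

    def __init__(self):
        self.tracks = {}  # track_id -> chunk-hash set of stored data
        self.refs = {}    # track_id -> track_id it was deduplicated against

    def write(self, track_id, data):
        """Handle a write IO and return (action, best probability found)."""
        hashes = chunk_hashes(data)
        best_id, best_p = None, 0.0
        for other_id, other in self.tracks.items():
            p = similarity_probability(hashes, other)
            if p > best_p:
                best_id, best_p = other_id, p
        if best_id is not None and best_p >= DEDUP_THRESHOLD:
            # A probability below 100% can still trigger dedup here. A real
            # array would also need to keep the differing data; this sketch
            # records only the reference.
            self.refs[track_id] = best_id
            return "dedup", best_p
        self.tracks[track_id] = hashes
        return "store", best_p
```

Under these assumptions, a write whose data shares seven of eight chunks with a stored track yields an estimate of 7/9 (about 78%), which is below 100% but above the assumed threshold, so the write is deduplicated against the stored track.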