Patent attributes
A data deduplication process for storage based on collision-resistant hash digests is disclosed. The process accesses a first data message and a second data message from a data storage appliance and compares the hash digests of the two data messages. If the hash digests match, the process determines whether the first and second data messages are the same message or whether the match is a collision between the compared hash digests. To do so, it forms additional hash digests by hashing the first and second data messages differently. If the additional hash digests do not match, the first and second data messages are different. If the additional hash digests also match, the first and second data messages are the same message.
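
The two-stage comparison described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not name the hash functions or say how the messages are "hashed differently", so the use of SHA-256 for the first digest and BLAKE2b for the additional digest is an assumption, and the function names (`primary_digest`, `secondary_digest`, `weak_primary`, `is_duplicate`) are hypothetical. `weak_primary` is a deliberately poor hash, included only so that a first-stage collision can be demonstrated.

```python
import hashlib


def primary_digest(message: bytes) -> bytes:
    # First-stage digest. SHA-256 is an assumed choice, not taken from the source.
    return hashlib.sha256(message).digest()


def secondary_digest(message: bytes) -> bytes:
    # "Hashing differently": here, a different algorithm (BLAKE2b).
    # This is an assumption; a salted or keyed hash would be another option.
    return hashlib.blake2b(message).digest()


def weak_primary(message: bytes) -> bytes:
    # Hypothetical, intentionally weak hash (message length modulo 256),
    # used only to force a first-stage collision for demonstration.
    return bytes([len(message) % 256])


def is_duplicate(first: bytes, second: bytes,
                 primary=primary_digest, secondary=secondary_digest) -> bool:
    # Stage 1: differing first-stage digests mean the messages differ.
    if primary(first) != primary(second):
        return False
    # Stage 2: the first-stage digests match, so form additional digests.
    # A mismatch here means the first-stage match was a collision.
    return secondary(first) == secondary(second)
```

With `weak_primary` as the first-stage hash, `b"abc"` and `b"abd"` produce the same first-stage digest because they have the same length, yet `is_duplicate(b"abc", b"abd", primary=weak_primary)` reports them as different because the additional digests disagree.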