US Patent 8086799 Scalable deduplication of stored data

In a method and apparatus for scalable deduplication, a data set is partitioned into multiple logical partitions, where each partition can be deduplicated independently. Each data block of the data set is assigned to exactly one partition, so that any two or more data blocks that are duplicates of each other are always assigned to the same logical partition. A hash algorithm generates a fingerprint of each data block in the volume, and the fingerprints are subsequently used to detect possible duplicate data blocks as part of deduplication. In addition, the fingerprints are used to ensure that duplicate data blocks are sent to the same logical partition prior to deduplication. A portion of the fingerprint of each data block is used as a partition identifier to determine the partition to which the data block should be assigned. Once blocks are assigned to partitions, each partition can be deduplicated independently.
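The partitioning idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the choice of SHA-256 as the hash, the use of the leading 4 bits of the fingerprint as the partition identifier, and all function names (`fingerprint`, `partition_id`, `assign_partitions`, `dedupe_partition`) are assumptions made for illustration. Because a matching fingerprint only indicates a possible duplicate, the sketch confirms each match with a byte-for-byte comparison.

```python
import hashlib
from collections import defaultdict


def fingerprint(block: bytes) -> bytes:
    """Fingerprint a data block (SHA-256 is an assumed choice of hash)."""
    return hashlib.sha256(block).digest()


def partition_id(fp: bytes, num_bits: int = 4) -> int:
    """Use the leading num_bits of the fingerprint as the partition identifier."""
    if not 1 <= num_bits <= 16:
        raise ValueError("num_bits must be between 1 and 16")
    return int.from_bytes(fp[:2], "big") >> (16 - num_bits)


def assign_partitions(blocks, num_bits: int = 4):
    """Map each partition identifier to a list of (fingerprint, block index)."""
    partitions = defaultdict(list)
    for index, block in enumerate(blocks):
        fp = fingerprint(block)
        # Identical blocks have identical fingerprints, so they always
        # land in the same partition.
        partitions[partition_id(fp, num_bits)].append((fp, index))
    return partitions


def dedupe_partition(entries, blocks):
    """Deduplicate one partition; return {duplicate index: retained index}."""
    seen = {}        # fingerprint -> index of the first block seen with it
    duplicates = {}
    for fp, index in entries:
        first = seen.get(fp)
        # A fingerprint match is only a candidate; compare bytes to confirm.
        if first is not None and blocks[first] == blocks[index]:
            duplicates[index] = first
        else:
            seen.setdefault(fp, index)
    return duplicates


blocks = [b"a" * 4096, b"b" * 4096, b"a" * 4096, b"c" * 4096, b"b" * 4096]
partitions = assign_partitions(blocks)

all_duplicates = {}
for entries in partitions.values():
    # Each partition is processed without reference to any other partition.
    all_duplicates.update(dedupe_partition(entries, blocks))
```

Since no partition depends on another, the loop over `partitions.values()` could in principle be distributed across separate workers or run at different times, which is the scalability property the abstract describes.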

