Patent attributes
A deduplication index is generated having multiple entries, each entry storing a digest of a data block that was previously stored in non-volatile data storage, together with a pointer to the location in non-volatile data storage at which that data block was stored. The entries of the deduplication index are divided into multiple deduplication index segments. A resident subset of the deduplication index segments is stored in memory of the data storage system, and a non-resident subset of the deduplication index segments is stored in non-volatile data storage of the data storage system. Data deduplication is performed for each subsequently received data block whose generated digest matches a digest in any entry of the deduplication index segments contained in the resident subset.
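
The arrangement described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The class name `SegmentedDedupIndex`, the choice of SHA-256 as the digest, the rule that assigns a digest to a segment by its first byte, and the use of in-process dictionaries and a list to stand in for memory and non-volatile storage are all assumptions made for illustration only.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


class SegmentedDedupIndex:
    """Toy model of a deduplication index split into segments (hypothetical)."""

    def __init__(self, num_segments: int = 4, resident_ids: Iterable[int] = (0, 1)):
        self.num_segments = num_segments
        self.resident_ids = set(resident_ids)
        # Segments held in memory: segment id -> {digest: block location}.
        self.resident: Dict[int, Dict[bytes, int]] = {}
        # Stand-in for segments kept in non-volatile storage (never consulted on write).
        self.non_resident: Dict[int, Dict[bytes, int]] = {}
        # Stand-in for non-volatile block storage; a location is a list offset.
        self.blocks: List[bytes] = []

    def segment_of(self, digest: bytes) -> int:
        # Assumed partitioning rule: first digest byte modulo the segment count.
        return digest[0] % self.num_segments

    def write(self, block: bytes) -> Tuple[int, bool]:
        """Store a block; return (location, deduplicated)."""
        digest = hashlib.sha256(block).digest()
        seg = self.segment_of(digest)

        # Only resident segments are searched for a matching digest.
        if seg in self.resident_ids:
            location = self.resident.get(seg, {}).get(digest)
            if location is not None:
                return location, True  # duplicate: reuse the stored block

        # No resident match: store the block and record a new index entry.
        location = len(self.blocks)
        self.blocks.append(block)
        target = self.resident if seg in self.resident_ids else self.non_resident
        target.setdefault(seg, {})[digest] = location
        return location, False
```

In this sketch, a repeated block whose digest falls in a resident segment is deduplicated and returns the original location, while a repeated block whose digest falls in a non-resident segment is stored again, because the non-resident segments are not consulted during the write.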