Patent attributes
A system and method for data deduplication are presented. Data received from one or more computing systems is deduplicated, and the results of the deduplication process are stored in a reference table. A representative subset of the reference table is shared among a plurality of computing systems that utilize the data deduplication repository. The computing systems can use this representative subset to deduplicate data locally before it is sent to the repository for storage. Likewise, the subset can be used to allow deduplicated data to be returned from the repository to the computing systems. In some cases, the representative subset can be a proper subset, wherein a portion of the reference table is identified and shared among the computing systems to reduce bandwidth requirements for reference-table synchronization.
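
The arrangement described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. Every name in it (`Repository`, `Client`, `chunk_id`, `representative_subset`) is hypothetical, and several details are assumptions made only for illustration: fixed-size chunks, SHA-256 digests as reference-table keys, and selection of the representative subset by how often each entry has been referenced.

```python
import hashlib
from collections import Counter


def chunk_id(chunk: bytes) -> str:
    """Digest used as the reference-table key (SHA-256 is an assumption)."""
    return hashlib.sha256(chunk).hexdigest()


class Repository:
    """Hypothetical deduplication repository holding the full reference table."""

    def __init__(self):
        self.reference_table = {}  # digest -> chunk bytes
        self.hits = Counter()      # digest -> number of times referenced

    def store(self, digest, chunk=None):
        """Record a reference; the chunk body is needed only for a new digest."""
        if digest not in self.reference_table:
            if chunk is None:
                raise KeyError("unknown digest sent without chunk data")
            self.reference_table[digest] = chunk
        self.hits[digest] += 1

    def representative_subset(self, size):
        """Return the `size` most-referenced digests (assumed selection policy)."""
        return {digest for digest, _ in self.hits.most_common(size)}

    def restore(self, digests):
        """Rebuild the original data from an ordered list of digests."""
        return b"".join(self.reference_table[d] for d in digests)


class Client:
    """Hypothetical computing system that deduplicates locally before sending."""

    def __init__(self, repo, subset_size):
        self.repo = repo
        self.subset_size = subset_size
        self.known = set()   # local copy of the representative subset
        self.bytes_sent = 0  # chunk bytes actually transmitted

    def sync(self):
        """Fetch only a subset of the reference table, not the whole table."""
        self.known = self.repo.representative_subset(self.subset_size)

    def send(self, data, chunk_size=4):
        """Send data, transmitting chunk bodies only for digests not known locally."""
        digests = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = chunk_id(chunk)
            if digest in self.known:
                self.repo.store(digest)         # reference only, no chunk body
            else:
                self.repo.store(digest, chunk)  # repository may still deduplicate
                self.bytes_sent += len(chunk)
            digests.append(digest)
        return digests
```

In this sketch, a client that calls `sync()` before `send()` transmits chunk bodies only for digests missing from its local subset, and `restore()` shows how deduplicated data can be returned. Keeping `subset_size` smaller than the full table corresponds to the proper-subset case, which trades some local deduplication hits for less synchronization traffic.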