Patent attributes
A method, system, and computer-readable program storage device for performing data deduplication. In an embodiment, the method comprises receiving input data for storage in a data storage. The input data comprises a multitude of data blocks, and the data blocks are accessed at different times in the data storage by a given application. The method further comprises selecting, by a processor device, one or more of the data blocks for data deduplication based on when the data blocks are accessed by the given application. In an embodiment, selecting the data blocks for data deduplication includes selecting data blocks so as to obtain a target deduplication ratio. In an embodiment, selecting the data blocks for data deduplication includes selecting data blocks that are accessed later by the given application in preference to data blocks that are accessed earlier by the given application.
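The selection described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The `Block` type, its `access_time` field, the function `select_for_dedup`, the use of SHA-256 to detect identical blocks, and the definition of the deduplication ratio as logical bytes divided by stored bytes are all assumptions made for illustration. The sketch keeps the earliest-accessed copy of each distinct block in storage and selects later-accessed duplicates first, stopping once the target ratio is reached.

```python
from dataclasses import dataclass
import hashlib


@dataclass(frozen=True)
class Block:
    block_id: int
    data: bytes
    access_time: float  # hypothetical: when the application accesses this block


def select_for_dedup(blocks, target_ratio):
    """Return (selected block ids, achieved ratio).

    Assumption: ratio = logical bytes / bytes that remain physically stored.
    """
    logical = sum(len(b.data) for b in blocks)
    if logical == 0:
        return [], 1.0
    physical = logical

    # The earliest-accessed copy of each distinct content stays stored in full.
    keeper = {}
    for b in sorted(blocks, key=lambda b: b.access_time):
        keeper.setdefault(hashlib.sha256(b.data).digest(), b.block_id)

    selected = []
    # Visit later-accessed blocks first, so they are deduplicated first.
    for b in sorted(blocks, key=lambda b: b.access_time, reverse=True):
        if logical / physical >= target_ratio:
            break  # target deduplication ratio reached
        digest = hashlib.sha256(b.data).digest()
        if keeper[digest] != b.block_id:
            selected.append(b.block_id)  # replace with a reference to the keeper
            physical -= len(b.data)
    return selected, logical / physical


blocks = [
    Block(0, b"A" * 4, 1.0),
    Block(1, b"B" * 4, 2.0),
    Block(2, b"A" * 4, 3.0),
    Block(3, b"A" * 4, 4.0),
    Block(4, b"B" * 4, 5.0),
]
selected, ratio = select_for_dedup(blocks, target_ratio=1.5)
print(selected, round(ratio, 3))
```

In this sketch the two latest-accessed duplicates (blocks 4 and 3) are selected, after which the ratio exceeds 1.5 and block 2 is left stored in full. Leaving earlier-accessed blocks undeduplicated is one plausible way to keep reads of those blocks free of reference lookups; the abstract itself does not state the motivation.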