Patent attributes
A storage manager provides data privacy, while preserving the benefits provided by existing hash based storage systems. Each file is assigned a unique identifying code. Hashes of the content-derived chunks of the file are calculated based on the content of the chunk and the code identifying the file. When a request to store a chunk of data is received, it is determined whether a chunk associated with the hash has already been stored. Because hashes are based on privacy-preserving codes as well as content, chunks of duplicate copies of a file need not be stored multiple times, and yet privacy is preserved for content at a file level. In other embodiments, hashes indicating whether a given file is public and/or indicating the identity of the requesting user are also sent with storage requests. These additional hashes enable more robust transmission and storage efficiency, while still preserving privacy.