Patent 8442956 was granted by the United States Patent and Trademark Office in May 2013 and assigned to Wells Fargo Capital Finance.
Example apparatus, methods, and computers perform sampling-based data de-duplication. One example method controls a data de-duplication computer to compute a sampling sequence for a sub-block of data and to use the sampling sequence to locate a stored sub-block known to the data de-duplication computer. Upon finding a stored sub-block to compare against, the method includes controlling the data de-duplication computer to determine a degree of similarity (e.g., duplicate, very similar, somewhat similar, very dissimilar, completely dissimilar, x% similar) between the sub-block and the stored sub-block, and to control whether and how the sub-block is stored and/or transmitted based on that degree of similarity. The degree of similarity can also control whether and how the data de-duplication computer updates the dedupe data structure(s) that store information for finding groups of sub-blocks related by similarity sampling sequence.
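The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything named here is an assumption made for illustration: the fixed byte offsets in SAMPLE_POSITIONS, the byte-matching similarity measure, the 0.8 threshold, and the helper and class names (sampling_sequence, similarity, make_delta, apply_delta, DedupeStore). The sketch samples a sub-block to form a lookup key, finds a stored candidate, scores the similarity, and then decides whether to reference, delta-encode, or store the sub-block, and whether to update the index.

```python
from dataclasses import dataclass, field

# Hypothetical fixed byte offsets used to sample each sub-block.
SAMPLE_POSITIONS = (0, 7, 19, 42, 63)


def sampling_sequence(block: bytes) -> bytes:
    """Build a lookup key from bytes at fixed offsets (wrapping on short blocks)."""
    if not block:
        return b""
    return bytes(block[p % len(block)] for p in SAMPLE_POSITIONS)


def similarity(a: bytes, b: bytes) -> float:
    """Fraction of matching byte positions, relative to the longer block."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / longest


def make_delta(block: bytes, stored: bytes) -> dict:
    """Record the new length plus every byte that differs from the stored block."""
    patches = [
        (i, value)
        for i, value in enumerate(block)
        if i >= len(stored) or stored[i] != value
    ]
    return {"length": len(block), "patches": patches}


def apply_delta(stored: bytes, delta: dict) -> bytes:
    """Rebuild a sub-block from its stored neighbour and a delta."""
    out = bytearray(stored[: delta["length"]].ljust(delta["length"], b"\0"))
    for i, value in delta["patches"]:
        out[i] = value
    return bytes(out)


@dataclass
class DedupeStore:
    similar_threshold: float = 0.8               # assumed cut-off for "similar"
    index: dict = field(default_factory=dict)    # sampling sequence -> block id
    blocks: list = field(default_factory=list)   # fully stored sub-blocks

    def ingest(self, block: bytes):
        """Return (decision, block_id, delta) for one incoming sub-block."""
        key = sampling_sequence(block)
        candidate_id = self.index.get(key)
        if candidate_id is not None:
            stored = self.blocks[candidate_id]
            if block == stored:
                # Duplicate: keep only a reference to the stored sub-block.
                return ("duplicate", candidate_id, None)
            if similarity(block, stored) >= self.similar_threshold:
                # Similar: keep a reference plus the differing bytes.
                return ("similar", candidate_id, make_delta(block, stored))
        # Dissimilar or unknown: store the sub-block in full.
        self.blocks.append(block)
        new_id = len(self.blocks) - 1
        # Only index the block if its sampling sequence is not already taken.
        self.index.setdefault(key, new_id)
        return ("unique", new_id, None)
```

In this sketch a duplicate or similar sub-block leaves the index untouched, while a new sub-block is indexed only when its sampling sequence is not already present. That is one possible reading of how the degree of similarity could control updates to the dedupe data structure, chosen here for brevity.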