Patent attributes
Apparatuses, systems and methods are disclosed herein that generally relate to distributed storage, such as for big data, distributed databases, large datasets, artificial intelligence, genomics, or any other data processing environment that hosts large data sets or uses big data hosts, whether with local storage or with storage located remotely over a network. More particularly, since large-scale data requires many storage devices, scrubbing storage for reliability and accuracy consumes communication bandwidth and processor resources. Discussed are various ways to use known storage structure, such as logical block addressing (LBA), to offload scrubbing overhead to storage by having the storage engage in autonomous self-validation. Storage may scrub itself, identify stored data that fails data integrity validation or storage locations that are unreadable, and report the errors to a distributed storage system, which may perform a reverse lookup on the affected storage location to identify, for example, a data block at that location needing correction.
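The flow described above (device-side scrub, error report, reverse lookup) might be sketched as follows. This is only an illustrative model under stated assumptions, not the disclosed implementation: the names SimulatedDevice, DistributedStore, and ScrubError are hypothetical, CRC32 stands in for whatever integrity check the storage actually uses, and the reverse map is assumed to be keyed by device identifier and LBA.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrubError:
    """One error reported by a device after a self-scrub (hypothetical format)."""
    device_id: str
    lba: int
    kind: str  # "checksum_mismatch" or "unreadable"


class SimulatedDevice:
    """Toy storage device that validates its own contents, LBA by LBA."""

    def __init__(self, device_id: str) -> None:
        self.device_id = device_id
        self._sectors: dict[int, tuple[bytes, int]] = {}  # lba -> (payload, stored CRC32)
        self._unreadable: set[int] = set()

    def write(self, lba: int, payload: bytes) -> None:
        # Assumption: an integrity checksum is stored alongside the data.
        self._sectors[lba] = (payload, zlib.crc32(payload))

    def corrupt(self, lba: int, payload: bytes) -> None:
        # Test helper: change the payload but keep the old checksum.
        _, stored_crc = self._sectors[lba]
        self._sectors[lba] = (payload, stored_crc)

    def mark_unreadable(self, lba: int) -> None:
        # Test helper: simulate a location that can no longer be read.
        self._unreadable.add(lba)

    def scrub(self) -> list[ScrubError]:
        """Walk every written LBA locally; no host bandwidth or CPU is used."""
        errors = []
        for lba in sorted(self._sectors):
            if lba in self._unreadable:
                errors.append(ScrubError(self.device_id, lba, "unreadable"))
                continue
            payload, stored_crc = self._sectors[lba]
            if zlib.crc32(payload) != stored_crc:
                errors.append(ScrubError(self.device_id, lba, "checksum_mismatch"))
        return errors


class DistributedStore:
    """Toy distributed storage layer that knows which block lives at which LBA."""

    def __init__(self) -> None:
        self._location_to_block: dict[tuple[str, int], str] = {}

    def place(self, device: SimulatedDevice, lba: int, block_id: str, payload: bytes) -> None:
        device.write(lba, payload)
        self._location_to_block[(device.device_id, lba)] = block_id

    def blocks_needing_repair(self, errors: list[ScrubError]) -> list[str]:
        """Reverse-lookup each reported location to the block stored there."""
        return [
            self._location_to_block[(e.device_id, e.lba)]
            for e in errors
            if (e.device_id, e.lba) in self._location_to_block
        ]


if __name__ == "__main__":
    dev = SimulatedDevice("dev0")
    store = DistributedStore()
    store.place(dev, 10, "block-a", b"alpha")
    store.place(dev, 11, "block-b", b"beta")
    dev.corrupt(11, b"bEta")
    print(store.blocks_needing_repair(dev.scrub()))
```

In this sketch only the short error list crosses from the device to the distributed layer, which is the point of offloading the scrub: the bulk reads and checksum work stay on the storage side.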