Patent attributes
A bulk data distribution system in which, instead of multiple data consumers contending to access the same data sets from a primary data store, one or more snapshot producers capture snapshots (states of the data sets at particular points in time) and upload the snapshots to an intermediate data store for access by snapshot consumers. The snapshot consumers may download the snapshots to generate local versions of the data sets for access by one or more data processing applications or processes. A snapshot producer may periodically generate full snapshots of a data set, and may generate one or more incremental snapshots of the data set between full snapshots. A snapshot consumer may bootstrap a local data set from a full snapshot and one or more incremental snapshots, and may maintain the state of the local data set by accessing new snapshots uploaded by the producer.
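
The producer and consumer roles described above can be illustrated with a minimal in-memory sketch. It is not the claimed implementation. Every name in it (`Snapshot`, `IntermediateStore`, `Producer`, `Consumer`, `full_every`) is hypothetical, the data set is assumed to be a simple key-value mapping, and the intermediate data store is assumed to be an in-process list rather than a remote service.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Snapshot:
    version: int                 # monotonically increasing, assigned by the producer
    kind: str                    # "full" or "incremental"
    upserts: Dict[str, str]      # full: entire data set; incremental: changed keys
    deletes: FrozenSet[str] = frozenset()  # incremental only: removed keys


class IntermediateStore:
    """Hypothetical stand-in for the intermediate data store."""

    def __init__(self) -> None:
        self._snapshots: List[Snapshot] = []

    def upload(self, snap: Snapshot) -> None:
        self._snapshots.append(snap)

    def latest_full(self) -> Optional[Snapshot]:
        fulls = [s for s in self._snapshots if s.kind == "full"]
        return max(fulls, key=lambda s: s.version) if fulls else None

    def snapshots_after(self, version: int) -> List[Snapshot]:
        newer = [s for s in self._snapshots if s.version > version]
        return sorted(newer, key=lambda s: s.version)


class Producer:
    """Captures the primary data set and uploads full or incremental snapshots."""

    def __init__(self, primary: Dict[str, str], store: IntermediateStore,
                 full_every: int = 3) -> None:
        self.primary = primary
        self.store = store
        self.full_every = full_every   # assumed policy: every Nth snapshot is full
        self.version = 0
        self._last: Dict[str, str] = {}

    def publish(self) -> Snapshot:
        self.version += 1
        current = dict(self.primary)   # point-in-time copy of the data set
        if (self.version - 1) % self.full_every == 0:
            snap = Snapshot(self.version, "full", current)
        else:
            # Keys that are new or whose value changed since the last snapshot.
            upserts = {k: v for k, v in current.items()
                       if k not in self._last or self._last[k] != v}
            deletes = frozenset(self._last.keys() - current.keys())
            snap = Snapshot(self.version, "incremental", upserts, deletes)
        self.store.upload(snap)
        self._last = current
        return snap


class Consumer:
    """Builds and maintains a local version of the data set from snapshots."""

    def __init__(self, store: IntermediateStore) -> None:
        self.store = store
        self.local: Dict[str, str] = {}
        self.version = 0

    def bootstrap(self) -> None:
        full = self.store.latest_full()
        if full is None:
            raise LookupError("no full snapshot has been uploaded yet")
        self._apply(full)
        self.refresh()                 # then apply any later snapshots

    def refresh(self) -> None:
        for snap in self.store.snapshots_after(self.version):
            self._apply(snap)

    def _apply(self, snap: Snapshot) -> None:
        if snap.kind == "full":
            self.local = dict(snap.upserts)    # a full snapshot replaces local state
        else:
            self.local.update(snap.upserts)
            for key in snap.deletes:
                self.local.pop(key, None)
        self.version = snap.version
```

In this sketch a consumer never reads the primary data set directly: it calls `bootstrap()` once, then `refresh()` whenever it wants to catch up. Applying snapshots strictly in version order, with a full snapshot replacing local state, keeps the local copy consistent even when a full snapshot falls between two incremental ones.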