Patent attributes
An apparatus in one embodiment comprises at least one processing device having a processor coupled to a memory. The processing device is configured to distribute in-memory computations across a plurality of data processing clusters associated with respective data zones, and to combine local processing results of the distributed in-memory computations from the data processing clusters. The distributed in-memory computations utilize local data structures of respective ones of the data processing clusters. A given one of the local data structures in one of the data processing clusters receives local data of the corresponding data zone and is utilized to generate the local processing results of that data processing cluster that are combined with local processing results of other ones of the data processing clusters. The local data structures are configured to support batch mode extensions such as Spark SQL, Spark MLlib or Spark GraphX for performance of the distributed in-memory computations.
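The distribute-then-combine flow described above can be illustrated with a minimal sketch. It is not the claimed implementation: worker threads stand in for the data processing clusters, plain Python lists stand in for each data zone's local data, and all names (`LocalDataStructure`, `run_cluster`, `combine`, `distributed_mean`) are hypothetical. No Spark API is used; the sketch only shows that each cluster computes a result from its own zone's data and that only those local results are combined.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LocalDataStructure:
    """Hypothetical stand-in for one cluster's in-memory local data structure."""
    values: List[float] = field(default_factory=list)

    def receive(self, local_data: List[float]) -> None:
        # Local data of the corresponding data zone is loaded into memory.
        self.values.extend(local_data)

    def local_result(self) -> Dict[str, float]:
        # Only a small summary leaves the cluster, not the raw zone data.
        return {"count": len(self.values), "sum": sum(self.values)}


def run_cluster(zone_data: List[float]) -> Dict[str, float]:
    """Simulate the in-memory computation performed inside one cluster."""
    structure = LocalDataStructure()
    structure.receive(zone_data)
    return structure.local_result()


def combine(local_results: List[Dict[str, float]]) -> Dict[str, float]:
    """Combine the local processing results from all clusters."""
    total_count = sum(r["count"] for r in local_results)
    total_sum = sum(r["sum"] for r in local_results)
    mean = total_sum / total_count if total_count else 0.0
    return {"count": total_count, "sum": total_sum, "mean": mean}


def distributed_mean(zones: Dict[str, List[float]]) -> Dict[str, float]:
    """Distribute the computation across zones, then combine the results."""
    # Assumption: threads model the clusters; a real system would dispatch
    # work to remote clusters located in their respective data zones.
    with ThreadPoolExecutor() as pool:
        local_results = list(pool.map(run_cluster, zones.values()))
    return combine(local_results)


if __name__ == "__main__":
    zones = {"zone_a": [1.0, 2.0, 3.0], "zone_b": [10.0], "zone_c": [4.0, 4.0]}
    print(distributed_mean(zones))
```

A global mean is used here because it decomposes cleanly into per-zone counts and sums; a computation that does not decompose this way would need a different combining step.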