An apparatus comprises a plurality of processor cores, each comprising a computation unit and a memory. The apparatus further comprises an interconnection network to transmit data among the processor cores. At least some of the memories are configured as a cache for memory external to the processor cores, and at least some of the processor cores are configured to transmit a message over the interconnection network to access a cache of another processor core.