Deep reinforcement learning (DRL) is applied to self-orchestration in edge device computing, where tasks are offloaded within a spatial network community to reduce latency and bandwidth issues. In addition to DRL-based offloading, a revised online policy gradient training algorithm based on importance sampling allows the original sample training data to remain in use as training proceeds. A request-for-help scheme supports cooperation among neighboring edge devices of the spatial network community by sharing edge device state information (EDSI), which governs task assignments.
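
As a rough illustration of how importance sampling lets a policy gradient update keep using earlier samples, the sketch below reweights each stored sample by the ratio of the current policy's action probability to the probability under the policy that collected it. Everything specific here is an assumption made for illustration and is not taken from the described algorithm: the state-independent softmax policy over two actions (0 = run locally, 1 = offload to a neighbor), the `(action, behavior_prob, advantage)` sample layout, and the function names `softmax` and `is_policy_gradient`.

```python
import math


def softmax(logits):
    """Convert action preferences (logits) into action probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_policy_gradient(logits, samples):
    """Importance-weighted policy gradient estimate with respect to the logits.

    samples: list of (action, behavior_prob, advantage) tuples, where
    behavior_prob is the probability the data-collecting policy assigned
    to the action (hypothetical layout, assumed for this sketch).
    """
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for action, behavior_prob, advantage in samples:
        # Importance weight: current policy probability / behavior probability.
        weight = probs[action] / behavior_prob
        for k in range(len(logits)):
            # Softmax identity: d log pi(a) / d logit_k = 1[k == a] - pi(k).
            dlog = (1.0 if k == action else 0.0) - probs[k]
            grad[k] += weight * advantage * dlog
    return [g / len(samples) for g in grad]


# Samples collected once under a uniform policy (probability 0.5 per action).
# Assumed toy advantages: local execution was worse, offloading was better.
samples = [(0, 0.5, -1.0), (1, 0.5, 1.0)]

logits = [0.0, 0.0]
learning_rate = 1.0
for _ in range(3):
    # The same stored samples are reused at every step; the importance
    # weights correct for the policy having changed since collection.
    grad = is_policy_gradient(logits, samples)
    logits = [w + learning_rate * g for w, g in zip(logits, grad)]
```

In this toy setting the probability of offloading rises with each update even though no new samples are gathered. A practical version would condition the policy on device state and would typically clip or otherwise bound the importance weights to limit variance.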