Patent attributes
Described is a system for controlling multiple autonomous platforms. A training process is performed in a simulation environment to produce a trained learning agent. In each training episode, each controlled platform is assigned to one target platform that produces an observation. A learning agent processes the observation using a deep learning network and produces an action corresponding to each controlled platform, continuing until an action has been produced for every controlled platform. A reward value corresponding to the episode is then obtained. After training, the trained learning agent is executed to control each autonomous platform: the trained learning agent receives one or more observations from one or more platform sensors and produces an action based on those observations. The action is then used to control one or more platform actuators.
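The training and execution flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the 2-D offset observation, the linear policy standing in for the deep learning network, the distance-based reward, the hill-climbing update standing in for the actual learning rule, and all function names (`make_episode`, `act`, `run_episode`, `train`, `control_step`).

```python
import random

def make_episode(rng, n_platforms):
    """Assign each controlled platform to one target platform (hypothetical setup).

    The observation for each assignment is assumed to be the 2-D offset
    from the controlled platform to its target.
    """
    observations = []
    for _ in range(n_platforms):
        platform = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        target = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        observations.append((target[0] - platform[0], target[1] - platform[1]))
    return observations

def act(weights, obs):
    """Linear stand-in for the deep learning network: observation -> action."""
    w00, w01, w10, w11 = weights
    return (w00 * obs[0] + w01 * obs[1], w10 * obs[0] + w11 * obs[1])

def run_episode(weights, observations):
    """Produce an action for each controlled platform, then one episode reward."""
    actions = []
    for obs in observations:  # continue until every platform has an action
        actions.append(act(weights, obs))
    # Assumed reward: negative total distance left to the targets after moving.
    reward = 0.0
    for (ox, oy), (ax, ay) in zip(observations, actions):
        reward -= ((ox - ax) ** 2 + (oy - ay) ** 2) ** 0.5
    return actions, reward

def train(n_episodes=200, n_platforms=3, seed=0):
    """Hill climbing as a placeholder for the real training algorithm."""
    rng = random.Random(seed)
    weights = [0.0, 0.0, 0.0, 0.0]
    for _ in range(n_episodes):
        observations = make_episode(rng, n_platforms)
        candidate = [w + rng.gauss(0, 0.1) for w in weights]
        _, current_reward = run_episode(weights, observations)
        _, candidate_reward = run_episode(candidate, observations)
        if candidate_reward > current_reward:  # keep the better policy
            weights = candidate
    return weights

def control_step(weights, sensor_observations):
    """Execution phase: sensor observations in, one actuator command each out."""
    return [act(weights, obs) for obs in sensor_observations]
```

At execution time, a call such as `control_step(train(), [(0.5, -0.2), (0.1, 0.3)])` would return one action per platform. A real system would replace the linear map with a trained deep network and the hill-climbing step with a reinforcement learning update.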