Patent attributes
Methods, systems, and apparatus for selecting actions to be performed by an agent interacting with an environment. One system includes a high-level controller neural network, low-level controller network, and subsystem. The high-level controller neural network receives an input observation and processes the input observation to generate a high-level output defining a control signal for the low-level controller. The low-level controller neural network receives a designated component of an input observation and processes the designated component and an input control signal to generate a low-level output that defines an action to be performed by the agent in response to the input observation. The subsystem receives a current observation characterizing a current state of the environment, determines whether criteria are satisfied for generating a new control signal, and based on the determination, provides appropriate inputs to the high-level and low-level controllers for selecting an action to be performed by the agent.