TensorFlow Agents (TF-Agents) is an open-source library that provides efficient infrastructure for building parallel reinforcement learning algorithms in TensorFlow.
TF-Agents simulates multiple environments in parallel and groups them so that the neural network computation runs on a batch of observations rather than on individual ones. This allows the TensorFlow execution engine to parallelize computation without the need for manual synchronization. Environments are stepped in separate Python processes so that they progress in parallel without interference from Python's global interpreter lock (GIL).
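The batching idea can be illustrated with a minimal, standard-library-only sketch. It does not use the TF-Agents API: `ToyEnv`, `BatchEnv`, and `batched_policy` are hypothetical names invented for this illustration, the "policy" is a simple proportional rule rather than a neural network, and the environments are stepped sequentially in one process rather than in separate Python processes.

```python
import random


class ToyEnv:
    """Hypothetical 1-D environment: move a position toward a random target."""

    def __init__(self, seed):
        self._rng = random.Random(seed)
        self._position = 0.0
        self._target = 0.0

    def reset(self):
        self._position = 0.0
        self._target = self._rng.uniform(-1.0, 1.0)
        return [self._position, self._target]

    def step(self, action):
        # The action is a displacement; the reward is the negative distance to the target.
        self._position += action
        distance = abs(self._target - self._position)
        return [self._position, self._target], -distance, distance < 0.05


class BatchEnv:
    """Steps several environments together and returns batched results."""

    def __init__(self, envs):
        self._envs = envs

    def __len__(self):
        return len(self._envs)

    def reset(self):
        return [env.reset() for env in self._envs]

    def step(self, actions):
        observations, rewards, dones = [], [], []
        for env, action in zip(self._envs, actions):
            obs, reward, done = env.step(action)
            if done:
                obs = env.reset()  # restart finished episodes so the batch stays full
            observations.append(obs)
            rewards.append(reward)
            dones.append(done)
        return observations, rewards, dones


def batched_policy(observations, gain=0.5):
    """A single call computes the actions for the whole batch of observations."""
    return [gain * (target - position) for position, target in observations]


batch_env = BatchEnv([ToyEnv(seed) for seed in range(4)])
observations = batch_env.reset()
for _ in range(10):
    actions = batched_policy(observations)  # one policy call per step, not one per environment
    observations, rewards, dones = batch_env.step(actions)
```

In this sketch the policy is invoked once per step for all four environments, which is the property that lets a real implementation hand a whole batch to the network in one operation.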
An "agent" is a core element of reinforcement learning and has two main responsibilities:
- defining a Policy to interact with the Environment; and
- determining how to learn/train that Policy from collected experience.
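These two responsibilities can be sketched with a toy multi-armed bandit agent. This is an illustrative assumption, not the TF-Agents agent interface: `BanditAgent`, `environment_reward`, and the reward means are hypothetical, and the learning rule is a plain incremental average rather than any of the algorithms listed in this entry.

```python
import random


class BanditAgent:
    """Hypothetical agent that exposes a policy and a training step."""

    def __init__(self, num_actions, epsilon=0.1, seed=0):
        self._values = [0.0] * num_actions  # estimated reward per action
        self._counts = [0] * num_actions    # how often each action was trained on
        self._epsilon = epsilon
        self._rng = random.Random(seed)

    def policy(self, explore=True):
        """Responsibility 1: choose an action (epsilon-greedy over the estimates)."""
        if explore and self._rng.random() < self._epsilon:
            return self._rng.randrange(len(self._values))
        return max(range(len(self._values)), key=lambda a: self._values[a])

    def train(self, experience):
        """Responsibility 2: learn from collected (action, reward) pairs."""
        for action, reward in experience:
            self._counts[action] += 1
            # Incremental mean update of the action-value estimate.
            self._values[action] += (reward - self._values[action]) / self._counts[action]


def environment_reward(action, rng):
    """Hypothetical environment: each action has a fixed mean reward plus noise."""
    means = [0.1, 0.5, 0.9]
    return means[action] + rng.gauss(0.0, 0.1)


env_rng = random.Random(1)
agent = BanditAgent(num_actions=3, epsilon=0.2)
for _ in range(200):
    # Collect a small batch of experience with the exploring policy, then train on it.
    experience = []
    for _ in range(5):
        action = agent.policy()
        experience.append((action, environment_reward(action, env_rng)))
    agent.train(experience)

greedy_action = agent.policy(explore=False)
```

Keeping action selection (`policy`) separate from learning (`train`) means experience can be collected first and consumed by the training step afterwards, which mirrors the split described in the two bullet points above.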
Currently the following algorithms are available under TF-Agents:
- DQN: Human-level control through deep reinforcement learning.
- DDQN: Deep Reinforcement Learning with Double Q-learning.
- DDPG: Continuous control with deep reinforcement learning.
- TD3: Addressing Function Approximation Error in Actor-Critic Methods.
- REINFORCE: Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning.
- PPO: Proximal Policy Optimization Algorithms.
- SAC: Soft Actor Critic.
In their paper "TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow", authors Hafner, Davidson, and Vanhoucke also introduced BatchPPO, an efficient implementation of the proximal policy optimization algorithm.