Agents

Create and configure reinforcement learning agents using common algorithms, such as SARSA, DQN, DDPG, and PPO

A reinforcement learning agent receives observations and a reward from the environment. Using its policy, the agent selects an action based on the observations and reward, and returns the action to the environment. During training, the agent continuously updates the policy parameters based on the action, observations, and reward. Doing so allows the agent to learn the optimal policy for the given environment and reward signal.
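
The observe, act, and update cycle described above can be illustrated with a minimal, language-agnostic sketch. The Python code below is not the Reinforcement Learning Toolbox API: the `ChainEnv` corridor environment, the `QAgent` class, and every hyperparameter value are hypothetical names and choices made for illustration only. It assumes a tabular Q-learning update with epsilon-greedy exploration as one simple instance of "updating the policy parameters".

```python
import random


class ChainEnv:
    """Hypothetical 5-state corridor: start in state 0, reward 1 on reaching the last state."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right, action 0 moves left; walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


class QAgent:
    """Tabular Q-learning agent (illustrative; all hyperparameters are assumptions)."""

    def __init__(self, n_states, n_actions, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha = alpha
        self.gamma = gamma
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def act(self, obs, explore=True):
        # Policy: epsilon-greedy over the current Q-table, ties broken at random.
        if explore and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[obs]))
        best = max(self.q[obs])
        candidates = [a for a, v in enumerate(self.q[obs]) if v == best]
        return self.rng.choice(candidates)

    def learn(self, obs, action, reward, next_obs, done):
        # Update the policy parameters (the Q-table) from one experience.
        target = reward if done else reward + self.gamma * max(self.q[next_obs])
        self.q[obs][action] += self.alpha * (target - self.q[obs][action])


def train(episodes=200, max_steps=100):
    env = ChainEnv()
    agent = QAgent(env.n_states, n_actions=2)
    for _ in range(episodes):
        obs = env.reset()
        for _ in range(max_steps):
            action = agent.act(obs)                      # agent selects an action
            next_obs, reward, done = env.step(action)    # environment responds
            agent.learn(obs, action, reward, next_obs, done)
            obs = next_obs
            if done:
                break
    return agent


agent = train()
# Greedy action in each non-terminal state after training.
greedy_policy = [agent.act(s, explore=False) for s in range(4)]
print(greedy_policy)
```

In this toy setting the learned greedy policy moves right in every state, which is the optimal behavior for the corridor. The built-in agents listed on this page follow the same interaction pattern but use their own algorithms and policy representations.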

Reinforcement Learning Toolbox™ software provides reinforcement learning agents that use several common algorithms, such as SARSA, DQN, DDPG, and PPO. You can also implement other agent algorithms by creating your own custom agents.

For more information, see Reinforcement Learning Agents. For more information on defining policy representations, see Create Policies and Value Functions.

Apps

Reinforcement Learning Designer: Design, train, and simulate reinforcement learning agents

Blocks

RL Agent: Reinforcement learning agent

Functions

rlQAgent: Q-learning reinforcement learning agent
rlSARSAAgent: SARSA reinforcement learning agent
rlDQNAgent: Deep Q-network (DQN) reinforcement learning agent
rlACAgent: Actor-critic (AC) reinforcement learning agent
rlPGAgent: Policy gradient (PG) reinforcement learning agent
rlDDPGAgent: Deep deterministic policy gradient (DDPG) reinforcement learning agent
rlTD3Agent: Twin-delayed deep deterministic (TD3) policy gradient reinforcement learning agent
rlSACAgent: Soft actor-critic (SAC) reinforcement learning agent
rlPPOAgent: Proximal policy optimization (PPO) reinforcement learning agent
rlTRPOAgent: Trust region policy optimization (TRPO) reinforcement learning agent
rlQAgentOptions: Options for Q-learning agent
rlSARSAAgentOptions: Options for SARSA agent
rlDQNAgentOptions: Options for DQN agent
rlPGAgentOptions: Options for PG agent
rlDDPGAgentOptions: Options for DDPG agent
rlTD3AgentOptions: Options for TD3 agent
rlACAgentOptions: Options for AC agent
rlPPOAgentOptions: Options for PPO agent
rlTRPOAgentOptions: Options for TRPO agent
rlSACAgentOptions: Options for SAC agent
rlAgentInitializationOptions: Options for initializing reinforcement learning agents
rlConservativeQLearningOptions: Regularizer options object to train DQN and SAC agents
rlBehaviorCloningRegularizerOptions: Regularizer options object to train DDPG, TD3, and SAC agents
rlMBPOAgent: Model-based policy optimization (MBPO) reinforcement learning agent
rlMBPOAgentOptions: Options for MBPO agent
getActor: Extract actor from reinforcement learning agent
getCritic: Extract critic from reinforcement learning agent
setActor: Set actor of reinforcement learning agent
setCritic: Set critic of reinforcement learning agent
getAction: Obtain action from agent, actor, or policy object given environment observations
rlReplayMemory: Replay memory experience buffer
rlPrioritizedReplayMemory: Replay memory experience buffer with prioritized sampling
rlHindsightReplayMemory: Hindsight replay memory experience buffer
rlHindsightPrioritizedReplayMemory: Hindsight replay memory experience buffer with prioritized sampling
append: Append experiences to replay memory buffer
sample: Sample experiences from replay memory buffer
resize: Resize replay memory experience buffer
allExperiences: Return all experiences in replay memory buffer
validateExperience: Validate experiences for replay memory
generateHindsightExperiences: Generate hindsight experiences from hindsight experience replay buffer
getActionInfo: Obtain action data specifications from reinforcement learning environment, agent, or experience buffer
getObservationInfo: Obtain observation data specifications from reinforcement learning environment, agent, or experience buffer
reset: Reset environment, agent, experience buffer, or policy object

Topics

Agent Basics

Agent Types

Custom Agents