

Q-learning reinforcement learning agent


The Q-learning algorithm is a model-free, online, off-policy reinforcement learning method. A Q-learning agent is a value-based reinforcement learning agent that trains a critic to estimate the return, that is, the cumulative future reward.
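
As a point of reference, the textbook tabular form of the Q-learning update is sketched below. The symbols are standard notation assumed here rather than names defined on this page: S_t and A_t are the state and action at step t, R_{t+1} is the reward received, \alpha is the learning rate, and \gamma is the discount factor.

```latex
Q(S_t, A_t) \leftarrow Q(S_t, A_t)
  + \alpha \left[ R_{t+1} + \gamma \max_{a} Q(S_{t+1}, a) - Q(S_t, A_t) \right]
```

The target takes the maximum over next actions regardless of which action the exploring policy actually takes next, which is what makes the method off-policy.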

For more information on Q-learning agents, see Q-Learning Agents.

For more information on the different types of reinforcement learning agents, see Reinforcement Learning Agents.




agent = rlQAgent(critic,agentOptions) creates a Q-learning agent with the specified critic network and sets the AgentOptions property.

Input Arguments


critic: Critic network representation, specified as an rlQValueRepresentation object. For more information on creating critic representations, see Create Policy and Value Function Representations.



agentOptions: Agent options, specified as an rlQAgentOptions object.

Object Functions

train                   Train reinforcement learning agents within a specified environment
sim                     Simulate trained reinforcement learning agents within specified environment
getAction               Obtain action from agent or actor representation given environment observations
getActor                Get actor representation from reinforcement learning agent
setActor                Set actor representation of reinforcement learning agent
getCritic               Get critic representation from reinforcement learning agent
setCritic               Set critic representation of reinforcement learning agent
generatePolicyFunction  Create function that evaluates trained policy of reinforcement learning agent


Examples

Create an environment interface.

env = rlPredefinedEnv("BasicGridWorld");

Create a critic Q-value function representation using a Q-table derived from the environment observation and action specifications.

qTable = rlTable(getObservationInfo(env),getActionInfo(env));
critic = rlQValueRepresentation(qTable,getObservationInfo(env),getActionInfo(env));
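
Before building the agent, it can help to look at the table that backs the critic. The following is a minimal sketch that assumes qTable and critic from the preceding step are in the workspace. The learning-rate value is purely illustrative and is not part of the original example.

```matlab
% Assumes qTable and critic were created as shown above.
% The table has one row per observation (state) and one column per action.
tableSize = size(qTable.Table)

% Optionally adjust the critic learning rate (illustrative value only).
critic.Options.LearnRate = 1;
```

A table-based critic such as this one stores one Q-value per state-action pair, so it is practical only for small discrete observation and action spaces such as this grid world.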

Create a Q-learning agent using the specified critic value function and an epsilon value of 0.05.

opt = rlQAgentOptions;
opt.EpsilonGreedyExploration.Epsilon = 0.05;

agent = rlQAgent(critic,opt)
agent = 
  rlQAgent with properties:

    AgentOptions: [1x1 rl.option.rlQAgentOptions]
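
To make the role of the epsilon value concrete, the following is a conceptual sketch of epsilon-greedy action selection in plain MATLAB. It is not the toolbox's internal implementation, and qRow is a hypothetical vector of Q-values for a single state, invented for illustration.

```matlab
epsilon = 0.05;               % exploration probability, as set in the options above
qRow = [0.1 0.5 0.2 0.0];     % hypothetical Q-values for the actions in one state

if rand < epsilon
    % Explore: pick an action uniformly at random.
    action = randi(numel(qRow));
else
    % Exploit: pick the action with the largest estimated Q-value.
    [~,action] = max(qRow);
end
```

With epsilon set to 0.05, the agent explores on roughly 5% of steps and otherwise chooses the action its critic currently rates highest.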

To check your agent, use getAction to return the action from a random observation.

getAction(agent,{randi(25)})
ans = 1

You can now test and train the agent against the environment.
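
A training and simulation run might look like the following sketch, which assumes env and agent from the steps above are in the workspace. The variable names trainOpts and simOpts and all numeric settings are illustrative assumptions, not recommended values.

```matlab
% Assumes env and agent were created as shown above.
% All numeric settings are illustrative only.
trainOpts = rlTrainingOptions( ...
    'MaxEpisodes',200, ...
    'MaxStepsPerEpisode',50, ...
    'Verbose',false, ...
    'Plots','none');

% Train the agent against the environment.
trainingStats = train(agent,env,trainOpts);

% Simulate the trained agent for a single episode.
simOpts = rlSimulationOptions('MaxSteps',50);
experience = sim(env,agent,simOpts);
```

Training progress is returned in trainingStats, and experience holds the logged observations, actions, and rewards from the simulation.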

Introduced in R2019a