rlLSPIAgent
Description
The least-squares policy iteration (LSPI) algorithm is an off-policy reinforcement learning method for environments with a discrete action space. Like a Q-learning agent, an LSPI agent trains a Q-value function critic to estimate the value of the optimal policy while following an epsilon-greedy policy based on the values estimated by the critic. The approximation model used by the critic must be a custom basis function that is linear in its parameters.
For more information on LSPI agents, see LSPI Agent.
For more information on the different types of reinforcement learning agents, see Reinforcement Learning Agents.
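To make the terms in the description above concrete, the following is a minimal sketch in base MATLAB (no toolbox calls) of a linear-in-the-parameters Q-value function and an epsilon-greedy action choice over a discrete action set. The basis function, weight values, action set, and epsilon value are all hypothetical and chosen only for illustration; the sketch does not reproduce the agent's internal implementation.

```matlab
% Hypothetical basis function (illustration only): maps a 2-by-1
% observation and a scalar action to a 6-by-1 feature vector.
basisFcn = @(obs,act) [obs; act; obs*act; 1];

% Weight vector, one entry per feature (arbitrary example values).
W = [0.5; -0.2; 0.1; 0.3; -0.4; 0.05];

% Linear-in-the-parameters Q-value: Q(s,a) = W' * phi(s,a).
qValue = @(obs,act) W' * basisFcn(obs,act);

actions = [-1 0 1];     % discrete action set (assumed)
obs     = [0.2; -0.7];  % example observation (assumed)
epsilon = 0.1;          % exploration probability (assumed)

% Evaluate the Q-value of every action for this observation.
q = arrayfun(@(a) qValue(obs,a), actions);

% Epsilon-greedy selection: with probability epsilon pick a random
% action, otherwise pick the action with the largest Q-value.
[~,greedyIdx] = max(q);
if rand < epsilon
    idx = randi(numel(actions));
else
    idx = greedyIdx;
end
action = actions(idx);
```

Because the Q-value is linear in W, fitting the critic reduces to a least-squares problem in the weights, which is what makes this model class a natural fit for LSPI.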
Creation
Description
agent = rlLSPIAgent(critic) creates an LSPI agent with the specified custom-value-function-based critic. The AgentOptions property of agent is initialized using default values.
agent = rlLSPIAgent(critic,agentOptions) also sets the AgentOptions property of agent using the agentOptions argument.
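The first syntax might be used as in the sketch below. The observation and action specifications, the basis function myBasisFcn, and the initial weights W0 are hypothetical. The sketch also assumes that a critic built with rlQValueFunction from a custom basis function is an acceptable critic for rlLSPIAgent; check the Input Arguments section for the exact critic type that is supported.

```matlab
% Observation and action specifications (illustrative dimensions).
obsInfo = rlNumericSpec([2 1]);
actInfo = rlFiniteSetSpec([-1 0 1]);   % LSPI needs a discrete action space

% Hypothetical basis function of the observation and the action.
% It returns a 6-by-1 feature vector.
myBasisFcn = @(obs,act) [obs; act; obs*act; 1];

% Initial weights, one per feature.
W0 = zeros(6,1);

% Q-value function critic that is linear in the weights W0.
% (Assumption: this critic type is accepted by rlLSPIAgent.)
critic = rlQValueFunction({myBasisFcn,W0},obsInfo,actInfo);

% Create the agent; AgentOptions takes its default values.
agent = rlLSPIAgent(critic);

% Query the agent for an action given a random observation.
act = getAction(agent,{rand(obsInfo.Dimension)});

% Extract the critic from the agent.
agentCritic = getCritic(agent);
```

getAction returns the action in a cell array, so act{1} holds one of the values in the discrete action set.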
Input Arguments
Properties
Object Functions
train | Train reinforcement learning agents within a specified environment |
sim | Simulate trained reinforcement learning agents within specified environment |
getAction | Obtain action from agent, actor, or policy object given environment observations |
getCritic | Extract critic from reinforcement learning agent |
setCritic | Set critic of reinforcement learning agent |
generatePolicyFunction | Generate MATLAB function that evaluates policy of an agent or policy object |
Examples
Version History
Introduced in R2025a
See Also
Functions
getAction | getActor | getCritic | getModel | generatePolicyFunction | generatePolicyBlock | getActionInfo | getObservationInfo