Create options for reinforcement learning agent representations
repOpts = rlRepresentationOptions
repOpts = rlRepresentationOptions(Name,Value)

repOpts = rlRepresentationOptions returns the default options for defining a representation for a reinforcement learning agent.

repOpts = rlRepresentationOptions(Name,Value) creates an options set using the specified name-value pairs to override default option values.
Create an options set for creating a critic or actor representation for a reinforcement learning agent. Set the learning rate for the representation to 0.05, and set the gradient threshold to 1. You can set the options using Name,Value pairs when you create the options set. Any options that you do not explicitly set have their default values.
repOpts = rlRepresentationOptions('LearnRate',5e-2,...
    'GradientThreshold',1)
repOpts = 
  rlRepresentationOptions with properties:

                  LearnRate: 0.0500
                  Optimizer: "adam"
        OptimizerParameters: [1×1 rl.option.OptimizerParameters]
          GradientThreshold: 1
    GradientThresholdMethod: "l2norm"
     L2RegularizationFactor: 1.0000e-04
                  UseDevice: "cpu"
              MiniBatchSize: Inf
Alternatively, create a default options set and use dot notation to change some of the values.
repOpts = rlRepresentationOptions;
repOpts.LearnRate = 5e-2;
repOpts.GradientThreshold = 1
repOpts = 
  rlRepresentationOptions with properties:

                  LearnRate: 0.0500
                  Optimizer: "adam"
        OptimizerParameters: [1×1 rl.option.OptimizerParameters]
          GradientThreshold: 1
    GradientThresholdMethod: "l2norm"
     L2RegularizationFactor: 1.0000e-04
                  UseDevice: "cpu"
              MiniBatchSize: Inf
If you want to change the properties of the OptimizerParameters
option, use dot notation to access them.
repOpts.OptimizerParameters.Epsilon = 1e-7;
repOpts.OptimizerParameters
ans = 
  OptimizerParameters with properties:

                      Momentum: "Not applicable"
                       Epsilon: 1.0000e-07
           GradientDecayFactor: 0.9000
    SquaredGradientDecayFactor: 0.9990
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'Optimizer',"rmsprop"
'LearnRate' — Learning rate for the representation
0.01 (default) | positive scalar

Learning rate for the representation, specified as the comma-separated pair consisting of 'LearnRate' and a positive scalar. If the learning rate is too low, then training takes a long time. If the learning rate is too high, then training might reach a suboptimal result or diverge.

Example: 'LearnRate',0.025
'Optimizer' — Optimizer for representation
"adam" (default) | "sgdm" | "rmsprop"

Optimizer for training the network of the representation, specified as the comma-separated pair consisting of 'Optimizer' and one of the following strings:
"adam" — Use the Adam optimizer. You can specify the decay rates of the gradient and squared gradient moving averages using the GradientDecayFactor and SquaredGradientDecayFactor fields of the OptimizerParameters option.

"sgdm" — Use the stochastic gradient descent with momentum (SGDM) optimizer. You can specify the momentum value using the Momentum field of the OptimizerParameters option.

"rmsprop" — Use the RMSProp optimizer. You can specify the decay rate of the squared gradient moving average using the SquaredGradientDecayFactor field of the OptimizerParameters option.
For more information about these optimizers, see Stochastic Gradient Descent (Deep Learning Toolbox)
in the Algorithms section of trainingOptions
in Deep Learning Toolbox™.
Example: 'Optimizer',"sgdm"
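For instance, to train with SGDM and a nondefault momentum, combine the 'Optimizer' option with the corresponding OptimizerParameters field (a minimal sketch using only options documented on this page):

```matlab
% Select the SGDM optimizer, then set its momentum through OptimizerParameters.
repOpts = rlRepresentationOptions('Optimizer',"sgdm");
repOpts.OptimizerParameters.Momentum = 0.95;  % contribution of the previous step
```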
'OptimizerParameters' — Applicable parameters for optimizer
OptimizerParameters object

Applicable parameters for the optimizer, specified as the comma-separated pair consisting of 'OptimizerParameters' and an OptimizerParameters object.

The OptimizerParameters object has the following properties.
Momentum — Contribution of previous step, specified as a scalar from 0 to 1. A value of 0 means no contribution from the previous step. A value of 1 means maximal contribution. This parameter applies only when Optimizer is "sgdm".

Epsilon — Denominator offset, specified as a positive scalar. The optimizer adds this offset to the denominator in the network parameter updates to avoid division by zero. This parameter applies only when Optimizer is "adam" or "rmsprop".

GradientDecayFactor — Decay rate of gradient moving average, specified as a positive scalar from 0 to 1. This parameter applies only when Optimizer is "adam".

SquaredGradientDecayFactor — Decay rate of squared gradient moving average, specified as a positive scalar from 0 to 1. This parameter applies only when Optimizer is "adam" or "rmsprop".
When a particular property of OptimizerParameters
is not
applicable to the optimizer type specified in the Optimizer
option,
that property is set to "Not applicable"
.
To change the default values, create an
rlRepresentationOptions
set and use dot notation to access and
change the properties of OptimizerParameters
.
repOpts = rlRepresentationOptions;
repOpts.OptimizerParameters.Epsilon = 1e-7;
'GradientThreshold' — Threshold value for gradient
Inf (default) | positive scalar

Threshold value for the representation gradient, specified as the comma-separated pair consisting of 'GradientThreshold' and Inf or a positive scalar. If the gradient exceeds this value, the gradient is clipped as specified by the GradientThresholdMethod option. Clipping the gradient limits how much the network parameters change in a training iteration.

Example: 'GradientThreshold',1
'GradientThresholdMethod' — Gradient threshold method
"l2norm" (default) | "global-l2norm" | "absolute-value"

Gradient threshold method used to clip gradient values that exceed the gradient threshold, specified as the comma-separated pair consisting of 'GradientThresholdMethod' and one of the following strings:
"l2norm" — If the L2 norm of the gradient of a learnable parameter is larger than GradientThreshold, then scale the gradient so that the L2 norm equals GradientThreshold.

"global-l2norm" — If the global L2 norm, L, is larger than GradientThreshold, then scale all gradients by a factor of GradientThreshold/L. The global L2 norm considers all learnable parameters.

"absolute-value" — If the absolute value of an individual partial derivative in the gradient of a learnable parameter is larger than GradientThreshold, then scale the partial derivative to have magnitude equal to GradientThreshold and retain the sign of the partial derivative.
For more information, see Gradient Clipping (Deep Learning Toolbox) in the Algorithms section of trainingOptions in Deep Learning Toolbox.

Example: 'GradientThresholdMethod',"absolute-value"
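The "l2norm" rule above can be illustrated with plain MATLAB arithmetic (an illustrative sketch only; during training the agent applies the clipping internally):

```matlab
% Illustrative sketch of the "l2norm" clipping rule, not part of the API.
g = [3; 4];                   % example gradient of one learnable parameter, L2 norm 5
thresh = 1;                   % GradientThreshold
if norm(g) > thresh
    g = g*(thresh/norm(g));   % rescale so that norm(g) equals thresh
end
```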
'L2RegularizationFactor' — Factor for L2 regularization
0.0001 (default) | nonnegative scalar

Factor for L2 regularization (weight decay), specified as the comma-separated pair consisting of 'L2RegularizationFactor' and a nonnegative scalar. For more information, see L2 Regularization (Deep Learning Toolbox) in the Algorithms section of trainingOptions in Deep Learning Toolbox.

To avoid overfitting when using a representation with many parameters, consider increasing the L2RegularizationFactor option.

Example: 'L2RegularizationFactor',0.0005
'UseDevice' — Computation device for training
"cpu" (default) | "gpu"

Computation device for training an agent that uses the representation, specified as the comma-separated pair consisting of 'UseDevice' and either "cpu" or "gpu".

The "gpu" option requires Parallel Computing Toolbox™. To use a GPU for training a network, you must also have a CUDA® enabled NVIDIA® GPU with compute capability 3.0 or higher.

Example: 'UseDevice',"gpu"
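When the availability of a GPU is not known in advance, one option is to query it at run time (a sketch; gpuDeviceCount requires Parallel Computing Toolbox):

```matlab
% Fall back to the CPU when no supported GPU is detected.
if gpuDeviceCount > 0
    repOpts = rlRepresentationOptions('UseDevice',"gpu");
else
    repOpts = rlRepresentationOptions('UseDevice',"cpu");
end
```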
repOpts — Representation options
rlRepresentationOptions object

Option set for defining a representation for a reinforcement learning agent, returned as an rlRepresentationOptions object. The property values of repOpts are initialized to the default values or to the values you specify with Name,Value pairs. You can further modify the property values using dot notation. Use the options set as an input argument with rlRepresentation when you create reinforcement learning representations.
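As a hypothetical end-to-end sketch, the options set is passed when creating a representation. The network net, the observation specification obsInfo, and the layer name 'state' below are assumptions not defined on this page, and the exact rlRepresentation signature depends on the representation being created:

```matlab
% Hypothetical usage sketch: net, obsInfo, and the layer name are assumed to exist.
repOpts = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);
critic = rlRepresentation(net,obsInfo,'Observation',{'state'},repOpts);
```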