Reinforcement Learning Random Action Generator
Jason Smith
on 13 Sep 2020
Commented: Jason Smith
on 16 Sep 2020
Greetings. I'm Jason, a robotics student, and I would really appreciate your help with the questions below.
Consider that we have an RL environment described as follows:
numObs = 10;
ObservationInfo = rlNumericSpec([numObs 1]);
ObservationInfo.Name = 'Robot Observations';
numAct = 15;
ActionInfo = rlNumericSpec([numAct 1]);
ActionInfo.UpperLimit = [5; 5; 2; 2; 1; 3; 6; 5; 6; 5; 1; 1; 1; 1; 1];
ActionInfo.LowerLimit = [1; 1; -2; -2; -2; -6; -12; -5; -6; -3;-1 ;-1 ;-1 ;-1 ;-1];
ActionInfo.Name = 'Robot Actions';
1_ The function step(env, Action) takes the Action and the environment as inputs and implements the robot dynamics. In which part of the code should I describe the Action parameter?
2_ Does the random action generator of a system in the RL Toolbox generate random actions within the range between the upper and lower limits of the ActionInfo? How does the random action generation process work?
3_ Is there a way we can define our own random action generator for an RL agent?
Thanks in advance
Regards
Jason
0 Comments
Accepted Answer
Emmanouil Tzorakoleftherakis
on 15 Sep 2020
Edited: Emmanouil Tzorakoleftherakis
on 15 Sep 2020
Hi Jason,
1) I am not really sure what you mean. There are two ways to create a custom environment in MATLAB: one uses custom step and reset functions, and the other uses a custom class template. If the linked documentation pages don't have the answer you are looking for, please let me know.
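For the function-based approach, the Action parameter is described inside the step function itself: the agent passes it in at every step, and you use it there to implement the robot dynamics. Here is a minimal sketch; the names myStepFunction/myResetFunction and the toy dynamics, reward, and termination condition are placeholders, not from your post.

```matlab
% Sketch of a custom step function for rlFunctionEnv.
% Action is the 15x1 vector chosen by the agent (bounded by the
% ActionInfo limits); the robot dynamics go where indicated.
function [NextObs, Reward, IsDone, LoggedSignals] = myStepFunction(Action, LoggedSignals)
    state   = LoggedSignals.State;              % current 10x1 observation
    NextObs = state + 0.01*ones(10,15)*Action;  % placeholder dynamics using Action
    Reward  = -norm(NextObs);                   % placeholder reward
    IsDone  = norm(NextObs) > 100;              % placeholder termination condition
    LoggedSignals.State = NextObs;              % carry the state forward
end
```

You then build the environment from your specs and function handles, e.g. `env = rlFunctionEnv(ObservationInfo, ActionInfo, 'myStepFunction', 'myResetFunction');` (myResetFunction is a reset function you also provide).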
2) Which algorithm are you referring to? I am assuming a continuous method like DDPG, since your question is about respecting bounds. DDPG explores by adding a random value, drawn from a noise model, to the action generated by the policy. You are responsible for choosing the noise model's parameters so that exploration happens within your desired range; otherwise the actions will always be clipped to your upper and lower limits. Also make sure you use a tanh layer and a scaling layer at the end of your actor, so the policy's action outputs themselves fall in the desired range (noise is added on top of that).
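The tanh + scaling pattern can be sketched as follows for your specs; the hidden layer size is an arbitrary choice, and featureInputLayer requires R2020b or later (use imageInputLayer([10 1 1]) on older releases).

```matlab
% Map the tanh output range [-1, 1] onto [LowerLimit, UpperLimit]:
% scalingLayer computes Scale .* input + Bias element-wise.
scale = (ActionInfo.UpperLimit - ActionInfo.LowerLimit)/2;
bias  = (ActionInfo.UpperLimit + ActionInfo.LowerLimit)/2;

actorNet = [
    featureInputLayer(10)                        % 10 observations
    fullyConnectedLayer(64)                      % hidden size is illustrative
    reluLayer
    fullyConnectedLayer(15)                      % 15 actions
    tanhLayer                                    % squash to [-1, 1]
    scalingLayer('Scale',scale,'Bias',bias)      % rescale to action limits
    ];
```

With this layout, tanh's -1 maps to LowerLimit (bias - scale) and +1 maps to UpperLimit (bias + scale) for each action component.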
3) Again, for DDPG, you can find details of the implemented noise model here. There are many parameters you can change to customize this model, but it is not possible to use a custom one yet (we are working on it).
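Concretely, the Ornstein-Uhlenbeck noise model is configured through the agent options; a sketch of the tunable parameters (the numeric values below are illustrative, not recommendations):

```matlab
% DDPG exploration noise is customized via rlDDPGAgentOptions.NoiseOptions
% (Ornstein-Uhlenbeck model). These are its main tunable properties.
opt = rlDDPGAgentOptions;
opt.NoiseOptions.Mean                   = zeros(numAct,1);  % center of the noise
opt.NoiseOptions.MeanAttractionConstant = 0.15;             % pull back toward Mean
opt.NoiseOptions.Variance               = 0.3;              % exploration magnitude
opt.NoiseOptions.VarianceDecayRate      = 1e-5;             % anneal over training
```

The options object is then passed when constructing the agent, e.g. `agent = rlDDPGAgent(actor, critic, opt);`.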
3 Comments
Emmanouil Tzorakoleftherakis
on 16 Sep 2020
Random actions are not always between -1 and 1; the values depend on the Mean and Variance you select. You do not need to use interpolation. I believe you can generate a noise vector from the provided model that matches your desired range (i.e., select different Mean/Variance values for each action).
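For example, since Mean and Variance accept vectors, the noise can be scaled per action to each action's range; the 10%-of-range factor below is an illustrative heuristic, not from this thread.

```matlab
% Per-action exploration noise scaled to each action's range.
range = ActionInfo.UpperLimit - ActionInfo.LowerLimit;  % 15x1 vector
opt = rlDDPGAgentOptions;
opt.NoiseOptions.Mean     = zeros(numAct,1);            % zero-mean: noise perturbs the policy output
opt.NoiseOptions.Variance = (0.1*range).^2;             % element-wise, ~10% of each range
```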