Reinforcement Learning Random Action Generator
Jason Smith
on 13 Sep 2020
Commented: Jason Smith
on 16 Sep 2020
Greetings. I'm Jason, a robotics student, and I would really appreciate your help with the questions below.
Consider that we have an RL environment described as follows:
numObs = 10;
ObservationInfo = rlNumericSpec([numObs 1]);
ObservationInfo.Name = 'Robot Observations';
numAct = 15;
ActionInfo = rlNumericSpec([numAct 1]);
ActionInfo.UpperLimit = [5; 5; 2; 2; 1; 3; 6; 5; 6; 5; 1; 1; 1; 1; 1];
ActionInfo.LowerLimit = [1; 1; -2; -2; -2; -6; -12; -5; -6; -3;-1 ;-1 ;-1 ;-1 ;-1];
ActionInfo.Name = 'Robot Actions';
1_ The function step(env, Action) takes the Action and the environment as inputs and implements the robot dynamics. In which part of the code should I describe the Action parameter?
2_ Does the random action generator of a system in the RL Toolbox generate random actions within the range between the upper and lower limits of the ActionInfo? How does the random action generation process work?
3_ Is there a way we can define our own random action generator for an RL agent?
Thanks in advance
Regards
Jason
0 Comments
Accepted Answer
Emmanouil Tzorakoleftherakis
on 15 Sep 2020
Edited: Emmanouil Tzorakoleftherakis
on 15 Sep 2020
Hi Jason,
1) I am not really sure what you mean. There are two ways to create a custom environment in MATLAB: one uses custom step and reset functions, and the other uses a custom class template. If the linked documentation pages don't have the answer you are looking for, please let me know.
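For the function-based approach, the Action parameter is described inside the step function itself: the agent passes it in at every step, and you use it there to implement the robot dynamics. Here is a minimal sketch; the names myStepFunction/myResetFunction and the toy dynamics, reward, and termination condition are placeholders, not from your post.

```matlab
% Sketch of a custom step function for rlFunctionEnv.
% Action is the 15x1 vector chosen by the agent (bounded by the
% ActionInfo limits); the robot dynamics go where indicated.
function [NextObs, Reward, IsDone, LoggedSignals] = myStepFunction(Action, LoggedSignals)
    state   = LoggedSignals.State;              % current 10x1 observation
    NextObs = state + 0.01*ones(10,15)*Action;  % placeholder dynamics using Action
    Reward  = -norm(NextObs);                   % placeholder reward
    IsDone  = norm(NextObs) > 100;              % placeholder termination condition
    LoggedSignals.State = NextObs;              % carry the state forward
end
```

You then build the environment from your specs and function handles, e.g. `env = rlFunctionEnv(ObservationInfo, ActionInfo, 'myStepFunction', 'myResetFunction');` (myResetFunction is a reset function you also provide).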
2) Which algorithm are you referring to? I am assuming a continuous method like DDPG, since your question is about respecting bounds. DDPG explores by adding a random value, drawn from a noise model, to the action generated by the policy. You are responsible for choosing the noise model's parameters so that exploration happens within your desired range; otherwise the actions will always be clipped to your upper and lower limits. Also make sure you use a tanh layer and a scaling layer at the end of your actor, so the policy's action outputs themselves fall in the desired range (noise is added on top of that).
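The tanh + scaling pattern can be sketched as follows for your specs; the hidden layer size is an arbitrary choice, and featureInputLayer requires R2020b or later (use imageInputLayer([10 1 1]) on older releases).

```matlab
% Map the tanh output range [-1, 1] onto [LowerLimit, UpperLimit]:
% scalingLayer computes Scale .* input + Bias element-wise.
scale = (ActionInfo.UpperLimit - ActionInfo.LowerLimit)/2;
bias  = (ActionInfo.UpperLimit + ActionInfo.LowerLimit)/2;

actorNet = [
    featureInputLayer(10)                        % 10 observations
    fullyConnectedLayer(64)                      % hidden size is illustrative
    reluLayer
    fullyConnectedLayer(15)                      % 15 actions
    tanhLayer                                    % squash to [-1, 1]
    scalingLayer('Scale',scale,'Bias',bias)      % rescale to action limits
    ];
```

With this layout, tanh's -1 maps to LowerLimit (bias - scale) and +1 maps to UpperLimit (bias + scale) for each action component.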
3) Again, for DDPG, you can find details of the implemented noise model here. There are many parameters you can change to customize this model, but it is not possible to use a custom one yet (we are working on it).
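Concretely, the Ornstein-Uhlenbeck noise model is configured through the agent options; a sketch of the tunable parameters (the numeric values below are illustrative, not recommendations):

```matlab
% DDPG exploration noise is customized via rlDDPGAgentOptions.NoiseOptions
% (Ornstein-Uhlenbeck model). These are its main tunable properties.
opt = rlDDPGAgentOptions;
opt.NoiseOptions.Mean                   = zeros(numAct,1);  % center of the noise
opt.NoiseOptions.MeanAttractionConstant = 0.15;             % pull back toward Mean
opt.NoiseOptions.Variance               = 0.3;              % exploration magnitude
opt.NoiseOptions.VarianceDecayRate      = 1e-5;             % anneal over training
```

The options object is then passed when constructing the agent, e.g. `agent = rlDDPGAgent(actor, critic, opt);`.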
3 Comments
Emmanouil Tzorakoleftherakis
on 16 Sep 2020
Random actions are not always between -1 and 1; the values depend on the Mean and Variance you select. You do not need to use interpolation. I believe you can generate a noise vector from the provided model that matches your desired range (i.e., select different Mean/Variance values for each action).
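For example, since Mean and Variance accept vectors, the noise can be scaled per action to each action's range; the 10%-of-range factor below is an illustrative heuristic, not from this thread.

```matlab
% Per-action exploration noise scaled to each action's range.
range = ActionInfo.UpperLimit - ActionInfo.LowerLimit;  % 15x1 vector
opt = rlDDPGAgentOptions;
opt.NoiseOptions.Mean     = zeros(numAct,1);            % zero-mean: noise perturbs the policy output
opt.NoiseOptions.Variance = (0.1*range).^2;             % element-wise, ~10% of each range
```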