Easy way to evaluate / compare the performance of RL algorithm

Question

Saurav Sthapit on 29 Jul 2020

Direct link to this question

https://nl.mathworks.com/matlabcentral/answers/572359-easy-way-to-evaluate-compare-the-performance-of-rl-algorithm

Edited: Saurav Sthapit on 6 Aug 2020

I have a trained RL agent and would like to compare its performance with a dumb agent. I can run simout=sim(env,agent,simOpts) to evaluate the actual agent, but I would like to compare the simulation results with a couple of dumb agents that always take the same action or a random action. Is there an easy way to do this?

Currently, I have a separate Simulink model without the RL Agent block (replaced with a Constant block), and I log the observations and rewards using the Simulation Data Inspector.

Thanks

Saurav


Answer 1

Emmanouil Tzorakoleftherakis on 3 Aug 2020

Direct link to this answer

https://nl.mathworks.com/matlabcentral/answers/572359-easy-way-to-evaluate-compare-the-performance-of-rl-algorithm#answer_474718

Why not use a MATLAB Function block and implement the dummy agent in there? If you want random or constant actions, it should be just one line.
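As a rough illustration of that suggestion, the body of such a block might look like the sketch below. The function name dumbAgent, the useRandom switch, and the scalar action range of [-1, 1] are all assumptions made for the example; they would need to match the action specification of the actual environment.

```matlab
function action = dumbAgent(obs, useRandom)
% Hypothetical body for a MATLAB Function block that stands in for the
% RL Agent block. obs is not used; it is only accepted so the block can
% be wired to the same observation signal as the agent.
% Assumption: a scalar continuous action in [actMin, actMax].
actMin = -1;
actMax = 1;
if useRandom
    % Uniform random action within the assumed range
    action = actMin + (actMax - actMin) * rand;
else
    % Constant action (the "one line" case)
    action = 0;
end
end
```

With useRandom fed from a Constant block, the same model can serve as both the constant and the random baseline.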

1 Comment
Saurav Sthapit on 6 Aug 2020

Edited: Saurav Sthapit on 6 Aug 2020

Thanks, that's an excellent suggestion for evaluating random actions.

However, when I do that (or use Constant blocks), I have to run the two statements below: the first to evaluate the random/dumb action and the second to evaluate the agent.

logsout=sim(mdl)

simout=sim(env,agent,simOpts)

logsout and simout are not directly comparable, but logsout is a field in the simout.SimulationInfo struct.

I am wondering if this is the best approach or if there is an easier way to do this.

Also, simout contains the action, observation and reward, but if the reward is a weighted sum of multiple rewards, I can't access the individual rewards. (Of course, I can compare logsout with simout.SimulationInfo.logsout.)
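One way to put the two runs side by side is to reduce each logged reward signal to the same few numbers. The helper below is only a sketch: summarizeReward is a hypothetical name, and the usage lines in its comments assume a single simulation, that signal logging is enabled in both models, that sim(mdl) returns a single output object exposing logsout, and that the reward terms are logged under names such as 'reward' (also hypothetical).

```matlab
function s = summarizeReward(ts)
% Summarize a reward signal given as a timeseries object.
%
% Example usage (all signal and variable names are assumptions):
%   baseOut   = sim(mdl);                        % dumb-agent model
%   agentOut  = sim(env, agent, simOpts);        % trained agent
%   baseLogs  = baseOut.logsout;
%   agentLogs = agentOut.SimulationInfo.logsout;
%   sBase  = summarizeReward(baseLogs.getElement('reward').Values);
%   sAgent = summarizeReward(agentLogs.getElement('reward').Values);
data = ts.Data(:);        % force a column so sum/mean act over samples
s.total = sum(data);      % cumulative reward over the episode
s.mean  = mean(data);     % average reward per logged sample
s.steps = numel(data);    % number of logged samples
end
```

Because both runs go through logsout, individually logged reward terms can be summarized the same way, one call per signal name.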
