Training is not efficient for RL agent
Hello,
I am trying to use the Reinforcement Learning Toolbox for an energy optimization problem. I started with a simple DQN agent; the critic network is defined as shown below:
nI = 4;   % number of inputs (4)
nL = 400; % number of neurons
nO = 101; % number of possible outputs (101)
dnn = [
    featureInputLayer(nI,'Normalization','none','Name','state')
    fullyConnectedLayer(nL,'Name','fc1')
    reluLayer('Name','relu1')
    fullyConnectedLayer(nL/2,'Name','fc2')
    reluLayer('Name','relu2')
    fullyConnectedLayer(nL/4,'Name','fc3')
    reluLayer('Name','relu3')
    fullyConnectedLayer(nO,'Name','fc4')];
figure(1)
plot(layerGraph(dnn))
I have used the following options for the critic, the agent, and training, respectively:
criticOpts = rlRepresentationOptions('LearnRate',0.1,'GradientThreshold',1, ...
    'UseDevice','gpu');
agentOpts = rlDQNAgentOptions( ...
    'UseDoubleDQN',false, ...
    'ExperienceBufferLength',1e5, ...
    'DiscountFactor',0.99, ...
    'MiniBatchSize',256, ...
    'SaveExperienceBufferWithAgent',true, ...
    'SampleTime',1);
trainOpts = rlTrainingOptions( ...
    'MaxEpisodes',1000, ...
    'MaxStepsPerEpisode',3000, ...
    'StopTrainingCriteria',"AverageReward", ...
    'StopTrainingValue',0, ...
    'Verbose',false, ...
    'Plots',"training-progress", ...
    'SaveAgentDirectory','D:\ADVISOR_Exp\RL_Exp');
trainstats = train(agent,env,trainOpts);
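For completeness, here is a sketch of how the critic and agent are typically assembled from the pieces above; the `env` variable, and the `obsInfo`/`actInfo` names obtained from it, are assumptions based on the standard toolbox workflow, since that part of the code is not shown in the post:

```matlab
% Assumed: env is the energy-optimization environment object.
obsInfo = getObservationInfo(env);   % 4 observations
actInfo = getActionInfo(env);        % 101 discrete actions

% Multi-output Q-value critic: one output per discrete action.
critic = rlQValueRepresentation(dnn,obsInfo,actInfo, ...
    'Observation',{'state'},criticOpts);

agent = rlDQNAgent(critic,agentOpts);
```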
However, I did not get good results when testing the agent: it does not improve over time, and after several hundred episodes the reward still oscillates, as shown in the figure.

I have tried different critic network architectures (with separate state and action paths) and different agents (Q-learning and DDPG) with similar options, but with no luck. I have also tried different rewards and tuned the reward function. What should I do to improve training?