Can I make a custom RL agent that has 2 distinct critics and 1 actor?
I'm trying to create a custom DDPG agent that has 2 critics. This is not like TD3, where the minimum Q-value of the two critics is chosen. Each critic is going to learn a different portion of the environment, and the rewards are going to be combined. To give context, I'm trying to train a manipulator to grab a target while avoiding obstacles. One critic will be for tracking and closing the distance to the target; the other will be for avoiding obstacles. Qtotal is the sum of each critic's corresponding Q-value.
I should also mention that the environment is implemented in a Simulink model, so I'm hoping to use the Reinforcement Learning Toolbox train() function and pass in something like train(customagent, Simulinkenv, Trainoptions).
Is this possible to implement with a custom class RL agent while using the built-in train() function? I heard that MATLAB supports training 2 agents. I'm willing to use that as a last resort, but ideally I'd have 1 agent with 1 actor and 2 critics.
Thanks
Answers (2)
  Maneet Kaur Bagga
      
 on 3 Apr 2025
        Hi,
As per my understanding, you can implement a custom RL agent with one actor and two critics while still using MATLAB's built-in "train" function. To do so, create a custom agent class that subclasses "rl.agent.CustomAgent".
Please refer to the MathWorks documentation on creating custom agents by subclassing "rl.agent.CustomAgent", which is what you need for agents with unique architectures, such as multiple critics.
You can then train the agent with the "train" function. The MathWorks example on creating and training a custom PG agent shows how to define a custom agent and train it with the built-in "train" function.
Hope this helps!
  Aravind
      
 on 3 Apr 2025
        From your question, it seems you want to implement a custom Reinforcement Learning (RL) agent, specifically a Deep Deterministic Policy Gradient (DDPG) agent with two critic networks that learn different things and a single actor network, while still using the "train" function to train the RL agent in the environment.
To achieve this, you need to create a custom agent class that inherits from the "rl.agent.CustomAgent" class. This is an abstract base class that is compatible with the "train" function. Once your class implements the required methods, you can use the "train" function just as you would with the predefined RL agents. More information on implementing a custom agent class can be found at: https://www.mathworks.com/help/releases/R2024a/reinforcement-learning/ug/create-custom-pg-agent.html.
In your custom agent class, you need to implement a constructor that performs the following tasks: 
- Defines the action and observation specifications.
- Sets the agent properties.
- Calls the constructor of the base abstract class.
- Defines the sample time (necessary for training in Simulink environments).
Additionally, you need to implement three essential functions: 
- getActionImpl — Evaluates the agent policy and selects an action during simulation.
- getActionWithExplorationImpl — Evaluates the policy and selects an action with exploration during training.
- learnImpl — Updates learnable parameters, allowing the agent to learn from the current experience.
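As a rough illustration of what the exploration step inside getActionWithExplorationImpl could do, here is a minimal sketch that adds Gaussian noise to the actor output and saturates the result to the action limits. It uses plain MATLAB only. The limits, the noise standard deviation, and the names addNoise and greedyAction are assumptions made up for this example, and Gaussian noise is just one possible choice of exploration model.

```matlab
% Hypothetical values, for illustration only
rng(0);                        % make the noise reproducible
lowerLimit = [-1; -1];         % assumed lower action limits
upperLimit = [ 1;  1];         % assumed upper action limits
noiseStd   = 0.1;              % assumed noise standard deviation

% Add zero-mean Gaussian noise, then saturate to the action limits
addNoise = @(a) min(max(a + noiseStd .* randn(size(a)), lowerLimit), upperLimit);

greedyAction = [0.95; -0.2];   % stand-in for the actor output
noisyAction  = addNoise(greedyAction);
disp(noisyAction)
```

In a real agent the noise scale would usually decay over training, and the limits would come from the action specification rather than being hard-coded.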
For your specific case, initialize the two critic networks and the actor network in the constructor. In the learnImpl function, implement the DDPG algorithm that uses the total Q value from the two Q networks to update the critic and actor networks' learnable parameters. You can also add a function to calculate the total Q-value from both networks.
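To make the two-critic idea more concrete, here is a small numeric sketch of how the temporal-difference targets might be formed inside learnImpl. It assumes that the tracking reward and the obstacle reward are available as separate signals (r1 and r2 below), which the answer above does not specify how to obtain. All numbers and variable names are hypothetical. Each critic regresses toward its own target, and the actor would be updated to increase the sum of the two critic outputs.

```matlab
gamma  = 0.99;                 % assumed discount factor

% Hypothetical mini-batch of 3 transitions (row vectors)
r1     = [1.0,  0.5,  0.0];    % tracking reward component
r2     = [0.0, -1.0, -0.2];    % obstacle-avoidance reward component
q1Next = [2.0,  1.0,  4.0];    % target critic 1 at the next state and target action
q2Next = [-0.5, -2.0, 3.0];    % target critic 2 at the next state and target action
isDone = [0, 0, 1];            % 1 marks a terminal transition

% One-step TD target; no bootstrapping on terminal transitions
tdTarget = @(r, qNext) r + gamma .* (1 - isDone) .* qNext;

y1 = tdTarget(r1, q1Next);     % regression target for critic 1
y2 = tdTarget(r2, q2Next);     % regression target for critic 2
yTotal = y1 + y2;              % target for the combined Q-value
disp([y1; y2; yTotal])
```

Because the targets are linear in the reward, summing the two targets gives the same value as a single target built from the summed reward, which is why the total Q-value can be written as the sum of the two critics.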
Here's a basic skeleton of the class you might need: 
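classdef CustomDDPGAgent < rl.agent.CustomAgent
    properties
        Actor    % deterministic actor
        Critic1  % Q-value critic for reaching the target
        Critic2  % Q-value critic for avoiding obstacles
    end
    methods
        function obj = CustomDDPGAgent(actor, critic1, critic2)
            % Call the constructor of the abstract base class first
            obj = obj@rl.agent.CustomAgent();
            % Define the observation and action specifications
            obj.ObservationInfo = actor.ObservationInfo;
            obj.ActionInfo = actor.ActionInfo;
            % Define the sample time (needed for Simulink environments);
            % 1 is a placeholder, use the sample time of your model
            obj.SampleTime = 1;
            % Initialize actor and critics
            obj.Actor = actor;
            obj.Critic1 = critic1;
            obj.Critic2 = critic2;
            % Initialize other parameters …
        end
    end
    methods (Access = protected)
        function action = getActionImpl(obj, observation)
            % Select an action by evaluating the actor
            action = getAction(obj.Actor, observation);
        end
        function action = getActionWithExplorationImpl(obj, observation)
            % Select an action with exploration
            action = getAction(obj.Actor, observation);
            % Add exploration noise to the action here
        end
        function action = learnImpl(obj, experience)
            % experience = {observation, action, reward, nextObservation, isDone}
            % Implement the learning algorithm (DDPG) using the critic networks
            % Use the computeQTotal function to calculate the total Q value
            % Return the action, with exploration, for the next observation
            action = getActionWithExplorationImpl(obj, experience{4});
        end
        function qTotal = computeQTotal(obj, observation, action)
            % observation and action are cell arrays, as getValue expects
            q1 = getValue(obj.Critic1, observation, action);
            q2 = getValue(obj.Critic2, observation, action);
            qTotal = q1 + q2;
        end
        % Implement other functions like reset, and other helper functions
    end
end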
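The skeleton leaves the learning step open. DDPG normally keeps slowly updated target copies of the actor and critics, so with two critics you would likely keep a target copy of each. The sketch below shows the soft (Polyak) update on plain cell arrays of parameters. The smoothing factor tau and the parameter values are hypothetical, and the cell arrays only stand in for the learnable parameters of a real network.

```matlab
tau = 0.01;                               % assumed smoothing factor

% Stand-ins for the learnable parameters of a critic and its target copy
params       = {ones(2, 2),  [0.5; -0.5]};
targetParams = {zeros(2, 2), [0; 0]};

% Soft update: target <- (1 - tau) * target + tau * params
targetParams = cellfun(@(t, p) (1 - tau) .* t + tau .* p, ...
    targetParams, params, 'UniformOutput', false);

disp(targetParams{1})
disp(targetParams{2})
```

The same update would be applied separately to the target actor and to each target critic after every learning step.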
You can use the following code to train the agent:
% Define your Simulink environment
% ('blockName' is the path to the RL Agent block; observationInfo and
% actionInfo are the specification objects for your model)
env = rlSimulinkEnv('modelName', 'blockName', observationInfo, actionInfo);
% Create actor and critics (the network is the first argument)
actor = rlContinuousDeterministicActor(actorNetwork, observationInfo, actionInfo);
critic1 = rlQValueFunction(criticNetwork1, observationInfo, actionInfo);
critic2 = rlQValueFunction(criticNetwork2, observationInfo, actionInfo);
% Create custom agent
agent = CustomDDPGAgent(actor, critic1, critic2);
% Define training options
trainOpts = rlTrainingOptions('MaxEpisodes', 1000, 'MaxStepsPerEpisode', 200);
% Train the agent
trainResults = train(agent, env, trainOpts);
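The training script uses observationInfo and actionInfo without defining them. As a hedged sketch, they could be created with rlNumericSpec as shown below (this requires Reinforcement Learning Toolbox). The dimensions, limits, and names are placeholders for illustration, not values taken from the question, and they must match the signals in your Simulink model.

```matlab
% Hypothetical sizes: 6 observation signals, 3 joint commands
numObs = 6;
numAct = 3;

% Continuous observation specification (column vector)
observationInfo = rlNumericSpec([numObs 1]);
observationInfo.Name = 'observations';

% Continuous action specification with assumed symmetric limits
actionInfo = rlNumericSpec([numAct 1], 'LowerLimit', -1, 'UpperLimit', 1);
actionInfo.Name = 'jointCommands';
```

The actor and critic networks then need input and output sizes that agree with these specifications.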
You can also refer to the following example that implements a custom LQR agent to get more information on how to use "rl.agent.CustomAgent" to implement custom RL agents: https://www.mathworks.com/help/releases/R2024a/reinforcement-learning/ug/create-custom-agents.html.
I hope this helps resolve your query!

