How to run multi-agent reinforcement learning in a custom environment based on GYM?
Hi,
Recently I followed the MAT-DL link on GitHub and created custom environments based on OpenAI GYM that can be trained with a single agent. My question is: how can I create custom GYM-style environments that support multiple agents?
Thanks!
Answers (1)
Ronit on 16 Feb 2024 (Edited: 16 Feb 2024)
Hi,
I understand that you are trying to create a custom GYM-style environment with multiple agents. To achieve this, you can use the ‘rlMultiAgentFunctionEnv’ function, which was added in the R2023b release. You will need to install the Reinforcement Learning Toolbox to use this function.
This function requires you to define the observation and action specifications for your agents and to provide custom MATLAB functions for reset and step functions.
Note that because it was introduced in R2023b, this function cannot be used in earlier versions of MATLAB.
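If you are unsure whether your installation supports it, one illustrative way to check (not part of the original answer) is to test whether the function is on the MATLAB path:
% Illustrative availability check: errors out if rlMultiAgentFunctionEnv is missing.
if isempty(which("rlMultiAgentFunctionEnv"))
    error("rlMultiAgentFunctionEnv requires MATLAB R2023b or later with Reinforcement Learning Toolbox.");
end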
Here is an example of a custom multi-agent reinforcement learning environment:
- Consider an environment containing two agents. The first agent receives an observation belonging to a four-dimensional continuous space and returns an action that can take two values, -1 and 1.
- The second agent receives an observation belonging to a mixed observation space with two channels. The first channel carries a two-dimensional continuous vector, and the second channel carries a value that is either 0 or 1. The action returned by the second agent is a continuous scalar.
- To define the observation and action spaces of the two agents, use cell arrays, as in the code below:
obsInfo = { rlNumericSpec([4 1]) , [rlNumericSpec([2 1]) rlFiniteSetSpec([0 1])] };
actInfo = {rlFiniteSetSpec([-1 1]), rlNumericSpec([1 1])};
env = rlMultiAgentFunctionEnv(obsInfo,actInfo, @stepFcn,@resetFcn)
function [initialObs, info] = resetFcn()
    % For this example, initialize the agent observations randomly
    % (but set to 1 the value carried by the second observation channel of the second agent).
    initialObs = {rand(4,1), {rand(2,1) 1}};
    % Set the info argument equal to the observation cell array.
    info = initialObs;
end
function [nextObs, reward, isdone, info] = stepFcn(action, info)
    % STEPFCN specifies how the environment advances to the next state given
    % the actions from all the agents.
    % If N is the total number of agents, then the arguments are as follows.
    % - NEXTOBS and ACTION are 1xN cell arrays.
    % - REWARD is a 1xN numeric array.
    % - ISDONE is a logical or numeric scalar.
    % - INFO contains any data that you want to pass between steps.
    % For this example, just return to each agent a random observation multiplied
    % by the norm of its respective action.
    % The second observation channel of the second agent carries a value that can only be 0 or 1.
    nextObs = {rand([4 1])*norm(action{1}), {rand([2 1])*norm(action{2}) 0}};
    % Return a random reward vector and a false is-done value.
    reward = rand(2,1);
    isdone = false;
end
Running this code creates the environment and displays the resulting rlMultiAgentFunctionEnv object at the command line.
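As a follow-up, here is a minimal sketch (not part of the original answer) of how agents might be created and trained against this environment. The choice of PPO agents, the training-option values, and the variable names below are illustrative assumptions; in particular, whether a default agent can be built directly from the second agent's mixed observation specification may depend on your release.
% Minimal sketch, assuming the obsInfo, actInfo, and env variables defined above.
% PPO is used only as an example; any compatible agent type can be substituted.
agent1 = rlPPOAgent(obsInfo{1}, actInfo{1});
agent2 = rlPPOAgent(obsInfo{2}, actInfo{2});
% Multi-agent training options (available from R2023b); the values are illustrative.
trainOpts = rlMultiAgentTrainingOptions( ...
    AgentGroups="auto", ...
    LearningStrategy="decentralized", ...
    MaxEpisodes=500, ...
    MaxStepsPerEpisode=100);
% Train both agents against the custom environment.
trainResults = train([agent1 agent2], env, trainOpts);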
You can follow this documentation for more details: https://www.mathworks.com/help/reinforcement-learning/ref/rl.env.rlmultiagentfunctionenv.html
Hope this helps!
Ronit Jain