How to plot 2 columns for 5 data set categories with different length using boxplot

I want to draw box plots for 2 repeated variables collected for 5 data sets, where each data set has a different length (10x1, 20x1, 30x1, 40x1, 50x1). So I actually want 5 categories on the x-axis, where each category has 2 vertical box plots.

 Accepted Answer

There is a very similar example in the documentation for the boxchart function.
To give more specific advice than that, we'd probably need to see how your data are stored. You can upload it using the paper clip icon in the INSERT section of the toolbar.

11 Comments

Thank you very much for your comment. I have uploaded the dataset, which is of type struct.
The first group, X, is total_matching_delay
and the second group, Y, is total_joint_delay.
The 5 categories have different lengths (1x10, 1x20, 1x30, 1x40, 1x50).
I need each category to have two columns (X, Y).
I'm waiting for your reply. I appreciate your help.
By far the most difficult part of this is structuring your data into the proper form to do the chart. Upstream, it would have been much better to have already structured the data better (e.g. with different names, in one MAT file, etc.)
That being said, see the code below.
(Note that there are some much more compact ways to write this code, looping over the 10-50 values, but I wanted you to see the basic idea, without needing to understand some more sophisticated methods. In particular, my using dynamic variable names here is a very bad coding practice, but I think in this case will be a better way for you to see what is going on.)
The gist is that for each file, I needed to:
  • Load the data into a structure
  • From the data, create a table that also includes information about its data type and length
  • Concatenate all the individual tables into one table, for plotting purposes
  • Plot the data
% Load the matching data into structures
S10m = load('total_matching_delay10.mat');
S20m = load('total_matching_delay20.mat');
S30m = load('total_matching_delay30.mat');
S40m = load('total_matching_delay40.mat');
S50m = load('total_matching_delay50.mat');
% Create tables for each set of matching data
T10m = table(repmat(categorical({'Matching'}),10,1),repmat(categorical(10),10,1),S10m.dtotal_concat','VariableNames',{'Type','Length','Value'});
T20m = table(repmat(categorical({'Matching'}),20,1),repmat(categorical(20),20,1),S20m.dtotal_concat','VariableNames',{'Type','Length','Value'});
T30m = table(repmat(categorical({'Matching'}),30,1),repmat(categorical(30),30,1),S30m.dtotal_concat','VariableNames',{'Type','Length','Value'});
T40m = table(repmat(categorical({'Matching'}),40,1),repmat(categorical(40),40,1),S40m.dtotal_concat','VariableNames',{'Type','Length','Value'});
T50m = table(repmat(categorical({'Matching'}),50,1),repmat(categorical(50),50,1),S50m.dtotal_concat','VariableNames',{'Type','Length','Value'});
% Load the joint data into structures
S10j = load('total_joint_delay10.mat');
S20j = load('total_joint_delay20.mat');
S30j = load('total_joint_delay30.mat');
S40j = load('total_joint_delay40.mat');
S50j = load('total_joint_delay50.mat');
% Create tables for each set of joint data
T10j = table(repmat(categorical({'Joint'}),10,1),repmat(categorical(10),10,1),S10j.dtotal_concat','VariableNames',{'Type','Length','Value'});
T20j = table(repmat(categorical({'Joint'}),20,1),repmat(categorical(20),20,1),S20j.dtotal_concat','VariableNames',{'Type','Length','Value'});
T30j = table(repmat(categorical({'Joint'}),30,1),repmat(categorical(30),30,1),S30j.dtotal_concat','VariableNames',{'Type','Length','Value'});
T40j = table(repmat(categorical({'Joint'}),40,1),repmat(categorical(40),40,1),S40j.dtotal_concat','VariableNames',{'Type','Length','Value'});
T50j = table(repmat(categorical({'Joint'}),50,1),repmat(categorical(50),50,1),S50j.dtotal_concat','VariableNames',{'Type','Length','Value'});
% Concatenate all the data into one table
tbl = [T10m; T20m; T30m; T40m; T50m; T10j; T20j; T30j; T40j; T50j];
% Make the box plot
figure
boxchart(tbl.Length,tbl.Value,'GroupByColor',tbl.Type)
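xlabel('Length') % the x-axis categories are the data set lengths
ylabel('Value') % the y-axis shows the delay values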
legend('Location','NorthWest')
Really, I cannot thank you enough for your help.
I deeply appreciate your efforts.
I have one last trivial question: do you know what this sign means?
Thank you very much.
Those are outliers: data points that lie beyond the end of the "whisker". The documentation for either the boxchart or the boxplot function explains more.
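To make the outlier rule concrete, here is a minimal sketch. It assumes the usual box-plot convention that a point is an outlier when it lies more than 1.5 times the interquartile range above the upper quartile or below the lower quartile. The sample vector x is made up for illustration, and isoutlier with the 'quartiles' method may compute quartiles slightly differently from boxchart, so treat this as an approximation of what the chart marks.

```matlab
% Hypothetical sample data: nine ordinary values and one extreme value
x = [1.0 1.2 0.9 1.1 1.3 1.0 0.8 1.2 1.1 6.0];

% Flag points more than 1.5*IQR beyond the lower or upper quartile
isOut = isoutlier(x,'quartiles');

% Show the flagged values; these would be drawn as separate markers
disp(x(isOut))
```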
Please find the link to another dataset.
I used the same code as above, changing the dimensions to fit the dataset.
However, my dataset includes many outliers, so in the plot I cannot see the boxes as in the solved example above, as shown below.
S10m = load('total_matching_delay2.mat')';
S20m = load('total_matching_delay4.mat')';
S30m = load('total_matching_delay6.mat')';
S40m = load('total_matching_delay8.mat')';
S50m = load('total_matching_delay10.mat')';
% Create tables for each set of matching data
T10m = table(repmat(categorical({'Matching Algorithm'}),100,1),repmat(categorical(100),100,1),S10m.dtotal_concat','VariableNames',{'Type','Length','Value'});
T20m = table(repmat(categorical({'Matching Algorithm'}),200,1),repmat(categorical(200),200,1),S20m.dtotal_concat','VariableNames',{'Type','Length','Value'});
T30m = table(repmat(categorical({'Matching Algorithm'}),300,1),repmat(categorical(300),300,1),S30m.dtotal_concat','VariableNames',{'Type','Length','Value'});
T40m = table(repmat(categorical({'Matching Algorithm'}),400,1),repmat(categorical(400),400,1),S40m.dtotal_concat','VariableNames',{'Type','Length','Value'});
T50m = table(repmat(categorical({'Matching Algorithm'}),500,1),repmat(categorical(500),500,1),S50m.dtotal_concat','VariableNames',{'Type','Length','Value'});
% Load the joint data into structures
S10j = load('total_joint_delay2.mat')';
S20j = load('total_joint_delay4.mat')';
S30j = load('total_joint_delay6.mat')';
S40j = load('total_joint_delay8.mat')';
S50j = load('total_joint_delay10.mat')';
% Create tables for each set of joint data
T10j = table(repmat(categorical({'Joint Algorithm'}),100,1),repmat(categorical(100),100,1),S10j.dtotal_concat','VariableNames',{'Type','Length','Value'});
T20j = table(repmat(categorical({'Joint Algorithm'}),200,1),repmat(categorical(200),200,1),S20j.dtotal_concat','VariableNames',{'Type','Length','Value'});
T30j = table(repmat(categorical({'Joint Algorithm'}),300,1),repmat(categorical(300),300,1),S30j.dtotal_concat','VariableNames',{'Type','Length','Value'});
T40j = table(repmat(categorical({'Joint Algorithm'}),400,1),repmat(categorical(400),400,1),S40j.dtotal_concat','VariableNames',{'Type','Length','Value'});
T50j = table(repmat(categorical({'Joint Algorithm'}),500,1),repmat(categorical(500),500,1),S50j.dtotal_concat','VariableNames',{'Type','Length','Value'});
% Concatenate all the data into one table
tbl = [T10m; T20m; T30m; T40m; T50m; T10j; T20j; T30j; T40j; T50j];
% Make the box plot
figure
boxchart(tbl.Length,tbl.Value,'GroupByColor',tbl.Type);
xlabel('Number of TNs')
ylabel('Total sum delay')
legend('Location','NorthWest')
Can you please help? Thank you.
After the figure is created, set the limits on the Y axis:
set(gca,"YLim",[0 5])
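If you would rather not hard-code the upper limit, one possible approach is to derive it from the data. The sketch below uses made-up values (value and group are hypothetical names, not from the uploaded files), assumes the values are positive, and pools all values when flagging outliers, whereas boxchart flags them per box, so the result is only a rough guide.

```matlab
% Hypothetical data: mostly small delays plus two extreme values
value = [rand(50,1); 20; 35];
group = categorical(repmat("A",numel(value),1));

figure
boxchart(group,value)

% Use the largest value that is not flagged as an outlier as the upper limit
isOut = isoutlier(value,'quartiles');
ylim([0 max(value(~isOut))])
```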
I mentioned that there was a more compact way of writing this code (rather than using dynamically named variables). It's also more robust against typos and other errors.
It could be made even more "automatic" by detecting the names of the files in the directory.
% Initialize empty table
var_names_types = [["Type", "categorical"]; ...
                   ["Length", "categorical"]; ...
                   ["Value", "double"]; ...
                   ];
% Make table using fieldnames & value types from above
tbl = table('Size',[0,size(var_names_types,1)],...
    'VariableNames', var_names_types(:,1),...
    'VariableTypes', var_names_types(:,2));
% Loop over the data files
for n = [2 4 6 8 10]
    % Load "matching" data into table
    S = load(sprintf('total_matching_delay%d.mat',n));
    TM = table(repmat({'Matching'},numel(S.dtotal_concat),1), ...
        repmat(categorical(n),numel(S.dtotal_concat),1), ...
        S.dtotal_concat', ...
        'VariableNames',{'Type','Length','Value'});
    % Load "joint" data into table
    S = load(sprintf('total_joint_delay%d.mat',n));
    TJ = table(repmat({'Joint'},numel(S.dtotal_concat),1), ...
        repmat(categorical(n),numel(S.dtotal_concat),1), ...
        S.dtotal_concat', ...
        'VariableNames',{'Type','Length','Value'});
    % Append new data to existing table
    tbl = [tbl; TM; TJ];
end
% Make the box plot
figure
boxchart(tbl.Length,tbl.Value,'GroupByColor',tbl.Type)
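xlabel('Length') % the x-axis categories are the data set lengths
ylabel('Value') % the y-axis shows the delay values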
legend('Location','NorthWest')
% % Uncomment this line to adjust the y limit, if needed
% set(gca,"YLim",[0 5])
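The file detection mentioned above could be sketched roughly as follows. This assumes the files sit in the current folder and follow the total_matching_delay<number>.mat naming pattern. The variable names files, tok and n_values are my own for illustration.

```matlab
% Sketch: build the list of n values from the file names instead of
% hard-coding [2 4 6 8 10]. Assumes the files are in the current folder.
files = dir('total_matching_delay*.mat');
n_values = [];
for k = 1:numel(files)
    % Capture the digits between the fixed prefix and the .mat extension
    tok = regexp(files(k).name,'^total_matching_delay(\d+)\.mat$','tokens','once');
    if ~isempty(tok)
        n_values(end+1) = str2double(tok{1}); %#ok<AGROW>
    end
end
% File names may come back in text order (10 before 2), so sort numerically
n_values = sort(n_values);
```

The loop header in the code above would then become for n = n_values, with the rest of the loop unchanged.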
Thank you very much for your help. I deeply appreciate it.
My second dataset, which I uploaded through Google Drive, has many outliers. How can I remove them so that they do not appear in the graph?
Thank you very much.
Highly appreciated.
Call boxchart with MarkerStyle set to 'none'. This hides the outlier markers in the chart; it does not remove the points from the data:
boxchart(tbl.Length,tbl.Value,'GroupByColor',tbl.Type,'MarkerStyle','none')
This is described in the boxchart documentation.
Do you know how to change the color of one of those two box plots?
Thank you so much.
Set the BoxFaceColor and MarkerColor properties of that boxchart object.
Here is an example, based on the documentation:
tbl = readtable('TemperatureData.csv');
monthOrder = {'January','February','March','April','May','June','July', ...
'August','September','October','November','December'};
tbl.Month = categorical(tbl.Month,monthOrder);
figure
hb = boxchart(tbl.Month,tbl.TemperatureF,'GroupByColor',tbl.Year); % <--- Note that I assigned the handle hb to the boxchart
ylabel('Temperature (F)')
legend
% Pick a color. (Here I used an RGB value, but there are other options. See
% documentation.)
newColor = [0 1 1];
% Set the box and marker color to the new color
hb(2).BoxFaceColor = newColor;
hb(2).MarkerColor = newColor;
