How to do boxplot for vectors with different sizes

I have four vectors (A1, A2, A3, A4) of different lengths, and I am trying to generate a boxplot for these data:
boxplot([A1, A2, A3, A4], 'notch', 'on', 'color', [0 0 0], 'outliersize',0, 'labels',{'data1', 'data2', 'data3', 'data4'});
but I get an error: Error in boxplot (line 286) [groupIndexByPoint,groupVisibleByPoint,labelIndexByGroup,gLevelsByGroup,...
Does anyone know what is wrong?

Accepted Answer

Brendan Hamm on 11 Aug 2015
You are passing in all of your data as one long row vector: A1,...,A4 are all row vectors, and [A1,A2,A3,A4] concatenates them into a single, longer row vector. boxplot treats different columns of a matrix as different variables, but it treats a vector as many observations from a single variable. That is valid input (although it does not give the result you want), but you then provide 4 labels for a single group, hence the error. For this reason you need to pass in a grouping variable as well, which describes which group each data point came from:
G = [ones(size(A1)) 2*ones(size(A2)) 3*ones(size(A2)) 4*ones(size(A2))];
X = [A1, A2, A3, A4];
boxplot(X,G,'notch','on','colors',[0 0 0],'symbol','','labels'{'data1','data2','data3','data4'});
Note, if you want to get rid of the outliers you want to set the symbol to the empty string as opposed to setting the outliersize to 0.
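The same idea can be sketched without writing out each term by hand, so that the group sizes always match the data. This is only an illustration: the sample vectors below are made up, the variable names data and n are not from the original post, and it assumes all inputs are row vectors and that repelem is available (R2015a or later).

```matlab
% Hypothetical sample data: four row vectors of different lengths
A1 = randn(1,50); A2 = randn(1,80); A3 = randn(1,65); A4 = randn(1,120);

data = {A1, A2, A3, A4};            % collect the vectors in a cell array
n    = cellfun(@numel, data);       % number of observations in each vector
X    = [data{:}];                   % all observations in one long row vector
G    = repelem(1:numel(data), n);   % group index (1..4) for each observation

% X and G must have the same length, otherwise boxplot cannot group the data
assert(numel(X) == numel(G));

boxplot(X, G, 'notch', 'on', 'colors', [0 0 0], 'symbol', '', ...
    'labels', {'data1','data2','data3','data4'});
```

Because the group sizes are taken from the vectors themselves, a copy-and-paste slip in the size(...) terms cannot produce a grouping vector of the wrong length.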
  2 Comments
Karolina on 12 Aug 2015
Edited: Karolina on 12 Aug 2015
Thank you, it works perfectly!
Below is the final code, because I've found two small mistakes in your code.
G = [ones(size(A1)) 2*ones(size(A2)) 3*ones(size(A3)) 4*ones(size(A4))];
X = [A1, A2, A3, A4];
boxplot(X,G,'notch','on','colors',[0 0 0],'symbol','','labels',{'data1','data2','data3','data4'});
Brendan Hamm on 12 Aug 2015
Ah yes! Copy and paste can be a killer. Glad you caught it!


More Answers (1)

Foroogh Hajiseyedjavadi on 16 Jun 2018
Edited: Foroogh Hajiseyedjavadi on 16 Jun 2018
Thanks, @Brendan Hamm, for the help. I had a large dataset and the method you posted worked great; I just needed to work on it a bit to make it more generic for large data. I am sharing this code, based on your help, in case anyone else is dealing with a large dataset like me. Hope it helps! :)
grp = [];  % grouping vector: one group index per observation
for n = 1:45
    % sizes has 45 entries; sizes(n) stores the number of observations in the n-th variable
    grp = vertcat(grp, n*ones(sizes(n),1));
end
figure();
% speeddata is a cell array with 45 columns of data, each of a different size; empty cells are skipped
boxplot(cell2mat(speeddata(~cellfun(@isempty,speeddata))), grp)
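A loop-free variant of the same approach is sketched below. It is an illustration under assumptions, not the poster's code: speeddata is replaced by made-up data and is assumed to be a 1-by-45 cell array of column vectors, and the names keep, cols, n and x are introduced here. The group sizes are read from the cells directly, so a separate sizes variable is not needed.

```matlab
% Hypothetical stand-in for speeddata: 1-by-45 cell array of column vectors,
% some of which may be empty
speeddata = arrayfun(@(k) randn(randi([0 30]),1), 1:45, 'UniformOutput', false);

keep = ~cellfun(@isempty, speeddata);   % logical mask of the non-empty cells
cols = speeddata(keep);                 % keep only the cells that hold data
n    = cellfun(@numel, cols);           % number of observations per kept cell
x    = vertcat(cols{:});                % stack all observations into one column
grp  = repelem(find(keep), n).';        % original cell index for each observation

figure();
boxplot(x, grp)
```

Using find(keep) as the group values keeps each box labelled with the original column index, even when some cells are empty.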
