# How to automatically adjust the y-axis of a boxchart so that outliers are not considered?

as132 on 8 Jan 2023
Commented: as132 on 9 Jan 2023
I have a number of boxcharts, and some have very spread-out outliers. How can I automatically adjust the y-axis of those plots so that only the boxes and the whiskers, but not the outliers, are considered in determining the y-axis limits?
I currently get something like the first image and would like a result like the second image. Thanks for any suggestions!

Adam Danz on 8 Jan 2023
Edited: Adam Danz on 9 Jan 2023
> How can I automatically adjust the y-axis of those plots so that only the boxes and the whiskers, but not the outliers, are considered in determining the y-axis limits?
In boxchart, outliers are defined as values that lie more than 1.5*IQR beyond the box edges, where IQR is the interquartile range. The box edges are the 25th and 75th percentiles of the data (the first and third quartiles). So, the outlier bounds are the 25th percentile minus 1.5*IQR and the 75th percentile plus 1.5*IQR. These are the bounds that will be used to define your y-axis limits.
For each box in the boxchart, these limits are computed as:
```matlab
iqrng = iqr(ydata);
lower = quantile(ydata, 0.25)-1.5*iqrng;
upper = quantile(ydata, 0.75)+1.5*iqrng;
```
The y-axis limits will be the minimum lower value across all boxes and the maximum upper value across all boxes. This can be a bit tricky to compute when you're working with grouped boxes.
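For the simpler ungrouped case, the same rule can be sketched in a few lines. This is only an illustration under assumptions: the matrix `Y` and its injected extreme values are made up, each column of `Y` is treated as one box, and `quantile` is applied column-wise.

```matlab
% Hypothetical data: 3 boxes (one per column) with two extreme values injected
rng(1)
Y = randn(100,3);
Y(5,1)  = 25;
Y(10,3) = -30;

% Quartiles per column: row 1 = 25th percentile, row 2 = 75th percentile
q = quantile(Y, [0.25 0.75]);
iqrng = q(2,:) - q(1,:);

% Outlier bounds for each box
lower = q(1,:) - 1.5*iqrng;
upper = q(2,:) + 1.5*iqrng;

% Plot and restrict the y-axis to the widest pair of bounds
boxchart(Y)
ylim([min(lower), max(upper)])
```

The two injected values still exist in the data; they are simply outside the visible range after the call to `ylim`.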
Here's a demo that creates a boxchart, computes the min and max outlier bound, and sets the y axis limit to the bounds. Don't miss the last section below on "A note on data visualization".
## Create boxchart
All you need from this section is the "h" variable, which is the handle to your boxchart object.
```matlab
% Load the sample data. Assumption: the post omits this step; the variable
% names match the TemperatureData.csv example file that ships with MATLAB.
tbl = readtable('TemperatureData.csv');
monthOrder = {'January','February','March','April','May','June','July', ...
    'August','September','October','November','December'};
tbl.Month = categorical(tbl.Month,monthOrder);
% Inject artificial outliers: double some random rows, negate others
rng(0)
r = unique(randi(565,1,20));
tbl.TemperatureF(r) = 2*tbl.TemperatureF(r);
w = unique(randi(565,1,20));
tbl.TemperatureF(w) = -1*tbl.TemperatureF(w);
% Create boxchart
h = boxchart(tbl.Month,tbl.TemperatureF,'GroupByColor',tbl.Year);
ylabel('Temperature (F)')
```
## Compute limits based on outlier bounds
Replace h with your boxchart object handle.
```matlab
% Loop through each boxchart object
upperbound = [];
lowerbound = [];
for i = 1:numel(h)
    % Compute outlier bounds: box edges +/- (1.5 * IQR)
    groups = findgroups(h(i).XData);
    qtile.lower = splitapply(@(x)quantile(x,0.25),h(i).YData,groups);
    qtile.upper = splitapply(@(x)quantile(x,0.75),h(i).YData,groups);
    % Named iqrng to avoid shadowing the built-in iqr function
    iqrng = qtile.upper - qtile.lower;
    upperbound = [upperbound; qtile.upper + 1.5*iqrng]; %#ok<*AGROW>
    lowerbound = [lowerbound; qtile.lower - 1.5*iqrng];
end
ybound = [min(lowerbound), max(upperbound)];
% Set y axis limit
ylim(ybound)
```
## A note on data visualization
The chart above is misleading because it hides many outliers, making it look as though they do not exist. There are two ways to improve this so that your data visualization more accurately depicts your data.
1. Turn off outliers using set(h, 'MarkerStyle','none'). Note that this is not the same as detecting and removing outliers from your data before plotting. Also note that you'll still need to implement my solution to update the axis limits.
2. Clearly indicate in your accompanying text (for example, the figure caption) that some outliers fall outside the range of the chart.
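Option 2 can be partly automated by counting the points that fall outside the axis limits and reporting that count on the chart. The following is a minimal sketch, not part of the original answer. It assumes made-up data, a fixed y-limit standing in for the limits derived from the outlier bounds, and the axes title as the place for the note.

```matlab
% Hypothetical data: three groups of 100 points with three extreme values injected
rng(2)
x = categorical(repelem(["A";"B";"C"], 100));
y = randn(300,1);
y([1 150 299]) = [20 -15 18];

h = boxchart(x, y);
ylim([-6 6])   % stand-in for the limits derived from the outlier bounds

% Count the data points that lie outside the visible y-range
yl = ylim;
nHidden = 0;
for i = 1:numel(h)
    yi = h(i).YData;
    nHidden = nHidden + nnz(yi < yl(1) | yi > yl(2));
end

% Tell the reader that some points are not shown
title(sprintf('%d outlier(s) beyond the y-axis limits are not shown', nHidden))
```

Using `nnz` keeps the count a scalar even if `YData` is stored as a matrix. The same count could instead go into a caption or an annotation.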
as132 on 9 Jan 2023
Perfect, thank you very much!