How to automatically adjust the y-axis of a boxchart so that outliers are not considered?

I have a number of boxcharts, and some have very spread-out outliers. How can I automatically adjust the y-axis of those plots so that only the boxes and the whiskers, but not the outliers, are considered in determining the y-axis?
I have something like the first image and would like a result like the second image.
Thanks for any suggestions!

Accepted Answer

Adam Danz on 8 Jan 2023
Edited: Adam Danz on 9 Jan 2023
> How can I automatically adjust the y-axis of those plots so that only the boxes and the whiskers, but not the outliers, are considered in determining the y-axis?
In boxchart, outliers are values that lie more than 1.5*IQR beyond the box edges, where IQR is the interquartile range. The box edges are the 25th and 75th percentiles (the first and third quartiles) of the data. So the outlier bounds are the 25th percentile minus 1.5*IQR and the 75th percentile plus 1.5*IQR. These are the bounds that will be used to define your y-axis limits.
For each box in the boxchart, these limits are computed as
iqrng = iqr(ydata);
lower = quantile(ydata, 0.25)-1.5*iqrng;
upper = quantile(ydata, 0.75)+1.5*iqrng;
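As a quick check of these formulas, here is a minimal sketch on a small, made-up vector (the data and the isOut variable are illustrative assumptions, not from the question). It shows which values fall outside the bounds and would therefore be drawn as outlier markers.

```matlab
% Toy data (hypothetical): seven typical values and one extreme value
ydata = [1 2 3 4 5 6 7 50];

% Outlier bounds as defined above: box edges -/+ 1.5*IQR
iqrng = iqr(ydata);
lower = quantile(ydata, 0.25) - 1.5*iqrng;
upper = quantile(ydata, 0.75) + 1.5*iqrng;

% Values outside [lower, upper] are the ones boxchart marks as outliers
isOut = ydata < lower | ydata > upper;
fprintf('bounds = [%g, %g], outliers = %s\n', lower, upper, mat2str(ydata(isOut)));
```

For this vector the quartiles are 2.5 and 6.5, so the bounds come out as -3.5 and 12.5, and only the value 50 lies outside them.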
The y-axis limits will be the minimum lower value across all boxes and the maximum upper value across all boxes. This can be a bit tricky to compute when you're working with grouped boxes.
Here's a demo that creates a boxchart, computes the min and max outlier bound, and sets the y axis limit to the bounds. Don't miss the last section below on "A note on data visualization".
Create boxchart
All you need from this section is the "h" variable, which is the handle to your boxchart object.
% Load and prepare data
tbl = readtable('TemperatureData.csv');
monthOrder = {'January','February','March','April','May','June','July', ...
    'August','September','October','November','December'};
tbl.Month = categorical(tbl.Month,monthOrder);
% Add more outliers
r = unique(randi(565,1,20));
tbl.TemperatureF(r) = 2*tbl.TemperatureF(r);
w = unique(randi(565,1,20));
tbl.TemperatureF(w) = -1*tbl.TemperatureF(w);
% Create boxchart
h = boxchart(tbl.Month,tbl.TemperatureF,'GroupByColor',tbl.Year);
ylabel('Temperature (F)')
Compute limits based on outlier bounds
Replace h with your boxchart object handle.
% Loop through each boxchart object
upperbound = [];
lowerbound = [];
for i = 1:numel(h)
    % Compute outlier bounds: box edges +/- (1.5 * IQR)
    groups = findgroups(h(i).XData);
    qtile.lower = splitapply(@(x)quantile(x,0.25),h(i).YData,groups);
    qtile.upper = splitapply(@(x)quantile(x,0.75),h(i).YData,groups);
    iqrng = qtile.upper - qtile.lower; % named iqrng to avoid shadowing the iqr function
    upperbound = [upperbound; qtile.upper + 1.5*iqrng]; %#ok<*AGROW>
    lowerbound = [lowerbound; qtile.lower - 1.5*iqrng];
end
% Smallest lower bound and largest upper bound across all boxes
ybound = [min(lowerbound), max(upperbound)];
% Set y axis limit
ylim(ybound)
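The bounds used here are the outlier bounds, which can extend past the whisker ends. If you would rather have the axis hug the whiskers themselves, one possible variation (not part of the answer above) is to take the most extreme non-outlier value in each box. This is a minimal sketch on made-up data; x, y, inFence, and the 5% padding are all illustrative assumptions.

```matlab
% Toy data (hypothetical): two boxes, each with one extreme value
x = [1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2]';
y = [1 2 3 4 5 6 7 50 11 12 13 14 15 16 17 -40]';

% Keep only the values inside the 1.5*IQR outlier bounds of a vector v
inFence = @(v) v(v >= quantile(v,0.25) - 1.5*iqr(v) & ...
                 v <= quantile(v,0.75) + 1.5*iqr(v));

% Whisker ends = most extreme non-outlier value in each box
groups = findgroups(x);
whiskerLo = splitapply(@(v) min(inFence(v)), y, groups);
whiskerHi = splitapply(@(v) max(inFence(v)), y, groups);

% Limits spanning all whiskers, padded by 5% so the whisker ends stay visible
ybound = [min(whiskerLo), max(whiskerHi)];
pad = 0.05 * diff(ybound);
ybound = ybound + [-pad, pad];
```

Pass the result to ylim(ybound) on the axes that hold the boxchart. With a boxchart handle, x and y would come from h(i).XData and h(i).YData for each object in h.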
A note on data visualization
The chart above is misleading because it hides many outliers, which then appear not to exist. There are two ways to improve this so that your data visualization depicts your data more accurately.
  1. Turn off outliers using set(h, 'MarkerStyle','none'). Note that this is not the same as detecting and removing outliers from your data before plotting. Also note that you'll still need to implement my solution to update the axis limits.
  2. Clearly state in your text that some outliers lie outside the chart's axis limits.
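As a rough illustration of both options, the sketch below hides the outlier markers and reports how many points fall outside the axis limits. The data, the nHidden variable, and the choice of putting the note in the title are assumptions for the example; the two options can also be used separately.

```matlab
% Toy data (hypothetical): seven typical values plus two extreme values
y = [1 2 3 4 5 6 7 50 -30]';
h = boxchart(y);

% Outlier bounds: box edges -/+ 1.5*IQR, used as the y-axis limits
iqrng = iqr(y);
lower = quantile(y, 0.25) - 1.5*iqrng;
upper = quantile(y, 0.75) + 1.5*iqrng;
ylim([lower, upper])

% Option 1: hide the outlier markers
set(h, 'MarkerStyle', 'none')

% Option 2: state how many points are not shown
nHidden = nnz(y < lower | y > upper);
title(sprintf('%d outliers lie outside the axis limits', nHidden))
```

The same count could instead go into a figure caption or the accompanying text if a title is not appropriate.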
