Plot Mean over Box Charts using Positional and Color Grouping Variables

I'd like to plot the mean over each box in a box chart that uses positional and color grouping variables.
Example)
tbl = readtable('TemperatureData.csv');
monthOrder = {'January','February','March','April','May','June','July', ...
'August','September','October','November''December'};
tbl.Month = categorical(tbl.Month,monthOrder);
boxchart(tbl.Month,tbl.TemperatureF,'GroupByColor',tbl.Year)
ylabel('Temperature (F)')
legend
hold on
meanTemperatureF = groupsummary(tbl.TemperatureF,{tbl.Month, tbl.Year}, 'mean');
plot(meanmeanTemperatureF,'o')
But the plotted means do not line up with the boxes.
How can I fix it?
  3 Comments
KK on 16 Nov 2021
Sorry, I made a typo in the last line.
Here's the right code.
plot(meanTemperatureF,'o')
The picture below shows the output of this code.
dpb on 16 Nov 2021
Well, you've got twelve months but for some reason only 11 categories on the boxplot axis and there are only 10+7 actual boxes by the time you've done the grouping by year.
Then you just plotted the means for an indeterminate number of months versus their ordinal positions.
You need to find/create the correct categorical variable value for each of those means that is associated with the position of the appropriate box -- and then since you have used grouping, that may still not quite line up.
Never done such an exercise before and suspect highly unlikely anybody on Answers has, either; attach your data as a .mat file so somebody can reproduce the result you have and easily explore how to make the necessary adjustments to axis values.
"Help us help you!"
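One likely cause of the 11 categories noted above is the missing comma between 'November' and 'December' in the question's monthOrder. A minimal sketch of what MATLAB does with that literal (the variable names lastTwo and fixed are made up for illustration):

```matlab
% A doubled quote inside a char literal is an escaped apostrophe, so the
% cell below holds ONE char vector, not two month names.
lastTwo = {'November''December'};
disp(numel(lastTwo))    % 1
disp(lastTwo{1})        % November'December

% With the comma restored, the cell holds two separate names.
fixed = {'November','December'};
disp(numel(fixed))      % 2
```

With the merged name as the eleventh category, rows whose Month is November or December match no category and become undefined, which would explain why those months are missing from the chart.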


Accepted Answer

Dave B on 16 Nov 2021
Edited: Dave B on 16 Nov 2021
Boxchart makes this really difficult! The location of the categories is well defined, but the offset (while easy to calculate) isn't included anywhere. Fortunately, you can plot the numeric equivalent of a categorical, and it's easy to convert.
Some bits of your code didn't quite line up for me (e.g. how you're calling groupsummary), so I used a dataset I happened to have around with very similar data. I plotted the mean on each box; I'm not sure whether you wanted a single mean for each month instead, which would be much easier!
Note that I used the more robust 'ruler2num' to convert month names to their numeric values, but in reality the locations are just the category number, so the month number.
tbl = readtable('natick weather 2003-2014.csv');
tbl.Year=tbl.DATE.Year;
tbl=tbl(ismember(tbl.Year,[2004,2008,2012]),:);
%monthOrder = {'January','February','March','April','May','June','July', ...
% 'August','September','October','November', 'December'};
% alternate move:
monthOrder = month(datetime(2010,1:12,1),'name');
tbl.Month = categorical(month(tbl.DATE,'name') ,monthOrder);
meantemp = groupsummary(tbl,{'Month' 'Year'},'mean','TMAX');
%%
bc=boxchart(tbl.Month,tbl.TMAX,'GroupByColor',tbl.DATE.Year);
ylabel('Temperature (F)')
legend
hold on
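xax=get(gca,'XAxis');
% horizontal offsets of each color group within a category, centered on zero
offset=(1:numel(bc))/numel(bc);
offset=offset-mean(offset);
for i = 1:numel(bc)
    % rows of the summary table that belong to this box chart's year
    ind = string(meantemp.Year)==string(bc(i).DisplayName);
    % numeric category position plus this group's offset
    x=ruler2num(meantemp.Month(ind),xax)+offset(i);
    y=meantemp.mean_TMAX(ind);
    plot(x,y,'x','LineWidth',2,'DisplayName',"mean(" + bc(i).DisplayName + ")",'SeriesIndex',i)
end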
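To make the offset arithmetic above concrete without the data file, here is a small sketch under stated assumptions: nGroups is a hypothetical stand-in for numel(bc), the three-month monthOrder is made up, and double() on a categorical is used in place of ruler2num, relying on the note above that the locations are just the category number.

```matlab
nGroups = 3;                          % hypothetical: stands in for numel(bc)
offset = (1:nGroups)/nGroups;         % [1/3 2/3 1]
offset = offset - mean(offset);       % centered on zero: [-1/3 0 1/3]

% double() of a categorical returns the category index (February is 2 here)
monthOrder = {'January','February','March'};
m = categorical({'February'}, monthOrder);

% x positions of the markers for February, one per color group
xCenters = double(m) + offset;        % 2-1/3, 2, 2+1/3
disp(xCenters)
```

The middle group sits exactly on the category position and the outer groups are shifted by a third of a category to either side, which is the spacing the loop above applies to each year's markers.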
  6 Comments
Dave B on 19 Nov 2021
@dpb -
I definitely get the pain, and I'm certainly frustrated when I can't give people a way to (e.g.) change the fontsize on their heatmap title. I shouldn't have said 'power users', I couldn't think of a better term. Most folks who use MATLAB graphics won't do anything with the objects (the handle part of handle graphics) without getting advice from places like ML Answers...so that's sort of what I was thinking about.
I agree this is an area where we really need to improve. We definitely don't limit our functionality because we're trying to dumb it down or limit what you can do, that's the opposite of what (most) MW developers want. But we do need to find solutions to resolve issues with our big (and growing) ecosystem (and company) and cross-release compatibility, so that we can open up access without it being buggy. These are hard problems but I'm optimistic that we'll get there!
dpb on 19 Nov 2021
Thanks for the feedback...part of my purpose in Answers (besides being entertainment/stimulation after giving up the consulting gig) is that it gives the opportunity to raise these kinds of user pain points.
I know I tend to carp on a lot of details and may take a thread off on a side journey but I always try to make sure the OPs Q? is answered best as can on the way. :)
But, I think having these related types of similar cases raised hopefully will continue to raise the consciousness of the development team -- I know it probably isn't true, but it seems to me as a longer-time user a trend towards releasing features that are not yet really ready and that there is far less consistency across the base product and toolboxes than before. It seems as though there isn't an overall corporate-wide oversight that really enforces syntax rules/documentation to try to maintain that cohesive nature but that the various toolboxes are almost totally separate products.
I understand the difficulties; the shift from purely procedural coding style of the original MATLAB to object-oriented/class-based methods is a major dichotomy and schism to breach. I don't have the answer (so to speak?), but believe there needs to be more effort into the area during the initial design of new functions/features/toolboxes to try to minimize these differences going forward.

Release

R2021a
