Number of occurrences per category per month using table data

1 view (last 30 days)
I am trying to use this table to figure out how many of the subCategory occur within a month time span. The data comes with date times such as below:
'01-Oct-2008'
'02-Oct-2008'
'07-Oct-2008'
'08-Oct-2008'
The time span is 137 months past Oct-2008 (AKA, Present Year)
What I would like to do is write a command that looks at the all the entries in the month of October 2008. Then tells me the number of occurrences that each of the sub-categories have. Once that processes is done, advance to the next month and repeat for the next 136 months. I figure a for loop that can step each month, but the datetime indexing wants the date and month. Perhaps removing the date and just having the month would do it? But how would the for loop advance one month? Below is how the raw table is currently being used in Matlab.
%% Raw Data Table
VariableNames = ["FirmName", "City", "State", "CountryArea", "InspectionEndDate", "ProgramArea", "CFRActNumber", "Category", "SubCategory", "SubsubCategory", "AbsenceofManagementStructure", "IncorrectImplementation", "ShortDescription", "EndofLongDescription", "LongDescription", "Characters"];
VariableTypes = ["categorical", "categorical", "categorical", "categorical", "datetime", "categorical", "categorical", "string", "categorical", "string", "string", "string", "string", "string", "string", "double"];
FullDataSet = table('size',[1,16],'VariableNames', VariableNames, 'VariableTypes',VariableTypes)
%% Categories
c ={'Design Quality','Employee Performance','Employee Training','Management for Quality','Management of Data','Performance Planning Process','Process Management','Quality Assessment','Supplier Quality','Support Services Management'};
SubCategoryValues = categorical (c);
I've tried to turn the table "FullDataSet" into a timeseries table, but I couldn't figure out how to get all the entries for a given month out. The resulting table I'm after looks like this:
%% Desired Table post processing
VariableNames2 = ["Month","Design Quality","Employee Performance","Employee Training","Management for Quality","Management of Data","Performance Planning Process","Process Management","Quality Assessment","Supplier Quality","Support Services Management"];
VariableTypes2 = ["datetime","double","double","double","double","double","double","double","double","double","double"];
InspectionsPerMonth = table('size',[1,11],'VariableNames', VariableNames2, 'VariableTypes',VariableTypes2)
I will then take 'InspectionsPerMonth' and make graphs from it.
I've tried ismember(, dateshift(, and a for loop. Due to the many different types of data in the table(FullDataSet), some of the indexing commands don't work.
Any articles or pointers that can help me would greatly appreciated.

Answers (1)

Seth Furman
Seth Furman on 28 Oct 2020
Check out the groupsummary function.
>> rng default
>> Weather = categorical(["rain","shine"])
Weather =
1×2 categorical array
rain shine
>> Weather = Weather(randi(2,1,62))';
>> t = timetable((datetime('01-Oct-2008'):days(1):datetime('01-Dec-2008'))',Weather);
>> head(t)
ans =
8×1 timetable
Time Weather
___________ _______
01-Oct-2008 shine
02-Oct-2008 shine
03-Oct-2008 rain
04-Oct-2008 shine
05-Oct-2008 shine
06-Oct-2008 rain
07-Oct-2008 rain
08-Oct-2008 shine
>> groupsummary(t,["Time","Weather"],["month","none"])
ans =
5×3 table
month_Time Weather GroupCount
__________ _______ __________
Oct-2008 rain 10
Oct-2008 shine 21
Nov-2008 rain 17
Nov-2008 shine 13
Dec-2008 rain 1

Categories

Find more on Data Distribution Plots in Help Center and File Exchange

Products


Release

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!