Calculate longterm daily average

9 views (last 30 days)
Tunechi
Tunechi on 28 Mar 2020
Commented: Tunechi on 16 Apr 2020
I manged to do monthly average like this.
mat = randi([0 14600], 730, 5, 2);
sc1 = (datetime(1981,1,1):datetime(1982,12,31))';
splitmat = num2cell(permute(mat, [1 3 2]), [1 2]);
TT1 = timetable(sc1,splitmat{:});
monthlyaverage = retime(TT1, 'monthly', 'mean');
If I do
monthlyaverage = retime(TT1, 'daily', 'mean');
it will return the same value as the data is daily. However, I am interested to get longterm daily average that means average of day of the year (day 1 ... day 365). The final answer number of rows will be only 365. is there similar approach that is concise like retime to do it?
thanks

Accepted Answer

Adam Danz
Adam Danz on 28 Mar 2020
Edited: Adam Danz on 28 Mar 2020
Create timetable (from OP's code)
mat = randi([0 14600], 730, 5, 2);
sc1 = (datetime(1981,1,1):datetime(1982,12,31))';
splitmat = num2cell(permute(mat, [1 3 2]), [1 2]);
TT1 = timetable(sc1,splitmat{:});
Compute averages for day-of-year using grpstats().
% List all data columns
varList = TT1.Properties.VariableNames(2:end);
% Compute day-of-year and add it to the table
TT1.DOY = day(TT1.sc1, 'dayofyear');
% Compute mean for each day-of-year
% This requires converting the TimeTable to a Table
% This also requires Stats & Machine Learning Toolbox
dailyAvg = grpstats(timetable2table(TT1(:,2:end)),'DOY','mean','DataVars',varList)
First few rows of output
head(dailyAvg)
DOY GroupCount mean_Var2 mean_Var3 mean_Var4 mean_Var5
___ __________ ________________ ________________ ________________ ________________
1 1 2 6272 2320.5 9136.5 10590 5050.5 10498 8312.5 9643
2 2 2 12622 987.5 6968 4419 2484 7055.5 11215 9381.5
3 3 2 6032 1531 3696 1633.5 11304 8133 10159 7516
4 4 2 9188 1711 1905 7979 10044 3948 7382 6632
5 5 2 8311 10974 9007 11422 7507 3849.5 3333 6290.5
6 6 2 14211 4030.5 10858 4806.5 8750 7859 4807 3746
7 7 2 9090.5 5260 5890.5 9528 11296 4518.5 7327.5 10649
8 8 2 7022.5 11904 3766 12161 12543 8624.5 5090 10506
  3 Comments
Adam Danz
Adam Danz on 29 Mar 2020
Two changes:
1) check out the whichstats input to grpstats().
dailyAvg = grpstats(. . .,{'mean','max','min'}, . . .);
2) Using varfun() to to reshape the [n x 2] table columns into [n*2 x 1].
% List all data columns
varList = TT1.Properties.VariableNames(2:end);
% Compute day-of-year and add it to the table
TT1.DOY = repmat(day(TT1.sc1, 'dayofyear'),1,2);
% Convert timetable to table and condense 2 columns of data into 1
T = varfun(@(x)x(:), TT1, 'OutputFormat', 'Table');
% Remove the "Fun_" added to each column header
T.Properties.VariableNames = strrep(T.Properties.VariableNames, 'Fun_', '');
% Compute mean for each day-of-year
% This requires converting the TimeTable to a Table
% This also requires Stats & Machine Learning Toolbox
dailyAvg = grpstats(T,'DOY',{'mean','max','min'},'DataVars',varList);
Tunechi
Tunechi on 30 Mar 2020
Awesome! that will do it all.

Sign in to comment.

More Answers (2)

Peter Perkins
Peter Perkins on 14 Apr 2020
grpstats is one good way to do this if you have Stats&MachineLearning.
But you can do it with varfun or groupsummary as well, using the same strategy as Adam suggests.
>> TT1.DoY = day(TT1.sc1,'dayofyear');
>> T1 = timetable2table(TT1,'ConvertRowTimes',false);
>> T2 = varfun(@mean,T1,'GroupingVariable','DoY');
>> head(T2)
ans =
8×7 table
DoY GroupCount mean_Var1 mean_Var2 mean_Var3 mean_Var4 mean_Var5
___ __________ ________________ ________________ ________________ ________________ ________________
1 2 10438 8102 11394 4847 13280 7349.5 13210 2185.5 5699.5 10844
2 2 9078.5 8809 7330.5 2759.5 11094 5387.5 4578.5 2849.5 8809.5 13464
3 2 8121 7738 3814 5890.5 5404.5 10058 11780 2442 1112 12268
4 2 5612.5 3806 5117 8904.5 6604 7341 3184.5 6626.5 9570 5559.5
5 2 6093.5 5553.5 7815 12010 4659.5 6154 4931.5 9718.5 2887 13360
6 2 1902.5 11517 7212.5 11689 1130.5 13010 9429.5 1615.5 5625 12540
7 2 9129.5 6234 5818.5 6273.5 5322.5 10574 7396 10446 4645.5 2084
8 2 6514.5 6962 8030.5 8349 5101.5 4168.5 4347 8066 7185 9954.5
I'm not sure what "There are two columns for each variable and I want to calculate further mean, max, min for each variable, so that each varaible has one column. " means. Looks like Adam took a guess; here's another version of what you might mean:
>> myStats = @(x) table(min(x),mean(x),max(x),'VariableNames',["Min" "Mean" "Max"]);
>> T3 = varfun(myStats,T1,'GroupingVariable','DoY');
>> T3.Properties.VariableNames(3:end) = T1.Properties.VariableNames(1:end-1);
>> head(T3)
ans =
8×7 table
DoY GroupCount Var1 Var2 Var3 Var4 Var5
Min Mean Max Min Mean Max Min Mean Max Min Mean Max Min Mean Max
___ __________ ____________________________________________________ ___________________________________________________ ____________________________________________________ __________________________________________________ ____________________________________________________
1 2 5652 5728 8205.5 6448 10759 7168 3 1195 5312.5 4397.5 10622 7600 4796 3623 8123.5 6121 11451 8619 2844 7602 5682 8898 8520 10194 10601 6939 11759 7606 12917 8273
2 2 2653 1579 6510 7762.5 10367 13946 8684 12151 9920 12770 11156 13390 1842 4333 5323.5 6877.5 8805 9422 6809 387 7563.5 6334.5 8318 12282 8473 6773 11304 10028 14136 13283
3 2 11129 1613 12238 7777 13348 13941 2305 8224 8213 10870 14121 13516 3820 6160 6309 8872 8798 11584 6995 5266 8761.5 5270.5 10528 5275 956 8014 2138.5 10728 3321 13442
4 2 10051 157 11331 3859 12611 7561 9944 9023 12086 9775 14227 10527 4330 1056 8877.5 5038 13425 9020 3752 732 5752.5 6905.5 7753 13079 10472 7690 11450 9158.5 12427 10627
5 2 7081 3286 10387 5216 13693 7146 5511 2060 7554 5440.5 9597 8821 4009 2059 8771 4367.5 13533 6676 5714 5455 5938 6779 6162 8103 1651 984 7647 5459 13643 9934
6 2 6493 2541 7784 8212 9075 13883 6355 1816 7134.5 3316 7914 4816 10561 9122 11506 9829.5 12451 10537 7503 1181 9165 7584.5 10827 13988 6184 2750 7839.5 6737 9495 10724
7 2 3763 2671 7344 8058.5 10925 13446 5067 9451 5189.5 11540 5312 13630 11140 7072 11671 7354 12202 7636 5815 1175 8379.5 3274.5 10944 5374 5863 1981 6072.5 5714 6282 9447
8 2 321 602 5984 5098 11647 9594 5128 1387 9528 6335.5 13928 11284 84 4613 1138 9348 2192 14083 3886 5994 4033.5 7790 4181 9586 2217 6212 5625 8895 9033 11578
That's a table that contains 5 tables, each of which contains three variables, each of which itself has two columns. Or maybe you don't want nested tables. Here's a table with a separate variable for each stat on original variable.
>> T4 = splitvars(T3);
>> head(T4)
ans =
8×17 table
DoY GroupCount Var1_Min Var1_Mean Var1_Max Var2_Min Var2_Mean Var2_Max Var3_Min Var3_Mean Var3_Max Var4_Min Var4_Mean Var4_Max Var5_Min Var5_Mean Var5_Max
___ __________ ______________ ________________ ______________ _____________ ________________ ______________ ______________ ________________ ______________ ____________ ________________ ______________ ______________ ________________ ______________
1 2 5652 5728 8205.5 6448 10759 7168 3 1195 5312.5 4397.5 10622 7600 4796 3623 8123.5 6121 11451 8619 2844 7602 5682 8898 8520 10194 10601 6939 11759 7606 12917 8273
2 2 2653 1579 6510 7762.5 10367 13946 8684 12151 9920 12770 11156 13390 1842 4333 5323.5 6877.5 8805 9422 6809 387 7563.5 6334.5 8318 12282 8473 6773 11304 10028 14136 13283
3 2 11129 1613 12238 7777 13348 13941 2305 8224 8213 10870 14121 13516 3820 6160 6309 8872 8798 11584 6995 5266 8761.5 5270.5 10528 5275 956 8014 2138.5 10728 3321 13442
4 2 10051 157 11331 3859 12611 7561 9944 9023 12086 9775 14227 10527 4330 1056 8877.5 5038 13425 9020 3752 732 5752.5 6905.5 7753 13079 10472 7690 11450 9158.5 12427 10627
5 2 7081 3286 10387 5216 13693 7146 5511 2060 7554 5440.5 9597 8821 4009 2059 8771 4367.5 13533 6676 5714 5455 5938 6779 6162 8103 1651 984 7647 5459 13643 9934
6 2 6493 2541 7784 8212 9075 13883 6355 1816 7134.5 3316 7914 4816 10561 9122 11506 9829.5 12451 10537 7503 1181 9165 7584.5 10827 13988 6184 2750 7839.5 6737 9495 10724
7 2 3763 2671 7344 8058.5 10925 13446 5067 9451 5189.5 11540 5312 13630 11140 7072 11671 7354 12202 7636 5815 1175 8379.5 3274.5 10944 5374 5863 1981 6072.5 5714 6282 9447
8 2 321 602 5984 5098 11647 9594 5128 1387 9528 6335.5 13928 11284 84 4613 1138 9348 2192 14083 3886 5994 4033.5 7790 4181 9586 2217 6212 5625 8895 9033 11578
Or maybe a separate variable in the table for each stat on each column of each original variable, but this seems rather cumbersome.
>> T5 = splitvars(T4);
>> head(T5)
ans =
8×32 table
DoY GroupCount Var1_Min_1 Var1_Min_2 Var1_Mean_1 Var1_Mean_2 Var1_Max_1 Var1_Max_2 Var2_Min_1 Var2_Min_2 Var2_Mean_1 Var2_Mean_2 Var2_Max_1 Var2_Max_2 Var3_Min_1 Var3_Min_2 Var3_Mean_1 Var3_Mean_2 Var3_Max_1 Var3_Max_2 Var4_Min_1 Var4_Min_2 Var4_Mean_1 Var4_Mean_2 Var4_Max_1 Var4_Max_2 Var5_Min_1 Var5_Min_2 Var5_Mean_1 Var5_Mean_2 Var5_Max_1 Var5_Max_2
___ __________ __________ __________ ___________ ___________ __________ __________ __________ __________ ___________ ___________ __________ __________ __________ __________ ___________ ___________ __________ __________ __________ __________ ___________ ___________ __________ __________ __________ __________ ___________ ___________ __________ __________
1 2 5652 5728 8205.5 6448 10759 7168 3 1195 5312.5 4397.5 10622 7600 4796 3623 8123.5 6121 11451 8619 2844 7602 5682 8898 8520 10194 10601 6939 11759 7606 12917 8273
2 2 2653 1579 6510 7762.5 10367 13946 8684 12151 9920 12770 11156 13390 1842 4333 5323.5 6877.5 8805 9422 6809 387 7563.5 6334.5 8318 12282 8473 6773 11304 10028 14136 13283
3 2 11129 1613 12238 7777 13348 13941 2305 8224 8213 10870 14121 13516 3820 6160 6309 8872 8798 11584 6995 5266 8761.5 5270.5 10528 5275 956 8014 2138.5 10728 3321 13442
4 2 10051 157 11331 3859 12611 7561 9944 9023 12086 9775 14227 10527 4330 1056 8877.5 5038 13425 9020 3752 732 5752.5 6905.5 7753 13079 10472 7690 11450 9158.5 12427 10627
5 2 7081 3286 10387 5216 13693 7146 5511 2060 7554 5440.5 9597 8821 4009 2059 8771 4367.5 13533 6676 5714 5455 5938 6779 6162 8103 1651 984 7647 5459 13643 9934
6 2 6493 2541 7784 8212 9075 13883 6355 1816 7134.5 3316 7914 4816 10561 9122 11506 9829.5 12451 10537 7503 1181 9165 7584.5 10827 13988 6184 2750 7839.5 6737 9495 10724
7 2 3763 2671 7344 8058.5 10925 13446 5067 9451 5189.5 11540 5312 13630 11140 7072 11671 7354 12202 7636 5815 1175 8379.5 3274.5 10944 5374 5863 1981 6072.5 5714 6282 9447
8 2 321 602 5984 5098 11647 9594 5128 1387 9528 6335.5 13928 11284 84 4613 1138 9348 2192 14083 3886 5994 4033.5 7790 4181 9586 2217 6212 5625 8895 9033 11578
  3 Comments
Adam Danz
Adam Danz on 15 Apr 2020
@Tunechi , another approach is to use splitvars() to put each sub-column into its own column and then vertically concatenate the paired columns. I think that approach is a little more tricky than the one I suggested but I just wanted you to be aware of that function since you're dealing with sub-column tables.

Sign in to comment.


Ameer Hamza
Ameer Hamza on 28 Mar 2020
Following code shows one of the way.
mat = randi([0 14600], 730, 5, 2);
sc1 = (datetime(1981,1,1):datetime(1982,12,31))';
splitmat = num2cell(permute(mat, [1 3 2]), [1 2]);
TT1 = timetable(sc1,splitmat{:});
monthlyaverage = retime(TT1, 'monthly', 'mean');
[~,m,d] = ymd(TT1.sc1);
values = TT1.Variables;
values_ = zeros(365, size(values,2));
for i=1:size(values, 2)
values_(:,i) = accumarray(findgroups(m,d), values(:,i), [], @mean);
end
values_ = mat2cell(values_, 365, 2*ones(size(TT1,2),1));
date_new = datetime(1981,1,1):datetime(1981,12,31);
TT_new = timetable(date_new', values_{:});

Categories

Find more on Tables in Help Center and File Exchange

Tags

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!