Daily average of several years' data

JAIME DIEGO RICO on 9 Jul 2020
Edited: jonas on 10 Jul 2020
Hi!
I have SST data from 1981-2019. I'd like to calculate the daily average for each day of the year.
So the value for each day is the average of the SST of the 39 years.
The structure of the data is like this:
sst_date=[day,month,year,sst];
I'm thinking about looping over sst_date, but that gets confusing because months have different numbers of days and some years are leap years.
Is there an easier way?
Thank you

JAIME DIEGO RICO on 9 Jul 2020
1 1 1982 13.4691935954243
2 1 1982 13.4287935963273
3 1 1982 13.4183935965598
4 1 1982 13.9135935854912
5 1 1982 13.9483935847134
6 1 1982 13.9511935846508
7 1 1982 13.9471935847402
8 1 1982 14.0023935835064
9 1 1982 13.9711935842037
10 1 1982 13.8475935869664
madhan ravi on 9 Jul 2020
? It only contains a year
JAIME DIEGO RICO on 9 Jul 2020
Hi, no, the series continues until 2019.
I'd like to create a climatological year, that is, calculate the mean SST for 1 Jan, 2 Jan, 3 Jan, and so on until 31 Dec.

Cris LaPierre on 9 Jul 2020
Edited: Cris LaPierre on 9 Jul 2020
If you can turn your data into a table, use the groupsummary function. Assuming you want to average days from each year, you could do this:
dailyAvg = groupsummary(dataTbl,["monthVar","dayVar"],"mean","sst")
If you actually convert your date into a datetime variable, you could also do this:
groupsummary(dataTbl,"date","dayofyear","mean","sst")
Here's the full code using the snippet of data you provided above.
data = [1 1 1982 13.4691935954243
2 1 1982 13.4287935963273
3 1 1982 13.4183935965598
4 1 1982 13.9135935854912
5 1 1982 13.9483935847134
6 1 1982 13.9511935846508
7 1 1982 13.9471935847402
8 1 1982 14.0023935835064
9 1 1982 13.9711935842037
10 1 1982 13.8475935869664];
dataTbl = table(data);
dataTbl = splitvars(dataTbl,"data","NewVariableNames",["day","month","year","sst"])
% option one - group by month then day.
dailyAvg1 = groupsummary(dataTbl,["month","day"],"mean","sst")
% option two - use the groupbin "dayofyear" on datetime variable "date"
dataTbl.date = datetime(fliplr(data(:,1:3)));
dailyAvg2 = groupsummary(dataTbl,"date","dayofyear","mean","sst")
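
One difference between the two options may matter over a multi-year record: in a leap year every date after 28 Feb has a day-of-year one higher than in other years, so "dayofyear" bins mix neighbouring calendar dates, while grouping by month and day keeps each calendar date separate. A minimal sketch on made-up data (the synthetic sst values and the variable names t, T, byDate and byDoy are assumptions for illustration, not part of the original data):

```matlab
% Synthetic daily series for 1982-1985; 1984 is a leap year. Values are made up.
t = (datetime(1982,1,1):caldays(1):datetime(1985,12,31))';
sst = 14 + 3*sin(2*pi*day(t,'dayofyear')/365.25);
T = table(day(t), month(t), year(t), sst, t, ...
    'VariableNames', {'day','month','year','sst','date'});

% Option one: group by calendar date. 29 Feb gets its own group,
% and it only contains the leap-year sample.
byDate = groupsummary(T, ["month","day"], "mean", "sst");

% Option two: group by day of year. 1 Mar is day 60 in 1982, 1983 and 1985
% but day 61 in 1984, so bin 60 mixes 1 Mar with 29 Feb 1984.
byDoy = groupsummary(T, "date", "dayofyear", "mean", "sst");

% GroupCount shows how many years contributed to each average.
disp(byDate(byDate.month == 2 & byDate.day >= 28, :))
```

Which behaviour is preferable depends on how the climatology is meant to treat 29 Feb, so it is worth checking GroupCount before using either result.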

JAIME DIEGO RICO on 10 Jul 2020
Thank you very much Cris, it really worked fine!!!

jonas on 9 Jul 2020
Edited: jonas on 10 Jul 2020
This will probably work
[~,~,G] = unique([SST(:,1),SST(:,2)],'rows');
out = splitapply(@mean,SST(:,4),G);
You should end up with one value for each day of the year
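
The result above is a bare vector of means, so it can help to keep the group keys as well. A small sketch of the same unique/splitapply idea that also returns the month and day for each average (the five-row SST matrix is made up, and keys and clim are illustrative names):

```matlab
% Made-up sample in the same layout as the question: [day, month, year, sst]
SST = [1 1 1982 10
       2 1 1982 12
       1 1 1983 14
       2 1 1983 16
       1 2 1983 20];

% Group on [month, day] so the unique rows come out in calendar order.
[keys, ~, G] = unique(SST(:, [2 1]), 'rows');

% Average sst within each (month, day) group across all years.
out = splitapply(@mean, SST(:, 4), G);

% One row per calendar date: [month, day, mean sst]
clim = [keys, out];
disp(clim)
```

With a full 1981-2019 record this should give one row per calendar date, including a row for 29 Feb built from the leap years only.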

JAIME DIEGO RICO on 10 Jul 2020
Thank you Jonas, that works fine. It's a similar process, but it only gives 31 days; I'd like to do the same for all 365 days of the year.
jonas on 10 Jul 2020

Kelly Kearney on 9 Jul 2020
You might also take a look at reshapetimeseries.m (part of the Climate Data Toolbox). Setting the 'bin' option to 'date' will reshape data into a year x day-of-year matrix (even if there are days without data), and takes care of the messiness associated with leap days.
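
For readers without the toolbox, the year x day-of-year layout can be sketched in plain MATLAB. This is not reshapetimeseries.m and does not reproduce its interface; it is only a rough illustration of the idea, using made-up data and assumed names (M, row, col, clim). Each calendar date is mapped to its day-of-year in a leap reference year (2000), so 29 Feb always lands in column 60 and stays NaN in non-leap years:

```matlab
% Synthetic daily series for 1982-1985; values are made up.
t = (datetime(1982,1,1):caldays(1):datetime(1985,12,31))';
sst = 14 + 3*sin(2*pi*day(t,'dayofyear')/365.25);

% Row index: position of each sample's year in the sorted list of years.
yrs = unique(year(t));
[~, row] = ismember(year(t), yrs);

% Column index: day-of-year of the same month/day in leap year 2000,
% so every calendar date maps to a fixed column (1..366).
col = day(datetime(2000, month(t), day(t)), 'dayofyear');

% Fill a years x 366 matrix; dates without data remain NaN.
M = nan(numel(yrs), 366);
M(sub2ind(size(M), row, col)) = sst;

% Climatological year: mean over years, skipping missing entries.
clim = mean(M, 1, 'omitnan');
```

Column 60 of M then holds a value only for 1984, and clim(60) is the mean of whatever leap-year samples exist.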

JAIME DIEGO RICO on 10 Jul 2020
I'll do that, thanks.