MATLAB Answers

Osnofa
0

how to Interpolate hourly data

Asked by Osnofa
on 28 Mar 2019
Latest activity Commented on by Osnofa
on 2 Apr 2019
Hello everyone,
I have a large hourly data set (~15 years of data) for several measuring points. for the purpose I'll add an example data file with 4 locations.
the format is [year month day hour data1 data2 data3 data4]
Missing records are registered as NaN.
I need to interpolate if the amount of missing records is less or equal to 3 (for each day). If the amount of missing records is greater than 3 (for each day) then the day remains untouched.
What would be the best way to do this? I'm stuck on this, probably is not that hard but I'm not a regular matlab user and therefore that might be the reason I'm struggling to find a way to do this.
thank you for the attention.

  0 Comments

Sign in to comment.

2 Answers

Answer by Akira Agata
on 1 Apr 2019
 Accepted Answer

Hmm, interesting problem.
I think the following is one possible straight-forward solution.
% Read data file
D = dlmread('test.txt');
Time = datetime(D(:,1),D(:,2),D(:,3)) + hours(D(:,4));
% Convert to timetable
T = [table(Time),...
array2table(D(:,5:end),'VariableNames',{'Data1','Data2','Data3','Data4'})];
TT = table2timetable(T);
% Num of NaN for each day, each column
TT2 = retime(TT,'daily',@(x) nnz(isnan(x)));
% Apply linear interpolation if num of NaN is (0,3] per day.
for nDay = 1:height(TT2)
for nVar = 1:4
if TT2{nDay,nVar} > 0 && TT2{nDay,nVar} <= 3
idx = isbetween(TT.Time,TT2.Time(nDay),TT2.Time(nDay)+days(1));
TTseg = fillmissing(TT(idx,nVar),'linear');
TT(idx,nVar) = TTseg;
end
end
end

  3 Comments

Akira Agata
on 1 Apr 2019
By the way, in my code, I have assumed 'hour = -0.5' as a 'minus 30 minutes'. So, (Y,M,D,H) = (2001, 1, 2, -0.5) was treated as "2001/1/1 23:30", for example.
Osnofa
on 1 Apr 2019
Yes, the timestamp has that format. I'll take a look to your solution later in the night.
thanks in advance.
Osnofa
on 2 Apr 2019
It works fine, thanks for the help!

Sign in to comment.


Answer by Andrei Bobrov
on 1 Apr 2019
Edited by Andrei Bobrov
on 1 Apr 2019

T = readtable('test.txt','ReadVariableNames',false,...
'Format','%d %d %d %f %f %f %f %f');
time1 = datetime(T{:,1:3}) + hours(T{:,4});
TT = table2timetable(T(:,5:end),'RowTime',time1);
TT.Properties.VariableNames = sprintfc('data%d',1:4);
TT_out = varfun(@fun,TT);
function out = fun(x)
bw = isnan(x);
N = accumarray(bwlabel(bw)+1,1);
N = N(2:end);
lo = ismember(bw,find(N > 3));
x(lo) = 0;
out = fillmissing(x,'linear');
out(lo) = nan;
end

  0 Comments

Sign in to comment.