## how to Interpolate hourly data

### Osnofa (view profile)

on 28 Mar 2019
Latest activity Commented on by Osnofa

### Osnofa (view profile)

on 23 May 2019
Accepted Answer by Andrei Bobrov

### Andrei Bobrov (view profile)

Hello everyone,
I have a large hourly data set (~15 years of data) for several measuring points. for the purpose I'll add an example data file with 4 locations.
the format is [year month day hour data1 data2 data3 data4]
Missing records are registered as NaN.
I need to interpolate if the amount of missing records is less or equal to 3 (for each day). If the amount of missing records is greater than 3 (for each day) then the day remains untouched.
What would be the best way to do this? I'm stuck on this, probably is not that hard but I'm not a regular matlab user and therefore that might be the reason I'm struggling to find a way to do this.
thank you for the attention.

### Andrei Bobrov (view profile)

Answer by Andrei Bobrov

### Andrei Bobrov (view profile)

on 1 Apr 2019
Edited by Andrei Bobrov

### Andrei Bobrov (view profile)

on 1 Apr 2019

'Format','%d %d %d %f %f %f %f %f');
time1 = datetime(T{:,1:3}) + hours(T{:,4});
TT = table2timetable(T(:,5:end),'RowTime',time1);
TT.Properties.VariableNames = sprintfc('data%d',1:4);
TT_out = varfun(@fun,TT);
function out = fun(x)
bw = isnan(x);
N = accumarray(bwlabel(bw)+1,1);
N = N(2:end);
lo = ismember(bw,find(N > 3));
x(lo) = 0;
out = fillmissing(x,'linear');
out(lo) = nan;
end

Osnofa

### Osnofa (view profile)

on 23 May 2019
Hello,
I endend up using the solution bellow but now I was looking at yours again. Yours is faster, however I have a problem trying to use it. If I have something else after your code I get an error:
"Function definitions in a script must appear at the end of the file.
Move all statements after the "fun" function definition to before the first local function definition."
What should I do in order to use the result for other applications?
thank you.

### Akira Agata (view profile)

Answer by Akira Agata

### Akira Agata (view profile)

on 1 Apr 2019

Hmm, interesting problem.
I think the following is one possible straight-forward solution.
% Read data file
Time = datetime(D(:,1),D(:,2),D(:,3)) + hours(D(:,4));
% Convert to timetable
T = [table(Time),...
array2table(D(:,5:end),'VariableNames',{'Data1','Data2','Data3','Data4'})];
TT = table2timetable(T);
% Num of NaN for each day, each column
TT2 = retime(TT,'daily',@(x) nnz(isnan(x)));
% Apply linear interpolation if num of NaN is (0,3] per day.
for nDay = 1:height(TT2)
for nVar = 1:4
if TT2{nDay,nVar} > 0 && TT2{nDay,nVar} <= 3
idx = isbetween(TT.Time,TT2.Time(nDay),TT2.Time(nDay)+days(1));
TTseg = fillmissing(TT(idx,nVar),'linear');
TT(idx,nVar) = TTseg;
end
end
end

Akira Agata

### Akira Agata (view profile)

on 1 Apr 2019
By the way, in my code, I have assumed 'hour = -0.5' as a 'minus 30 minutes'. So, (Y,M,D,H) = (2001, 1, 2, -0.5) was treated as "2001/1/1 23:30", for example.
Osnofa

### Osnofa (view profile)

on 1 Apr 2019
Yes, the timestamp has that format. I'll take a look to your solution later in the night.