Interpolate NaNs only if there are fewer than 4 consecutive NaNs

Hello,
I have a vector of data points containing some NaNs. I'd like to interpolate the NaNs only if there are 3 or fewer consecutive NaNs, i.e. interpolate over short data gaps but not long ones.
Any ideas would be welcome. Thanks

6 Comments

Please post a minimum working example.
http://www.mathworks.com/matlabcentral/answers/6200-tutorial-how-to-ask-a-question-on-answers-and-get-a-fast-answer
Are you asking for help with the interpolation or identifying short and long sequences of nans?
Oleg: say my vector is:
data = [ 3 4 5 6 NaN 5 4 3 NaN NaN NaN NaN 18 20 21 NaN NaN 24 23]
I want to interpolate across the NaNs when the number of consecutive NaNs is 3 or less.
Daniel: I can use interp1 to interpolate them all by:
data(isnan(data)) = interp1(find(~isnan(data)), data(~isnan(data)), find(isnan(data)), 'linear');
But am wondering if there is a way to discriminate between 'long' and 'short' runs of NaNs.
What I have done so far (since I posted the question) is:
% Find the number of NaNs in each 'gap', in sequence:
notnan = find(~isnan(data(:)));   % column vector of the valid indices
gapsize = diff(notnan) - 1;       % NaNs between neighbouring valid points
gapsize(gapsize == 0) = [];
% Find the starting row of each gap:
nanstart = zeros(1, numel(data) - 1);
for i = 1:numel(data) - 1
    if ~isnan(data(i)) && isnan(data(i+1))
        nanstart(i) = i + 1;
    end
end
clear i
nanstart(nanstart == 0) = [];
% Make a matrix with [size, first row, last row] of the 'big' gaps:
gaps = [gapsize, nanstart', nanstart' + gapsize - 1];
biggaps = gaps(gaps(:,1) > 3, :)
Then I did this:
1) manually made an index of the gaps greater than 3,
2) interpolated the whole of x using interp1 as above,
3) replaced the interpolated numbers across big gaps with NaN on the basis of my newly created index.
I am trying to learn to use MATLAB better, so I am wondering if there is a better way than this.
Please insert additional information by editing the original question instead of adding a comment.
Thanks Jan - seems I need to learn to use the forum as well as matlab!
What if the missing values are marked by two spaces ("  ") instead of NaN? Can this be solved?


 Accepted Answer

Jan on 5 Apr 2012
Edited: Jan on 14 Sep 2013
% Interpolate all NaN blocks at first:
data = [ 3 4 5 6 NaN 5 4 3 NaN NaN NaN NaN 18 20 21 NaN NaN 24 23];
nanData = isnan(data);
index = 1:numel(data);
data2 = data;
data2(nanData) = interp1(index(~nanData), data(~nanData), index(nanData));
[EDITED, Jan, 15-Sep-2013] The following does not work: NaN never compares equal to NaN, so strfind cannot find a block of NaNs.
% Clear the long NaN blocks afterwards:
longBlock = strfind(data, NaN(1, 4)); % Does this work?!
index2 = zeros(1, numel(data));
index2(longBlock) = 1;
index2(longBlock + 3) = -1;
index2 = cumsum(index2);
data2(index2(1:numel(data)) ~= 0) = NaN; % Catch trailing NaNs
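A possible working version of the clearing step, sketched under the assumption that data is a row vector: it locates the NaN runs with diff on the logical mask instead of strfind. The names maxGap, edges, runStart, runEnd and runLen are illustrative and not part of the original answer.

```matlab
data = [3 4 5 6 NaN 5 4 3 NaN NaN NaN NaN 18 20 21 NaN NaN 24 23];
maxGap = 3;                            % longest NaN run that stays interpolated (assumed name)

% Interpolate all NaN blocks first, as in the answer above
nanData = isnan(data);
index   = 1:numel(data);
data2   = data;
data2(nanData) = interp1(index(~nanData), data(~nanData), index(nanData));

% Locate each NaN run from the edges of the zero-padded mask
edges    = diff([0, nanData, 0]);      % +1 where a run starts, -1 just after it ends
runStart = find(edges == 1);
runEnd   = find(edges == -1) - 1;
runLen   = runEnd - runStart + 1;

% Put NaN back into the runs that are longer than maxGap
for k = find(runLen > maxGap)
    data2(runStart(k):runEnd(k)) = NaN;
end
```

With the example vector, only the run at positions 9 to 12 is longer than maxGap, so it stays NaN while the two shorter gaps are filled linearly. Leading or trailing NaNs are left as NaN because interp1 does not extrapolate by default.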
There are some related submissions on the File Exchange (FEX); search there for "inpaintnan".

2 Comments

I'm trying to apply this for an array, where I only interpolate over values that have 4 or less NaNs in a single column. Any idea how to do that?
@Karolina: The original question has been "interpolate if I have 3 or less NaNs". Your question is "interpolate if I have 4 or less NaNs". The required changes to my code are tiny and trivial. Did you try it already?
What did you observe?
longBlock = strfind(data, NaN(1, 4))
This does not work. After the OP accepted the answer, I did not check it again.
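For the matrix case asked about in this comment thread, one possible sketch is to treat each column separately and find the NaN runs with diff rather than strfind. The example matrix A, the threshold maxGap = 4 and all variable names are assumptions made for illustration.

```matlab
% Hypothetical example: column 1 has gaps of 1 and 4 NaNs, column 2 a gap of 5
A = [1 10; NaN NaN; 3 NaN; NaN NaN; NaN NaN; NaN NaN; NaN 60; 8 70];
maxGap = 4;                               % interpolate runs of 4 or fewer NaNs

B    = A;
rows = (1:size(A, 1)).';
for c = 1:size(A, 2)
    col = A(:, c);
    bad = isnan(col);
    if nnz(~bad) < 2
        continue;                         % interp1 needs at least two valid points
    end
    filled = col;
    filled(bad) = interp1(rows(~bad), col(~bad), rows(bad));

    % Find the NaN runs in this column and restore the long ones
    edges    = diff([0; bad; 0]);
    runStart = find(edges == 1);
    runEnd   = find(edges == -1) - 1;
    for k = find((runEnd - runStart + 1) > maxGap).'
        filled(runStart(k):runEnd(k)) = NaN;
    end
    B(:, c) = filled;
end
```

In this example the first column is filled completely, while the 5-element gap in the second column stays NaN.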


More Answers (1)

Okay, here's a fun way to find the long sequences. You could interpolate the entire lot and then set the long sequences back to NaN. I'm using regexp because it's powerful =)
n = reshape(isnan(x), numel(x), 1); % ensure row-vector
[a, b] = regexp( char(n+'A'), 'B{4,}', 'start', 'end' );
This does string matching on sequences of 'B' (NaN) that are 4 characters or longer, and returns their start and end indices into the vectors a and b.
The nice thing about this is you can mess around with the regular expression to detect exactly what you want.
For example, to get only the indices of sequences with 3 or fewer NaNs, including the non-NaN on either side, you would use:
'AB{1,3}A'
What you do with the indices is up to you.
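As one way of using those indices, the sketch below interpolates everything and then sets the long sequences back to NaN. It passes regexp a row char vector, and the example vector x and the names idx and y are assumptions for illustration.

```matlab
x = [3 4 5 6 NaN 5 4 3 NaN NaN NaN NaN 18 20 21 NaN NaN 24 23];

% 'A' marks a valid sample, 'B' marks a NaN; the mask is forced to a row vector
n = isnan(x(:)).';
[a, b] = regexp(char(n + 'A'), 'B{4,}', 'start', 'end');   % runs of 4 or more NaNs

% Interpolate all NaNs, then restore NaN inside the long runs
idx  = 1:numel(x);
y    = x;
y(n) = interp1(idx(~n), x(~n), idx(n));
for k = 1:numel(a)
    y(a(k):b(k)) = NaN;
end
```

For this x the pattern matches once, at positions 9 to 12, so only that block remains NaN in y.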

2 Comments

The reshaping of x can be simplified: "n = x(:)". Although I confuse this frequently, I think that this is a column vector, not a row vector.
The REGEXP method is nice. +1
It should be possible to use "conv(isnan(x), ones(1,4))" also.
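A rough sketch of how the conv idea might be completed, which is not necessarily what was meant here: the first convolution counts NaNs in a sliding window of 4, and a second one spreads each full window back over the elements it covers. The variable names w, c, hit, d and longRun are hypothetical.

```matlab
x = [3 4 5 6 NaN 5 4 3 NaN NaN NaN NaN 18 20 21 NaN NaN 24 23];
w = 4;                                   % window length: runs of w or more count as long

m   = double(isnan(x(:)).');             % 0/1 row vector marking the NaNs
c   = conv(m, ones(1, w));               % c(k) counts the NaNs in x(k-w+1:k)
hit = double(c > w - 0.5);               % 1 where a window full of NaNs ends at k

% Spread each hit back over the w elements of its window
d       = conv(hit, ones(1, w));
longRun = d(w:w + numel(x) - 1) > 0;     % true for every element of a long NaN run
```

After interpolating all NaNs, the long runs could then be restored with y(longRun) = NaN, where y is the interpolated vector.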
Oh yeah, I didn't think of just doing:
n = isnan(x(:));
I thought at the time: "okay I want isnan(x)(:) but I can't do that!"
Duhhh. =)


Asked:

on 4 Apr 2012
