# Identify first and last indices of numbers in an array containing NaNs

Nev Pires on 6 Aug 2020
Commented: Stephen23 on 7 Aug 2020
I have an array containing numbers and NaNs, interspersed with each other:
M = [NaN, 1.3, 4, 5.2, 3, NaN, NaN, NaN, 4.6, 2, 6.2, 3, 2, NaN, 7, 3.2, 5, NaN, NaN, NaN, 12.1 ,6.8];
I want to extract the first and last index of each section containing real numbers. In this example it would be:
startIndex = [2, 9, 15, 21];
endIndex = [5, 13, 17, 22];
I know I can find the index of each non-NaN value with:
index = find(~isnan(M));
index = [2, 3, 4, 5, 9, 10, 11, 12, 13, 15, 16, 17, 21, 22];
The first difference of the index array would help me identify the break points:
idx = diff(index);
idx = [1, 1, 1, 4, 1, 1, 1, 1, 2, 1, 1, 4, 1];
I'm now stuck on how to use that to get the data that I want.

Stephen23 on 6 Aug 2020
Your original idea of using diff is exactly the simple and efficient solution that experienced MATLAB users would use:
>> M = [NaN,1.3,4,5.2,3,NaN,NaN,NaN,4.6,2,6.2,3,2,NaN,7,3.2,5,NaN,NaN,NaN,12.1,6.8];
>> D = diff([true;isnan(M(:));true]);
>> B = find(D<0)
B =
2
9
15
21
>> E = find(D>0)-1
E =
5
13
17
22
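Once B and E are known, each run of numbers can be pulled out of M by pairing the two vectors. The following is a minimal sketch of that follow-up step, not part of the original answer; the variable name `runs` is an assumption introduced here for illustration.

```matlab
M = [NaN,1.3,4,5.2,3,NaN,NaN,NaN,4.6,2,6.2,3,2,NaN,7,3.2,5,NaN,NaN,NaN,12.1,6.8];

% Pad with TRUE so that runs touching either end of M are still detected
D = diff([true; isnan(M(:)); true]);
B = find(D < 0);      % first index of each run of numbers
E = find(D > 0) - 1;  % last index of each run of numbers

% "runs" is a hypothetical name: one cell per run of consecutive numbers
runs = arrayfun(@(b, e) M(b:e), B, E, 'UniformOutput', false);
```

Here `runs{2}` holds `M(9:13)`, and `numel(runs)` equals `numel(B)`.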
##### 2 Comments
Stephen23 on 7 Aug 2020
"Could you please explain how this line works?"
Break it down into its constituent pieces:
D = diff([true;isnan(M(:));true]);
                     M(:)           % convert M to column vector (optional)
               isnan(    )          % logical vector of NaN locations
         [true;           ;true]    % concatenate TRUE to ensure edge-cases are detected
    diff(                       )   % differences between adjacent logical values
A simple example should help too:
>> M = [1;2;NaN;3;NaN];
>> D = diff([true;isnan(M(:));true])
D =
-1
0
1
-1
1
0
Note that the start of the first number sequence would not be detected without the concatenated true values.
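To see why the padding matters, the same example can be run with and without the concatenated true values. This is an illustrative sketch; the names `Dpad`, `Draw`, `startsPad`, and `startsRaw` are assumptions introduced here.

```matlab
M = [1;2;NaN;3;NaN];

% With padding: the leading TRUE creates a 1 -> 0 transition at index 1
Dpad = diff([true; isnan(M(:)); true]);
startsPad = find(Dpad < 0);       % run starts, directly usable as indices into M

% Without padding: the result is one element shorter than M, so shift by 1
Draw = diff(isnan(M(:)));
startsRaw = find(Draw < 0) + 1;   % misses any run that begins at M(1)
```

With this M, `startsPad` contains 1 and 4, while `startsRaw` contains only 4, so the run `M(1:2)` is lost.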

Sudheer Bhimireddy on 6 Aug 2020
Try this:
% Instead of diff, loop through the indices and group consecutive values into runs
% (note: despite the variable name, these are the indices of the non-NaN values)
your_nan_indices = [2, 3, 4, 5, 9, 10, 11, 12, 13, 15, 16, 17, 21, 22];
nInd = numel(your_nan_indices);
start_ind = your_nan_indices(1); % the first run starts at the first index
end_ind = [];
for i = 1:nInd-1
    if your_nan_indices(i) ~= your_nan_indices(i+1)-1 % gap: one run ends, the next begins
        end_ind(end+1) = your_nan_indices(i);
        start_ind(end+1) = your_nan_indices(i+1);
    end
end
end_ind(end+1) = your_nan_indices(nInd); % the last run ends at the last index
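For comparison, the same grouping can be written without a loop by working on the index vector directly, which is close to the diff(index) idea from the question. This is a minimal sketch, assuming the index vector is a non-empty, sorted row vector; the names `idx`, `brk`, `startIndex`, and `endIndex` are chosen here for illustration.

```matlab
idx = [2, 3, 4, 5, 9, 10, 11, 12, 13, 15, 16, 17, 21, 22]; % indices of non-NaN values

% A jump larger than 1 between neighbours marks the end of one run
brk = find(diff(idx) > 1);

startIndex = idx([1, brk + 1]);      % first index, plus the index after each jump
endIndex   = idx([brk, numel(idx)]); % the index before each jump, plus the last index
```

For this input, `startIndex` is `[2, 9, 15, 21]` and `endIndex` is `[5, 13, 17, 22]`.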
Stephen23 on 6 Aug 2020
Edited: Stephen23 on 6 Aug 2020
"is there any advantage of using numel vs length?"
length changes the dimension that it measures depending on the size of the provided array, which can lead to unexpected bugs. The robust alternatives are to use numel for the total number of elements in an array, and size for the size of any one particular dimension.
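A short illustration of the difference, using arbitrary example matrices (the names `A` and `C` are made up for this sketch):

```matlab
A = zeros(3, 7);
C = zeros(7, 3);

length(A)   % 7: here the second dimension is the longest
length(C)   % 7: here the first dimension is the longest
numel(A)    % 21: total number of elements, independent of shape
size(A, 1)  % 3: always the number of rows
size(C, 1)  % 7: always the number of rows
```

Because `length` gives the same answer for `A` and `C`, code that uses it to count rows or columns can silently break when the array shape changes.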