treating NaN as a unique value (instead of as a distinct)

Question

antonet on 2 Jul 2012


Direct link to this question

https://nl.mathworks.com/matlabcentral/answers/42561-treating-nan-as-a-unique-value-instead-of-as-a-distinct

Accepted Answer: Kye Taylor


In the documentation for unique (http://www.mathworks.de/help/techdoc/ref/unique.html) there is the following example:

Unique Values in Array Containing NaNs

A = [5 5 NaN NaN];

C = unique(A)

C =

     5   NaN   NaN
unique treats NaN values as distinct.

Is it possible to treat NaN as a single unique value, so as to have

C=5 NaN
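
For context, the behaviour in the documentation example follows from the fact that NaN never compares equal to anything, including itself. A minimal sketch reproducing it (the variable names `nanEqualsNaN` and `nNaN` are illustrative, not from the documentation):

```matlab
A = [5 5 NaN NaN];
C = unique(A);                 % each NaN is kept: 5 NaN NaN
nanEqualsNaN = (NaN == NaN);   % false, so unique cannot merge the NaNs by equality
nNaN = nnz(isnan(C));          % number of NaNs that survived in C
```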

1 Comment

Malcolm Lidierth on 2 Jul 2012

NaNs are treated as "bigger" than +Inf by Arrays.sort() in Java, and by MATLAB unique/sort too by the look of it, so you can just trim the output array.

If MATLAB NaN does not return a constant NaN bit pattern (it probably does), java.lang.Double.NaN will do. But NaNs are NaNs so each is treated as unique even if the bit pattern is the same.
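
The trimming idea can be sketched as follows, assuming (as observed above) that unique places all NaNs at the end of its sorted output. The variable `k` is an illustrative name:

```matlab
A = [7 NaN -Inf Inf NaN 7];
C = unique(A);                      % sorted output, NaNs expected at the end
k = find(isnan(C), 1, 'first');     % index of the first NaN, if there is one
if ~isempty(k)
    C = C(1:k);                     % keep everything up to and including that NaN
end
```

With this input the result keeps -Inf, 7 and Inf followed by a single NaN.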


Answer 1

Kye Taylor on 2 Jul 2012


Direct link to this answer

https://nl.mathworks.com/matlabcentral/answers/42561-treating-nan-as-a-unique-value-instead-of-as-a-distinct#answer_52368

Edited: Kye Taylor on 2 Jul 2012


Write this function...

function y = myUnique(x)
  y = unique(x);
  if any(isnan(y))
    y(isnan(y)) = []; % remove all nans
    y(end+1) = NaN; % add the unique one.
  end
end

1 Comment

antonet on 2 Jul 2012

thank you!


Answer 2

Walter Roberson on 2 Jul 2012


Direct link to this answer

https://nl.mathworks.com/matlabcentral/answers/42561-treating-nan-as-a-unique-value-instead-of-as-a-distinct#answer_52371


Stealing ideas and optimizing...

function y = myUnique(x)
  y = unique(x);
  y(isnan(y(1:end-1))) = [];
end

5 Comments

Walter Roberson on 2 Jul 2012

Note to people trying to understand my code: it makes use of an obscure trick. When you index with a logical vector, the vector you are indexing with can be shorter than the vector being indexed. The "missing" logical values are treated as false. By only applying isnan() to the elements up to one before the end of the vector, I prevent the last element of the vector from being tested for NaN, so I am preventing that last element from being deleted. This has the effect of preserving one NaN from being deleted.
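
The short-logical-index behaviour described here can be seen in isolation with a small sketch (the vectors `v` and `mask` are illustrative), followed by the same one-liner applied to the output of unique:

```matlab
v = [10 20 30 40];
mask = [true false];            % shorter than v; elements 3 and 4 count as false
v(mask) = [];                   % deletes only v(1)

y = unique([5 5 NaN NaN 2]);    % 2 5 NaN NaN
y(isnan(y(1:end-1))) = [];      % the last element is never tested, so one NaN stays
```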

Kye Taylor on 3 Jul 2012

That is cool.


Answer 3

James Tursa on 2 Jul 2012


Direct link to this answer

https://nl.mathworks.com/matlabcentral/answers/42561-treating-nan-as-a-unique-value-instead-of-as-a-distinct#answer_52385

Edited: James Tursa on 3 Jul 2012


Assuming the input A is double class and all the NaN values have the same underlying bit pattern (which seems to be true of the MATLAB functions):

C = typecast(unique(typecast(A,'uint64')),'double');

If you are working with single class variables then:

C = typecast(unique(typecast(A,'uint32')),'single');

The above code has two extra data copies involved. If you don't want to absorb the time/resource penalty of these data copies, you can use my TYPECASTX function from the FEX which returns a shared data copy of the input:

C = typecastx(unique(typecastx(A,'uint64')),'double');

If you are working with single class variables then:

C = typecastx(unique(typecastx(A,'uint32')),'single');

TYPECASTX can be found here:

http://www.mathworks.com/matlabcentral/fileexchange/17476-typecast-and-typecastx-c-mex-functions

---------------------------------

WARNING -- WARNING:

---------------------------------

The above code should not be used because the UNIQUE function does not work with uint64 and int64 class inputs. I am leaving my post here for reference, but do not use the above code. See the discussion in the comments below.
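
Separately from the UNIQUE problem, the bit-pattern assumption stated at the top of this answer can be inspected directly without calling UNIQUE on uint64 data. A minimal sketch, assuming double-class input; the variable names are illustrative, and whether the patterns agree may depend on how each NaN was produced and on the platform:

```matlab
A = [5 NaN 0/0 Inf-Inf];                     % NaNs obtained in different ways
bits = typecast(A(isnan(A)), 'uint64');      % raw 64-bit patterns of the NaNs only
% True only if every NaN in A shares a single bit pattern.
samePattern = isequal(bits, repmat(bits(1), size(bits)));
disp(samePattern)
```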

4 Comments

James Tursa on 3 Jul 2012


FOLLOW-UP:

After stepping through the UNIQUE code, it turns out the result differences are not because of changes in the UNIQUE code itself, but in the underlying double -- uint64 conversions behind the scenes. That is, both the R2010a (and prior) versions and R2010b (and later) versions of UNIQUE do the conversions to/from double inside the code without regard to input class (which seems to be a bug to me in both cases). But the conversion code itself has apparently changed. I get different answers with the following:

format hex
-inf
typecast(ans,'uint64')
double(ans)
uint64(ans)

On R2010a and earlier, a nearby number (likely the result of some rounding scheme in the background) is produced, not the original number. On R2010b and later, the original number is reproduced. It may be that if I had used a different number to start with, the R2010a and earlier versions would reproduce it and the R2010b and later would not. I don't know. I haven't had the time to fully test this out yet, and don't know the extent of the uint64 and int64 arithmetic/conversion changes that were made.

This raises the question, however, of how many other MATLAB functions have conversions to/from double in the background that would render them buggy when used with uint64 or int64 class data.

The above was done on a 32-bit WinXP machine.

James Tursa on 3 Jul 2012

Edited: James Tursa on 3 Jul 2012

FOLLOW-UP #2:

As I suspected, it wasn't hard to come up with uint64 numbers that did not work for R2010b and later. Bottom line is UNIQUE is buggy for uint64 and int64 class inputs in all versions of MATLAB as far as I can tell because of the underlying silent conversion to / from double.
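
Why a silent trip through double is lossy for 64-bit integers can be sketched as follows. This illustrates the general precision limit of double (53 significant bits) on a recent release, not the internals of UNIQUE; the exact rounding may differ on older releases, as discussed above, and the variable names are illustrative:

```matlab
a = bitshift(uint64(1), 53);          % 2^53
b = bitor(a, uint64(1));              % 2^53 + 1, a different uint64 value
collide = (double(a) == double(b));   % both values map to the same double
roundTrip = uint64(double(b));        % converting back does not recover b
backToA = isequal(roundTrip, a);      % the round trip lands on 2^53 instead
```

Two distinct uint64 inputs that collide like this cannot be told apart by any code that compares them as doubles.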


Answer 4

Sean de Wolski on 2 Jul 2012


Direct link to this answer

https://nl.mathworks.com/matlabcentral/answers/42561-treating-nan-as-a-unique-value-instead-of-as-a-distinct#answer_52370


Replace the NaNs with an obscure number that you check to make sure is not present first. This will give you the full functionality of unique.

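function [u, ia, ic] = nanunique(varargin)
  x = varargin{1};
  % Draw a placeholder value that does not already occur in x.
  t = rand;
  while any(x(:)==t)
      t = rand;
  end
  x(isnan(x)) = t;                          % all NaNs become the same placeholder
  [u, ia, ic] = unique(x,varargin{2:end});  % forward any extra options to unique
  u(u==t)=nan;                              % turn the placeholder back into NaN
end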

Then call it with something like:

nanunique([5 5 2 7 nan nan 5])
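
The steps of nanunique, written out inline for that same input, give a sketch of what to expect from the call. One consequence worth noting: the placeholder comes from rand, which lies in (0,1), so it sorts by its numeric value and the single NaN ends up where the placeholder was, not at the end of the output.

```matlab
x0 = [5 5 2 7 NaN NaN 5];
x = x0;
t = rand;                       % placeholder in (0,1)
while any(x(:) == t)            % re-draw on the (unlikely) collision with real data
    t = rand;
end
x(isnan(x)) = t;
[u, ia, ic] = unique(x);
u(u == t) = NaN;                % here u is NaN 2 5 7, because t sorts below 2
rebuilt = u(ic(:).');           % ic maps each input element back to its entry in u
```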

2 Comments

Sean de Wolski on 2 Jul 2012

NOTE: this gives full functionality for double precision numbers. I am not checking for cellstrs or other weird things, but this could be done easily.

antonet on 2 Jul 2012

thank you Sean!
