How can I delete NaN cell from a cell array?
I have this cell array, how do I get rid of cells that contain NaN?
10 Comments
Fangjun Jiang
on 26 Jun 2020
utilize isnan()
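A minimal sketch of how isnan() might be applied to a cell array, assuming the goal is to drop cells that hold a single numeric NaN. The data and the variable names (c, isNanCell, cClean) are made up for illustration and are not the poster's array_dati.

```matlab
% Hypothetical example data: a cell array mixing numbers, text, and NaN
c = {1, NaN, 'abc', [2 3], NaN, 7};

% A cell counts as "NaN" here only if it holds a single numeric NaN.
% The isnumeric/isscalar checks run first, so isnan only sees numeric scalars.
isNanCell = cellfun(@(x) isnumeric(x) && isscalar(x) && isnan(x), c);

% Keep every cell that is not a scalar NaN
cClean = c(~isNanCell);
```

If the cells hold arrays or structs instead of scalars, the test inside cellfun would need to change (for example, all(isnan(x(:))) for an all-NaN array).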
the cyclist
on 26 Jun 2020
Edited: the cyclist
on 26 Jun 2020
Can you please help us help you, and give a more detailed description of exactly what you want?
For example, I see that
array_dati{1}.X(5000,:) = NaN NaN 2512.35067448876 NaN
Should we get rid of that entire row?
Should we get rid of that row for all other variables in array_dati{1}?
Should we get rid of that row for all other variables for all other elements in array_dati?
Can we assume that if there is a NaN in one variable (in one structure of one cell), all the corresponding variables in all structures of all cells will have a NaN?
James Tursa
on 26 Jun 2020
As usual, it really helps if you post a small example showing input and desired output.
Angela Marino
on 26 Jun 2020
Edited: Angela Marino
on 26 Jun 2020
the cyclist
on 26 Jun 2020
Hm.
It seems that every row of
array_dati{1}.X
has at least one NaN.
Angela Marino
on 26 Jun 2020
Walter Roberson
on 26 Jun 2020
Why are you not removing the NaN values before running kmeans?
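One way this could look, as a sketch on made-up data: drop the incomplete rows first, and keep the logical mask so that cluster labels can be mapped back to the original row positions afterwards. The matrix X, the names complete, Xc, and labels, and the placeholder k are all assumptions for illustration. In releases that provide it, rmmissing(X) should give the same Xc.

```matlab
% Hypothetical data: 6 observations, 2 variables, some values missing
X = [1 2; NaN 3; 4 5; 6 NaN; NaN NaN; 7 8];

% Rows that are complete (no NaN in any column)
complete = ~any(isnan(X), 2);
Xc = X(complete, :);

% Preallocate labels for ALL original rows; incomplete rows stay NaN
labels = nan(size(X, 1), 1);

% After clustering Xc, the result can be written back by position, e.g.
% labels(complete) = kmeans(Xc, k);   % k is a placeholder, needs the toolbox
```

This only helps if enough complete rows remain; it is worth checking size(Xc, 1) before clustering.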
Angela Marino
on 26 Jun 2020
dpb
on 27 Jun 2020
I think you've got a real problem here...
>> array_dati{1}
ans =
struct with fields:
X: [7200×4 double]
Y: [7200×4 double]
lat: [7200×4 double]
lon: [7200×4 double]
per: [7200×4 double]
v: [7200×4 double]
dist: [7200×4 double]
sim: 0
>>
>> sum(any(isnan(array_dati{1}.X),2))
sum(any(isnan(array_dati{1}.Y),2))
sum(any(isnan(array_dati{1}.lat),2))
sum(any(isnan(array_dati{1}.lon),2))
sum(any(isnan(array_dati{1}.per),2))
sum(any(isnan(array_dati{1}.v),2))
sum(any(isnan(array_dati{1}.dist),2))
sum(all(isnan(array_dati{1}.X),2))
sum(all(isnan(array_dati{1}.Y),2))
ans =
7200.00
ans =
7200.00
ans =
7200.00
ans =
7200.00
ans =
7200.00
ans =
7200.00
ans =
7200.00
ans =
2626.00
ans =
2626.00
>>
Every row has at least one NaN, so if you delete an observation whenever one variable in its row is missing, you have nothing left.
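>> sum(isnan(array_dati{1}.Y))
ans =
4521.00 4524.00 4523.00 4525.00
>>
Each column has about 4520 NaN values, which leaves 7200-4520 --> ~2680 finite observations per column. Per the all() counts above, 2626 rows are nothing but NaN, so 7200-2626 = 4574 rows have at least one observation.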
I didn't tabulate how many finite values each row has.
I also don't know off the top of my head what the ramifications are of running kmeans with missing variables, nor how the MATLAB routine handles them.
But it's simple enough to eliminate the all-NaN rows and keep those with at least one finite value:
>> array_dati{1}.X=array_dati{1}.X(any(isfinite(array_dati{1}.X),2),:);
>> array_dati{1}
ans =
struct with fields:
X: [4574×4 double]
Y: [7200×4 double]
lat: [7200×4 double]
lon: [7200×4 double]
per: [7200×4 double]
v: [7200×4 double]
dist: [7200×4 double]
sim: 0
>>
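Note that the command above shrinks only X, so its rows no longer line up with Y, lat, lon, and the other fields. A possible sketch for keeping the fields aligned, assuming the row mask computed from X should apply to every per-observation field (the thread does not confirm this). The struct s is a small made-up stand-in for array_dati{1}.

```matlab
% Hypothetical stand-in for array_dati{1}: same-sized matrices plus a scalar
s.X   = [NaN NaN; 1 NaN; NaN NaN; 2 3];
s.Y   = [NaN NaN; 4 NaN; NaN NaN; 5 6];
s.sim = 0;

nRows = size(s.X, 1);

% Rows where X has at least one finite value
keep = any(isfinite(s.X), 2);

names = fieldnames(s);
for k = 1:numel(names)
    v = s.(names{k});
    % Only filter fields that have one row per observation;
    % scalar fields such as sim are left untouched
    if ismatrix(v) && size(v, 1) == nRows
        s.(names{k}) = v(keep, :);
    end
end
```

Whether the mask should come from X alone or from all fields combined depends on what the data mean.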
Perhaps the choice would be to select the finite observations column by column and ignore the rows? We don't know what any of it is or means, so it's pretty much an "anything goes!" approach.
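A sketch of the column-wise alternative, on made-up data: each column is reduced to its own finite values and stored in a cell, since the columns generally end up with different lengths. The names X, cols, and nFinite are illustrative only.

```matlab
% Hypothetical 4-row, 3-column matrix with scattered NaN
X = [1 NaN 7; NaN NaN 8; 2 5 NaN; 3 NaN 9];

% One cell per column, each holding only that column's finite values
cols = arrayfun(@(j) X(isfinite(X(:, j)), j), 1:size(X, 2), ...
                'UniformOutput', false);

% Number of finite observations kept per column
nFinite = cellfun(@numel, cols);
```

This discards the row pairing between columns, so it only makes sense if each column can be analysed on its own.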
Angela Marino
on 28 Jun 2020
Answers (0)