Finding unique values in a table without losing the rest of the data
272 views (last 30 days)
Show older comments
Philipp Henschel
on 30 Nov 2017
Commented: Elizabeth Reese
on 24 Apr 2019
I'm working on a table right now, where the rows are equal except for one column, like this (Departure column has "updated" daytimes):
ID = [153; 153; 153; 153; 124; 124; 124; 211; 211];
Start = {'A';'A';'A';'A';'A';'A';'A';'A';'A'};
End = {'B';'B';'B';'B';'B';'B';'B';'B';'B'};
Departure = datetime({'2014-06-30 12:12'; '2014-06-30 12:15'; '2014-06-30 12:14'; '2014-06-30 12:16'; '2014-06-30 14:32'; '2014-06-30 14:30'; '2014-06-30 14:31'; '2014-06-30 17:31'; '2014-06-30 17:42'});
A = table(ID, Start, End, Departure)
Now I'm trying to use the unique feature to reduce the table, so that I only have each ID once, but without losing the rest of the information of the table, like i would if i just enter
C = unique(A.ID);
This way I receive a 3x1 table/vector and the rest of the information is lost.
The Documentation says for this case (I asume): C = unique(A,vars) returns a dataset that contains only one observation for each unique combination of values for the variables in A specified in vars. vars is a positive integer, a vector of positive integers, a variable name, a cell array containing one or more variable names, or a logical vector. C includes all variables from A. The values in C for the variables not specified in vars are taken from the last occurrence among observations in A with each unique combination of values for the variables specified in vars.
Using this explanation:
C = unique(A, ID);
I received following Error Message: Unrecognized variable name 'Adaptor'.
Please help, if you have any ideas to solve my problem. Thanks
3 Comments
Stephen23
on 30 Nov 2017
"Now I'm trying to use the unique feature to reduce the table, so that I only have each ID once, but without losing the rest of the information of the table, "
Please explain what you expect to happen with the rest of the data. How would you compress >1 rows into just one row?
Accepted Answer
Elizabeth Reese
on 5 Dec 2017
If you want to keep the data from the rows with unique IDs, you can do that using the indexes that unique returns.
In your case,
[C,IA,IC] = unique(A.ID);
B = A(IA,:)
You can specify the setOrder or occurrence in unique to determine which Departure date is selected and the order of the IDs in the new table.
2 Comments
QIAO WANG
on 24 Apr 2019
Hi, thank you so much. I benefit a lot from this answer. Besides, I've got another issue when I play with setOrder and occurrence.
By default, [C,IA,IC] = unique(A.ID); will use 'sorted' and 'first'. However, I want to use 'stable' and 'last' at the same time so that I can get the last occurance of that element in the same order as in the original data. But I got an error, which makes me confused because I assume this is very basic. Sorry, I may explain my question unclearly. I'll show an example here.
>> A
A =
9×4 table
ID Start End Departure
___ _____ ___ ____________________
153 'A' 'B' 30-Jun-2014 12:12:00
153 'A' 'B' 30-Jun-2014 12:15:00
153 'A' 'B' 30-Jun-2014 12:14:00
153 'A' 'B' 30-Jun-2014 12:16:00
124 'A' 'B' 30-Jun-2014 14:32:00
124 'A' 'B' 30-Jun-2014 14:30:00
124 'A' 'B' 30-Jun-2014 14:31:00
211 'A' 'B' 30-Jun-2014 17:31:00
211 'A' 'B' 30-Jun-2014 17:42:00
What I want to obtain is like this (manually made this table). I assumed I could get this table by simply using [C,IA,IC] = unique(A.ID,'stable','last'); B = A(IA,:); However, it didn't work. "You cannot specify 'stable' and 'sorted' with 'first' and 'last'."
B =
3×4 table
ID Start End Departure
___ _____ ___ ___________________
153 'A' 'B' 30-Jun-2014 12:16:00
124 'A' 'B' 30-Jun-2014 14:31:00
211 'A' 'B' 30-Jun-2014 17:42:00
So, can I ask is there any way that can figure this out? I've got a very complicated method which I don't think is efficient.
Elizabeth Reese
on 24 Apr 2019
Please refer to this other MATLAB Answers post regarding "stable" and "last", as specifiying both the "setOrder" and "setOccurence" is not currently supported. There is a workaround on this post and we have recorded the enhancement request for future consideration.
More Answers (1)
Peter Perkins
on 19 Dec 2017
It's pretty hard to tell what you mean by, "without losing the rest of the information of the table", but an alternative to Amy's suggestion would be to use rowfun, with ID as the grouping variable, and somehow combine all the data from each set of rows that share the same ID.
0 Comments
See Also
Categories
Find more on Whos in Help Center and File Exchange
Products
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!