Finding column values for each unique combination of two other columnar values in a table
I have a large table with three columns, X, Y and Z.
I want to find the unique values of Z for each unique combination of X and Y. All three columns contain non-unique values. If multiple Z values exist for a unique (X,Y) combination, then I want the minimum Z value.
My thought on getting this was to use:
unique_X = unique(T.X);
T(~ismember(T.X, unique_X(i)),:) = [];  % inside a loop over i: keep only the rows where X equals unique_X(i)
I repeat this in a loop for X and Y individually, resetting the variable on each iteration, but I think there should be a much easier way to do this. Can someone help me with this?
Answers (1)
dpb
on 19 Oct 2022
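% build a dataset: 30 rows of random integers in 1..1000
XYZ=randi(1000,30,3);
% sorted unique values of each column, one cell per column
U=arrayfun(@(i)unique(XYZ(:,i)),1:3,'UniformOutput',false);
% truncate each column to the shortest unique list so they can be concatenated
N=min(cellfun(@numel,U));
tXYZ=array2table(cell2mat(cellfun(@(v)v(1:N),U,'UniformOutput',false)),'VariableNames',{'X','Y','Z'});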
% the engine: group by (X,Y) and apply max to the remaining variable, Z
tUbyGroup=groupsummary(tXYZ,{'X','Y'},@max);
tUbyGroup.Properties.VariableNames(end)={'Max_Z'};  % rename the summary column
head(tUbyGroup)
So far every value within a column is unique, so each (X,Y) group holds exactly one row. Check that it also works when an (X,Y) combination is duplicated...
tXYZ.X(3)=tXYZ.X(2); % copy row 2's X into row 3
tXYZ.Y(3)=tXYZ.Y(2); % and its Y: now there are two rows in that group...
tUbyGroup=groupsummary(tXYZ,{'X','Y'},@max);
tUbyGroup.Properties.VariableNames(end)={'Max_Z'};
head(tUbyGroup)
And, voilà! The second group now has a count of two, and its Max_Z value is the larger of Z(2:3), the two rows we turned into duplicates.
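The example above summarizes with max, whereas the question asks for the minimum Z per (X,Y) combination. A minimal sketch of that variant follows, assuming the data is in a table with variables named X, Y and Z. The small table T and the name tMin are made up for illustration and are not from the original post. It uses the built-in 'min' method name of groupsummary instead of a function handle, so the output column is named min_Z automatically.

```matlab
% Illustrative data (hypothetical, chosen so the result is easy to check by eye)
X = [1; 1; 1; 2; 2];
Y = [10; 10; 20; 10; 10];
Z = [5; 3; 7; 9; 4];
T = table(X, Y, Z);

% One row per unique (X,Y) combination, with the smallest Z in that group.
% GroupCount reports how many rows fell into each combination.
tMin = groupsummary(T, {'X','Y'}, 'min', 'Z');
disp(tMin)
```

With this data the (1,10) and (2,10) groups each contain two rows, so their min_Z values are the smaller of the two Z entries, while the (1,20) group keeps its single Z value.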