Finding and saving identical rows in a matrix

Hi, suppose I have an m-by-n matrix A, e.g.
A=[
3 2 3.5
2 2 3.5
4 2 3.5
2 3 3.5
3 3 3.5
4 3 3.5
2 4 3.5
3 4 3.5
4 4 3.5
3 2 4.5
2 2 4.5
4 2 4.5
2 3 4.5
3 3 4.5
4 3 4.5
2 4 4.5
3 4 4.5
4 4 4.5
3 2 5.5
2 2 5.5
4 2 5.5
2 3 5.5
3 3 5.5
4 3 5.5
2 4 5.5
3 4 5.5
4 4 5.5
12 22 7.5
13 16 8.9];
I also made an image of matrix A. In some rows, the first and second elements are identical to the first and second elements of other rows, for example
[A(1,1) , A(1,2)]==[A(10,1) , A(10,2)]==[A(19,1) , A(19,2)]
These rows are highlighted in orange in the image, and the other sets of rows whose first two columns match are each highlighted in their own color.
The 28th and 29th rows, which match no other row of A in the first two columns, are not highlighted.
I want to save each set of matching rows, with all of their columns, in a separate new matrix.
There are 9 different colors here, so there must be 9 New_A matrices.
Since the 28th and 29th rows match no other rows of A, I want to save them together in a single new matrix, for example a matrix named B.
In the end I want to obtain the matrices shown below:
New_A1=[3 2 3.5;
3 2 4.5;
3 2 5.5];
New_A2=[2 2 3.5;
2 2 4.5;
2 2 5.5];
New_A3=[4 2 3.5;
4 2 4.5;
4 2 5.5];
New_A4=[2 3 3.5;
2 3 4.5;
2 3 5.5];
New_A5=[3 3 3.5;
3 3 4.5;
3 3 5.5];
New_A6=[4 3 3.5;
4 3 4.5;
4 3 5.5];
New_A7=[2 4 3.5;
2 4 4.5;
2 4 5.5];
New_A8=[3 4 3.5;
3 4 4.5;
3 4 5.5];
New_A9=[4 4 3.5;
4 4 4.5;
4 4 5.5];
B=[12 22 7.5;
13 16 8.9];
I was wondering if anyone has any idea how to do that? Thank you for your help.
Also, is there any code that tells how many groups of identical rows and how many unmatched rows are in matrix A?
For example, in matrix A there are 9 groups of identical rows and 2 unmatched rows.

Accepted Answer

Cedric on 4 Nov 2017
Alternatively:
[~, ~, ic] = unique( A(:,1:2), 'rows' ) ;
groups = splitapply( @(x){x}, A, ic ) ;
produces
groups =
11×1 cell array
{3×3 double}
{3×3 double}
{3×3 double}
{3×3 double}
{3×3 double}
{3×3 double}
{3×3 double}
{3×3 double}
{3×3 double}
{1×3 double}
{1×3 double}
and then
isAlone = cellfun( @(x) size(x,1), groups ) == 1 ;
merged = vertcat( groups{isAlone} ) ;
groups = groups(~isAlone) ;
where groups is now the cell array of all groups of rows whose first two columns occur more than once in A, and merged stacks all the remaining rows into one matrix.
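For the counting part of the question, the group sizes can be tallied directly from ic. The following is only a minimal sketch and not part of the original answer: the names counts, nRepeatedGroups and nSingleRows are made up for illustration, the matrix is a small subset of the rows of A, and accumarray is used so that the snippet does not depend on splitapply.

```matlab
% Small example matrix (a subset of the rows of A from the question).
A = [3 2 3.5; 2 2 3.5; 3 2 4.5; 2 2 4.5; 12 22 7.5; 13 16 8.9];

% ic(k) is the index of the unique (column 1, column 2) pair of row k.
[~, ~, ic] = unique(A(:,1:2), 'rows');

% counts(g) is the number of rows of A that belong to group g.
counts = accumarray(ic, 1);

nRepeatedGroups = sum(counts >= 2);   % pairs that occur in more than one row
nSingleRows     = sum(counts == 1);   % pairs that occur exactly once

fprintf('%d repeated groups, %d single rows\n', nRepeatedGroups, nSingleRows);
```

With the full matrix A from the question, the same lines should report 9 repeated groups and 2 single rows.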
  7 Comments
Cedric on 4 Nov 2017
Edited: Cedric on 4 Nov 2017
A cell array is an array of cells. Variable groups is a cell array (just one), and it contains nine cells. Each cell contains a numeric array (a matrix).
Block-indexing a cell array (the usual indexing with parentheses) returns cells (or a single cell), not their/its content. Block-indexing a cell array hence returns another cell array (a block of the original cell array).
>> groups(1)
ans =
1×1 cell array
{3×3 double}
>> class( groups(1) )
ans =
'cell'
Usually we need to access cells' content, however. This is done using curly-bracket indexing:
>> groups{1}
ans =
2.0000 2.0000 3.5000
2.0000 2.0000 4.5000
2.0000 2.0000 5.5000
>> class( groups{1} )
ans =
'double'
So groups(1) is cell 1 of the groups cell array, and groups{1} is its content.
The easiest way to work with your data is to keep the cell array. It is already well suited for iterating through groups for example:
for gId = 1 : numel( groups )
    disp( det( groups{gId} )) ;
end
which you could not do if you had saved groups in e.g.
A = groups{1} ;
B = groups{2} ;
...
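Keeping the cell array also makes it possible to apply one operation to every group without writing a loop. The lines below are a small sketch under assumptions: the two groups are typed in by hand from the question's data, and nRows and meanCol3 are illustrative names.

```matlab
% Illustrative cell array with two groups (values taken from the question).
groups = { [2 2 3.5; 2 2 4.5; 2 2 5.5] ; [3 2 3.5; 3 2 4.5; 3 2 5.5] };

% Number of rows in each group, one value per cell.
nRows = cellfun(@(g) size(g, 1), groups);

% Mean of the third column of each group, one value per cell.
meanCol3 = cellfun(@(g) mean(g(:,3)), groups);
```

Both results have one element per group, so they stay aligned with groups however many groups there are. That would be awkward to achieve with separately named variables.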


More Answers (1)

per isakson on 3 Nov 2017
Edited: per isakson on 4 Nov 2017
I have an idea, which assumes that the two leftmost columns contain whole numbers:
>> [C,ia,ic] = unique(A(:,1:2),'rows');
>> A(ic==1,:)
ans =
2.0000 2.0000 3.5000
2.0000 2.0000 4.5000
2.0000 2.0000 5.5000
where ic contains more than one occurrence of 1.
Loop over all values that occur more than once in ic.
The values that occur only once make up B.
Finally, "New_A9" forces me to refer you to TUTORIAL: Why Variables Should Not Be Named Dynamically (eval) (appropriate smiley)
Implementation for releases older than R2015b (see Cedric's answer for newer releases):
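[~,ia,ic] = unique(A(:,1:2),'rows');   % ic maps each row of A to its unique (col 1, col 2) pair
N = histc( ic, 0.5+(0:length(ia)) );   % N(k): number of rows in group k (the last bin is always 0)
N = reshape( N, 1, [] );               % row vector, so that find() below returns a row to loop over
new = cell( 1, sum(N>=2) );            % one cell per group with two or more rows
B = nan( sum(N==1), size(A,2) );       % one row per pair that occurs only once
ix = 0;
for jj = find( N>=2 )
    ix = ix + 1;
    new{ix} = A( ic==jj, : );
end
ix = 0;
for jj = find( N==1 )
    ix = ix + 1;
    B(ix,:) = A( ic==jj, : );
end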
It reproduces the output of your example. Might need more testing.
>> whos new B
  Name      Size      Bytes  Class     Attributes
  B         2x3          48  double
  new       1x9        1656  cell
and
>> new{4}
ans =
3.0000 2.0000 3.5000
3.0000 2.0000 4.5000
3.0000 2.0000 5.5000
>> B
B =
12.0000 22.0000 7.5000
13.0000 16.0000 8.9000
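The remark about whole numbers matters because unique compares values exactly, so two keys that differ only by floating-point round-off land in different groups. If the first two columns can hold computed non-integer values, a tolerance-based grouping with uniquetol (available from R2015a) may be safer. The following is a sketch under assumptions: the keys matrix and the tolerance 1e-10 are made up for illustration and are not from the thread.

```matlab
% Hypothetical keys: 0.1+0.2 and 0.3 differ by round-off only.
keys = [0.1+0.2, 1; 0.3, 1; 0.5, 2];

% Exact comparison treats the first two rows as different keys.
[~, ~, icExact] = unique(keys, 'rows');

% Tolerance-based comparison puts them in the same group.
[~, ~, icTol] = uniquetol(keys, 1e-10, 'ByRows', true);

fprintf('exact: %d groups, with tolerance: %d groups\n', max(icExact), max(icTol));
```

Either index vector can then take the place of ic in the code above.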
  5 Comments
per isakson on 4 Nov 2017
Edited: per isakson on 4 Nov 2017
"change these cell arrays into double" If all the cells contain matrices of the same size you can convert the cell array into a double array. In the example all the cells contain 3x3 matrices.
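As a rough illustration of that conversion (a sketch, with a two-group cell array typed in by hand from the question's data, and G and T as made-up names), the groups can be stacked along the third dimension or concatenated vertically:

```matlab
% Illustrative cell array with two 3x3 groups (values taken from the question).
groups = { [2 2 3.5; 2 2 4.5; 2 2 5.5] ; [3 2 3.5; 3 2 4.5; 3 2 5.5] };

% Stack along the third dimension: G(:,:,k) is group k.
% This requires every cell to hold a matrix of the same size.
G = cat(3, groups{:});      % 3-by-3-by-2 double array

% Stack vertically into one tall matrix.
% This only requires every cell to have the same number of columns.
T = vertcat(groups{:});     % 6-by-3 double array
```

The 3-D form keeps the groups separate and addressable by index, whereas the tall matrix loses the group boundaries unless they are stored separately.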
mr mo on 4 Nov 2017
Edited: mr mo on 4 Nov 2017
Thanks again for your help. I have a question. This code was written for matrix A. Can I use it for a new matrix whose size and elements differ from those of matrix A?

