How do I erase string duplicates in one column of a cell array and add the strings' corresponding numbers in the second column?

I am working with a cell array. It consists of two columns. One has a name, and the second has a number that corresponds to/is associated with that name. This is a sample:
Sarah 12
Marie 3
Sam 5
However, there are many duplicate names in the first column, and I want to get rid of the duplicates, add their corresponding numbers, and send both the name and the new corresponding number to another array.
Below, I have code that, given any value of b, will output the truenumber corresponding to Name2.
However, I was wondering:
  1. How do I make this code run automatically for all values of b in the array?
  2. How do I send Name2 and truenumber to a new array?
fid = fopen('array.txt');
array = textscan(fid,'%s %s');
array{:};
fclose(fid);
a = 1
b = 50
while a < 21677
    Name1 = array{1}{a}
    Name2 = array{1}{b}
    if isequal(Name1, Name2) == 1
        X = str2num(array{2}{a})
        Y = str2num(array{2}{b})
        truenumber = X + Y
        display(a)
        display(array{1}{a})
        display('This is the duplicate ^')
    else
        display('No duplicate, no change.')
    end
    a = a + 1
end
  3 Comments
dpb on 1 Aug 2020
It would still be easier if you would attach a .mat file, but we can get by. The sample is pretty weak, though, in that there's only one duplicated entry.


Answers (1)

dpb on 1 Aug 2020
Edited: dpb on 1 Aug 2020
>> c
c =
6×2 cell array
{'Sarah' } {[12]}
{'Marie' } {[ 3]}
{'Sam' } {[ 5]}
{'Rose' } {[ 7]}
{'Sam' } {[ 6]}
{'Edward'} {[ 9]}
>> [u,~,ic]=unique(c(:,1))
u =
5×1 cell array
{'Edward'}
{'Marie' }
{'Rose' }
{'Sam' }
{'Sarah' }
ic =
5
2
4
3
4
1
>> histc(ic,unique(ic))
ans =
1
1
1
2
1
>>
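The first output, u, holds the unique names from the first column, in sorted order. ic has one entry per row of the original and gives the index into u of that row's name, so that u(ic) reproduces c(:,1). The last result counts the members of each group and shows that group 4 ('Sam') is the only one with more than one member.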
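As a side note, the same count can be obtained with accumarray instead of histc, which the MathWorks documentation now lists as not recommended. This is only a minimal sketch on the sample data from the session above; the variable names counts and dupNames are illustrative and not part of the original answer.

```matlab
% Sample data copied from the session above
c = {'Sarah',12; 'Marie',3; 'Sam',5; 'Rose',7; 'Sam',6; 'Edward',9};

% u  : sorted unique names
% ic : for each row of c, the index of its name in u
[u,~,ic] = unique(c(:,1));

% Add 1 for every row that falls into each group of u
counts = accumarray(ic,1);

% Names that occur more than once in the first column
dupNames = u(counts > 1);
```

Here u(ic) rebuilds the original name column, which is a quick way to check what ic means.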
To add up the associated data values, iterate over ic and use it to pick which rows of c(:,2) belong to each unique name.
(A clever one-line alternative to the loop didn't come to me last night, although there probably is one.)
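The loop described above might look something like the following. It is a sketch only, assuming the second column holds numeric scalars as in the sample; total and new are illustrative names.

```matlab
% Sample data copied from the session above
c = {'Sarah',12; 'Marie',3; 'Sam',5; 'Rose',7; 'Sam',6; 'Edward',9};

[u,~,ic] = unique(c(:,1));   % ic(k) is the group (row of u) for row k of c
vals  = cell2mat(c(:,2));    % numeric column taken from the second cell column
total = zeros(numel(u),1);   % one running sum per unique name

for k = 1:numel(ic)
    % Add the value in row k to the sum for that row's name
    total(ic(k)) = total(ic(k)) + vals(k);
end

% Names next to their summed values, as a cell array
new = [u num2cell(total)];
```

With the sample data, the two 'Sam' rows (5 and 6) end up combined in a single row of new.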
ADDENDUM:
I was almost there last night; I just didn't do one thing right -- encapsulate the output of cat() in a cell.
[u,~,ic]=unique(c(:,1)); % optional--use 'stable' to return in original order
new=[u accumarray(ic,[c{:,2}],[],@(v) {cat(1,v)})];
yields
>> new
new =
5×2 cell array
{'Edward'} {[ 9]}
{'Marie' } {[ 3]}
{'Rose' } {[ 7]}
{'Sam' } {2×1 double}
{'Sarah' } {[ 12]}
>>
>> new{4,:}
ans =
'Sam'
ans =
6
5
>>
which shows that it does put the right values where wanted/needed...
NB: The above returns the names in sorted alphabetical order; the 'stable' option on unique would retain the existing order if that were significant.
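The accumarray call above gathers the values for each name into a cell. If a single summed number per name is wanted, as the question asks, accumarray's default behavior of summing each group may be enough. The sketch below assumes both columns arrive as strings, as they would from textscan with '%s %s' in the question; names and numbers are hypothetical stand-ins for array{1} and array{2}, and summed is an illustrative name.

```matlab
% Hypothetical stand-ins for array{1} and array{2} from the question,
% where textscan returns both the names and the numbers as strings
names   = {'Sarah';'Marie';'Sam';'Rose';'Sam';'Edward'};
numbers = {'12';'3';'5';'7';'6';'9'};

vals = str2double(numbers);          % convert the cellstr to a numeric column

% 'stable' keeps the names in order of first appearance instead of sorting
[u,~,ic] = unique(names,'stable');

% With no function handle given, accumarray sums the values in each group
summed = [u num2cell(accumarray(ic,vals))];
```

For the sample this leaves five rows in the original order, with 'Sam' carrying 11 (5 + 6). No explicit loop over row indices is needed.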
  4 Comments
dpb on 2 Aug 2020
Ah well...that's why I made one very trivial and wrote it down on a sticky on the old Mathworks membership card they used lo! so many years ago! :)
I'm pretty sure there's a way to retrieve or reset on an original account, but I don't know offhand exactly how that would be--I can't even recall if the Answers forum uses the same as the login account or can make another just for it...
