Finding mode of each row in an array of Strings

Question

Manas on 12 Aug 2024


https://nl.mathworks.com/matlabcentral/answers/2144784-finding-mode-of-each-row-in-an-array-of-strings

Answered: Steven Lord on 14 Aug 2024

Currently I have an array with 3 columns and a lot of rows (about 50,000). Each value is a string. I essentially want to compare the 3 values in each row and find the most common one.

Say my input table looked like the following:

Apple Banana Apple

Cherry Cherry Apple

Mango Mango Mango

My outputs would be

Apple

Cherry

Mango

Please let me know if you have any advice. I have tried mode, but it does not work for strings.


Answer 1

Naga on 12 Aug 2024


Dear Manas,

I understand you have a large array with 3 columns and many rows, where each value is a string. You want to find the most common string in each row and output these values. Here’s how you can do this in MATLAB.

Define the sample data as a cell array.
Use 'arrayfun' to apply the 'mostCommon' function (defined at the end of the script) to each row of the data.
Output the results using disp.

% Sample data
data = {
    'Apple', 'Banana', 'Apple';
    'Cherry', 'Cherry', 'Apple';
    'Mango', 'Mango', 'Mango'
};
% Apply the function to each row and store results
mostCommonValues = arrayfun(@(i) mostCommon(data(i,:)), 1:size(data, 1), 'UniformOutput', false);
% Display the results
disp(mostCommonValues);
% Output:
%     {'Apple'}    {'Cherry'}    {'Mango'}
% Function to find the most common element in a cell array row
function commonValue = mostCommon(cellRow)
    [uniqueElements, ~, idx] = unique(cellRow);
    counts = accumarray(idx, 1);
    [~, maxIdx] = max(counts);
    commonValue = uniqueElements{maxIdx};
end

This approach should cope with large datasets like the 50,000 rows you mentioned, although arrayfun still calls mostCommon once per row.

Please refer to the below documentation to know more about the function 'arrayfun':

https://www.mathworks.com/help/matlab/ref/arrayfun.html

Hope this helps you!

1 Comment

Manas on 14 Aug 2024

This worked really well, but do you know if there is any way to ignore blank cells? For example, if the row is ["Apple", "", ""], it should return Apple.

Answer 2

Steven Lord on 14 Aug 2024


If these strings represent data from one of several values in a category, consider storing the data as a categorical array.

str = ["Apple" "Banana" "Apple"; "Cherry" "Cherry" "Apple"; "Mango" "Mango" "Mango"];
C = categorical(str)
C = 3x3 categorical array
     Apple       Banana      Apple 
     Cherry      Cherry      Apple 
     Mango       Mango       Mango 

What fruits (categories) are present in C?

whichFruits = categories(C)
whichFruits = 4x1 cell array
    {'Apple' }
    {'Banana'}
    {'Cherry'}
    {'Mango' }

Can we ask for the most common category in each row?

M = mode(C, 2)
M = 3x1 categorical array
     Apple 
     Cherry 
     Mango 

Does this work even if there's a missing value in C?

C(2, 2) = missing
C = 3x3 categorical array
     Apple       Banana           Apple 
     Cherry      <undefined>      Apple 
     Mango       Mango            Mango 
mode(C, 2)
ans = 3x1 categorical array
     Apple 
     Apple 
     Mango 

Now in row 2, Apple and Cherry occur equally frequently, but Apple comes first in the list of categories so it's the mode. [Apple (pi) a la mode? ;)]
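For the follow-up question about blank entries such as ["Apple", "", ""], one possible sketch along the same lines is to mark the empty strings as missing before converting, so that mode skips them just as it skipped the undefined element above. The names strBlank, CBlank, MBlank, and result, as well as the sample data, are made up for illustration.

```matlab
% Hypothetical sample data with blank entries (not from the original post)
strBlank = ["Apple" "" ""; "Cherry" "Cherry" "Apple"; "" "Mango" "Mango"];

% Mark empty strings as missing so they become <undefined> in the categorical
strBlank(strBlank == "") = missing;
CBlank = categorical(strBlank);

% Row-wise mode; undefined elements are not counted
MBlank = mode(CBlank, 2);

% Convert back to a string column if a string result is preferred
result = string(MBlank);
disp(result)
```

A row made up entirely of blanks has no defined element to count, so its result should be expected to come back as undefined rather than as a fruit name.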

Can we figure out how many elements of each category are in each row?

[counts, fruit] = histcounts(C(1, :))
counts = 1x4
     2     1     0     0
fruit = 1x4 cell array
    {'Apple'}    {'Banana'}    {'Cherry'}    {'Mango'}

or:

counts = countcats(C(1, :)) % No second output, returns counts in categories() order
counts = 1x4
     2     1     0     0
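To get the counts for every row in one call rather than one row at a time, countcats also accepts a dimension argument. The following is a sketch on the original 3-by-3 data, rebuilt here so that it does not include the missing value assigned above; C0, countsPerRow, and modesFromCounts are illustrative names.

```matlab
% Rebuild the original data without the missing value
str = ["Apple" "Banana" "Apple"; "Cherry" "Cherry" "Apple"; "Mango" "Mango" "Mango"];
C0 = categorical(str);

% One row of counts per row of C0, one column per category (categories() order)
countsPerRow = countcats(C0, 2);

% Index of the largest count in each row; max returns the first index on ties
[~, k] = max(countsPerRow, [], 2);

% Look up the category names and return them as a string column
cats = categories(C0);
modesFromCounts = string(cats(k));
disp(modesFromCounts)
```

Because max picks the first maximum, ties go to the category that comes first in categories(C0), in line with the tie behavior described above. A row with no defined elements would have all-zero counts and would need separate handling, since max would still return index 1.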

Answer 3

Voss on 12 Aug 2024


str = ["Apple" "Banana" "Apple"; "Cherry" "Cherry" "Apple"; "Mango" "Mango" "Mango"]
str = 3x3 string array
    "Apple"     "Banana"    "Apple"
    "Cherry"    "Cherry"    "Apple"
    "Mango"     "Mango"     "Mango"
N = size(str,1);
modes = strings(N,1);
for ii = 1:N
    [u,~,idx] = unique(str(ii,:));
    modes(ii) = u(mode(idx)); % idx indexes the sorted unique values u, not the columns of str
end
disp(modes)
    "Apple"
    "Cherry"
    "Mango"

2 Comments

Manas on 14 Aug 2024

This was helpful and works, but using a for loop is slightly slower, so it was a bit impractical for me. Thanks, though.

Voss on 14 Aug 2024

You're welcome!

arrayfun is also a for loop.
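Because both arrayfun and an explicit for loop visit the rows one at a time, a loop-free alternative may be worth trying on 50,000 rows. A minimal sketch, assuming a string array like str above: map every element to an integer label with a single unique call, then take the numeric mode along each row. The names labels, rowMode, and modesVec are illustrative, and no timing has been measured here.

```matlab
str = ["Apple" "Banana" "Apple"; "Cherry" "Cherry" "Apple"; "Mango" "Mango" "Mango"];

% u holds the sorted unique strings; idx maps each element of str(:) into u
[u, ~, idx] = unique(str);

% Put the integer labels back into the shape of the original array
labels = reshape(idx, size(str));

% Numeric mode along each row (dimension 2)
rowMode = mode(labels, 2);

% Translate the winning labels back into strings
modesVec = u(rowMode);
disp(modesVec)
```

For numeric input, mode returns the smallest of the tied values, so ties here go to the alphabetically first string. Blank strings would be counted like any other value in this sketch, so they would need to be handled separately if they should be ignored.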
