Select columns from a matrix by threshold value

Hi,
I have some data stored in a 7200*600 matrix (600 could also be 300 or 1000 or something in this range). Somewhere in the middle there are some relevant columns (about 20) that contain higher values than the others. I would like to extract these columns for some calculations (row mean only for these columns etc.). So far I do this by:
M = mean(data);
B = (M > min(M)*1.1); % Threshold of 110 %
data_new = B.*data;
data_new(:,all(data_new == 0))=[]; % Removes columns if the entire column is zero
Then I can calculate the mean like:
data_new_mean = mean(data_new(:,3:end),2); % First two columns are suspect data
This works fine so far, but I'm looking for a way to get the indices and work with the original matrix instead of building a new one. In the next step I have to extract the data in the following columns (if the relevant columns before were 200, 210, 220..., for example, I need 201, 211, 221... in the next step), and that's why the first way isn't convenient anymore. Do you have any ideas?

3 Comments

We need to know that data does not have negative values in it -- or at least that the mean() of each column is definitely positive.
If there are any columns with mean() that is negative, then min(M) is going to be negative, and negative times 1.1 is more negative, which is probably not what you want.
Otherwise, if there are any columns of data that are all-zero, then their mean() is 0, and 0*1.1 is 0, which is probably not what you want.
I am trying to work out if your B.*data can zero out a column "accidentally", knocking out positive values but leaving zeros. I believe the answer to that is NO. But it is possible for an entire non-zero column to be knocked out. For example, if the column were all 50, its mean() would be 50, and that might happen to be the min(). The threshold would then be 55, but 50 < 55 in all positions, so the column could get knocked out.
... A bunch of this is trying to figure out the consequences if there are negative values or 0 already in the data, which is something we are not told.
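The three cases discussed above can be checked with a few lines. This is only a sketch: the column means below are made-up numbers chosen to hit each case, and the variable names are hypothetical.

```matlab
% Hypothetical column means, chosen only to illustrate the edge cases.
M_neg = [-2 5 6];                 % one column mean is negative
thr_neg = min(M_neg) * 1.1;       % about -2.2, below the minimum itself
B_neg = M_neg > thr_neg;          % every column passes, including the minimum

M_zero = [0 5 6];                 % one column is all zeros
thr_zero = min(M_zero) * 1.1;     % 0, so the 110 % margin vanishes
B_zero = M_zero > thr_zero;       % only the all-zero column is rejected

M_pos = [50 52 80];               % all means positive
thr_pos = min(M_pos) * 1.1;       % about 55
B_pos = M_pos > thr_pos;          % the columns with means 50 and 52 are rejected
```

Only the last case behaves like a 110 % threshold, which is why it matters whether the data can contain negative or all-zero columns.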
Hi,
all values are positive. The question is whether it is possible to get the indices of the columns that contain values above the threshold. It should give me something like indices(desiredcolumns) = x, y, z, ... and then calculate the row mean only with these columns, so that I can do this in the next step with x+1, y+1, z+1...
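M = mean(data, 1); % column means
B = min(M)*1.1; % threshold: 110 % of the smallest column mean
mask = any(data >= B, 1); % true for columns with at least one value at or above the threshold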
mask can now be used as a column index.
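As a minimal sketch of working on the original matrix through a mask, assuming the asker's mean-based threshold and a small made-up matrix in place of the real 7200-by-600 data (cols and row_mean are hypothetical names):

```matlab
% Small made-up matrix standing in for the real data set.
data = [1 1 5 1 1;
        1 1 6 1 1;
        1 1 7 1 1];

M = mean(data, 1);              % column means: [1 1 6 1 1]
mask = M > min(M) * 1.1;        % logical row vector, true only for column 3
cols = find(mask);              % numeric column indices of the selected columns

% Row mean over the selected columns, taken directly from the original matrix.
row_mean = mean(data(:, cols), 2);
```

Either mask or cols can index data directly. The numeric form is the one that makes arithmetic on column positions possible.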

 Accepted Answer

Thank you all,
I just did it via
Y = circshift(B,-1);
which also worked fine for me. Then I could use Y instead of B to select the columns x+1...
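For comparison, here is a sketch of the same step using numeric indices, with a made-up mask B and hypothetical names cols and next_cols. Note that for a logical row vector a positive shift moves the true entries to higher column indices (x to x+1), while a shift of -1 marks x-1, so the sign is worth checking against the layout of your own data. circshift also wraps around, whereas the index version can simply drop anything past the last column.

```matlab
% Made-up mask: columns 2 and 5 were selected in the first step.
B = logical([0 1 0 0 1 0]);

cols = find(B);                                 % [2 5]
next_cols = cols + 1;                           % [3 6]
next_cols = next_cols(next_cols <= numel(B));   % drop indices past the last column

% Logical-mask version: shift by +1 along dimension 2 (the columns).
% A true entry in the last column would wrap around to column 1.
Y = circshift(B, 1, 2);                         % [0 0 1 0 0 1]
```

With either form, data(:, next_cols) or data(:, Y) then selects the neighbouring columns from the original matrix.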

More Answers (1)

Matt J on 30 Aug 2021 (edited 30 Aug 2021)
find( any(M > min(M)*1.1 ,1) )

2 Comments

That would find the columns that contain only data less than the threshold; the user wanted to find columns that contain at least one datapoint greater than the threshold.


Release: R2019b
Asked: 29 Aug 2021
Answered: 13 Oct 2021
