Trouble with spikes.

Paulo Eduardo Beiral on 3 Oct 2019
Commented: Rik on 8 Oct 2019
Hi all.
I have a 2112x2688 matrix called "turb" that contains many outlier values. I need to remove those outliers, and I thought about the following statistical method:
  1. Find the mean and standard deviation of each of the 2688 columns;
  2. Try two alternatives: remove all values in a column that are higher than that column's mean, and remove all values in a column that are higher than that column's standard deviation;
  3. Replace the removed values with NaN.
Using the mean of the matrix as a whole wouldn't work well for my result, because that mean is too low and would remove too many elements of the matrix. That's why using the mean of each column fits better.
If anyone could give me advice on how to start building this kind of script, I would be really thankful!
Regards, Paulo Beiral.

Accepted Answer

Rik on 3 Oct 2019
Here you can see the power of Matlab. What you want can be done in a few short lines of code:
x=rand(2112,2688);
avg=mean(x);%or explicitly: avg=mean(x,1)
%sd=std(x);%or explicitly: sd=std(x,0,1) (the second input is the weight, the third is the dimension)
L= x<=avg;%for R2016b and later
%L=bsxfun(@le,x,avg);%for R2016a and earlier
x(L)=NaN;
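The question asks to drop values above each column's statistic, which is the opposite comparison direction from `x<=avg`. A minimal sketch of that reading follows. The small `turb` matrix, the injected spike, and the multiplier `k` are made up for illustration, and the "mean plus k standard deviations" threshold is an assumed variant of the second alternative, not something stated in the question.

```matlab
% Small synthetic stand-in for the 2112x2688 "turb" matrix (assumption).
rng(0);                            % reproducible example
turb = rand(20,4);
turb(2,3) = 50;                    % inject one obvious spike

avg = mean(turb,1,'omitnan');      % 1x4 row of column means
sd  = std(turb,0,1,'omitnan');     % 1x4 row of column standard deviations

% Alternative 1: set values above their column mean to NaN.
cleanedMean = turb;
cleanedMean(turb > avg) = NaN;     % implicit expansion, R2016b and later
% R2016a and earlier: cleanedMean(bsxfun(@gt,turb,avg)) = NaN;

% Alternative 2 (assumed variant): set values more than k standard
% deviations above their column mean to NaN. k is a hypothetical
% tuning parameter, not a value from the question.
k = 3;
cleanedStd = turb;
cleanedStd(turb > avg + k*sd) = NaN;
```

Thresholding on the mean alone discards roughly the upper half of every column, so the second form is usually the gentler choice when only isolated spikes should go.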
  2 Comments
Paulo Eduardo Beiral on 8 Oct 2019
Edited: Paulo Eduardo Beiral on 8 Oct 2019
Hi Rik, thanks for the answer!
I tried incorporating what you suggested into my script, but it didn't work well, and the image I plotted from the result didn't come out right either.
A20032642003354 through A20182642018455 are the names of the 16 files I use, and they contain the variables "LON, LAT, turb". I believe the mean and std part of my script must operate on the variable "turb", so I tried to work it into the code you sent, but without success.
diret = '/data/processamento_beiral/geral/4-primavera';
cd (diret);
files = './*.mat';
dir (files);
arrumado = dir(files);
primavera = [264 265];
%% Primavera
for i = 1:length(arrumado)
    if str2double(arrumado(i).name(6:8)) >= primavera(1) && str2double(arrumado(i).name(6:8)) <= primavera(2)
        Tturb(i) = load(arrumado(i).name);
        turb_2 = nanmean(turb);
        TURB = rand(turb_2);
        avg = mean(TURB);%or explicitly: avg=mean(x,1)
        %sd=std(x);%or explicitly: sd=std(x,1)
        %L= TURB<=avg;%for R2016b and later
        L = bsxfun(@le,TURB,avg);%for R2016a and earlier
        TURB(L) = NaN;
    end
end
Tturb = nanmean(TURB,3);
cd(strcat((diret), '/Final'))
save A20032642018354 Tturb LON LAT
clearvars -except LON LAT diret files arrumado primavera
cd (diret);
Rik on 8 Oct 2019

What have you tried to track down the issue? I don't see a strong indication that there is anything wrong with the code itself, except that you're possibly overwriting your variable inside the loop.
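One way to avoid overwriting is to store each file's cleaned matrix in its own slice of a 3-D array and average along the third dimension afterwards. The sketch below is only an illustration under stated assumptions: the cell array `fileData` is a hypothetical stand-in for the matrices that the real script would read with `load`, the names `stack` and `mask` are made up, and `mean(...,'omitnan')` is used in place of `nanmean`.

```matlab
% Hypothetical stand-in for the loaded files: each cell plays the role of
% the "turb" variable from one .mat file (assumption; the real script
% would read it with S = load(arrumado(i).name); turb = S.turb;).
rng(1);
fileData = {rand(5,4), rand(5,4), rand(5,4)};
fileData{2}(1,1) = 100;                     % spike in the second "file"

nFiles = numel(fileData);
stack  = NaN([size(fileData{1}), nFiles]);  % preallocate rows x cols x files

for i = 1:nFiles
    turb = fileData{i};                     % matrix for this file
    avg  = mean(turb,1,'omitnan');          % per-column mean of this file
    mask = bsxfun(@gt, turb, avg);          % values above their column mean
    turb(mask) = NaN;
    stack(:,:,i) = turb;                    % slice i keeps this file's result
end

Tturb = mean(stack,3,'omitnan');            % average across files, ignoring NaNs
```

Because every iteration writes to `stack(:,:,i)`, the result of one file no longer replaces the result of the previous one, and the final average over dimension 3 sees all of them.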
