Is it possible to write code that examines a data vector of 100 values and recognizes whether one or two values are very high compared to the rest?

Let's say I have
a = [1 2 1 1 2 25 1 2 1 23] % two abnormally high values
b = [3 2 1 1 2 3 1 2 1 2] % normal data
I want to process any data vector, but my code should first check whether the data contains abnormally high values. If it contains such "PEAKS", it should remove them first and then plot the data.
With the "a" vector, two values are far higher than the rest. I can find the maximum of "a" via
c = max(a) % returns 25
d = c/2; % returns 12.5
a(a>d) = 2 % replace every value greater than 12.5 with 2
In this case the problem is resolved.
If t = 0:0.1:10, I can use plot(t,a).
But if the data received is like the "b" vector above, where all values are close to each other, the code should not apply the max formula and should plot the data as is.
Kindly guide me on how to develop this logic, so that a single piece of code works for both types of data. If I fix a threshold such as 12 for all data, the received vector may be [100 102 97 500 111 107 98] or [100 102 97 110 111 107 98]; both must be handled the same way, but the fixed threshold fails here.
  1 Comment
taimour sadiq on 15 Oct 2020
In other words, if all values of the vector are close to each other, the data is normal and can be used for plotting; otherwise a few peak values will suppress the other values on the graph.

Accepted Answer

Vladimir Sovkov on 15 Oct 2020
Edited: Vladimir Sovkov on 15 Oct 2020
This is known as the problem of locating outliers, and various approaches have been proposed. For example, the one based on the median absolute deviation (the MAD criterion) has proven to be rather efficient and robust. See https://eurekastatistics.com/using-the-median-absolute-deviation-to-find-outliers/ and the references therein.
  3 Comments
Vladimir Sovkov on 15 Oct 2020
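Sample code for your case:
a = [1 2 1 1 2 25 1 2 1 23];
c = median(a); % robust center of the data
d = mad(a,1); % median absolute deviation (flag 1); mad is in the Statistics and Machine Learning Toolbox
b = a(abs(a-c)<5*d) % output vector with outliers excluded based on the MAD criterion
% outliers = a(abs(a-c)>=5*d) % uncomment if you need the outliers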
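One thing to watch with the sample above: if more than half of the values are identical, the MAD is 0 and the test abs(a-c)<5*d rejects every value. Below is a minimal sketch of the same MAD criterion that guards against this, uses only base MATLAB functions, and replaces the peaks with the median instead of deleting them, so the vector keeps its length for plotting. The names data, cleaned, hasPeaks and isPeak are hypothetical, k = 5 is taken from the sample above, and leaving the data untouched when the MAD is 0 is an assumption, not part of the original criterion.

```matlab
% Sketch of the MAD criterion applied to both vectors from the question.
data = {[1 2 1 1 2 25 1 2 1 23], [3 2 1 1 2 3 1 2 1 2]};   % "a" and "b"
k = 5;                                  % cutoff in multiples of the MAD

cleaned  = cell(size(data));            % data with peaks replaced
hasPeaks = false(size(data));           % true where peaks were found

for i = 1:numel(data)
    x = data{i};
    c = median(x);                      % robust center of the data
    d = median(abs(x - c));             % median absolute deviation, as mad(x,1)

    % Assumption: when d is 0 the MAD sets no scale, so nothing is flagged.
    isPeak = (d > 0) & (abs(x - c) >= k*d);

    hasPeaks(i) = any(isPeak);
    x(isPeak) = c;                      % replace peaks with the median
    cleaned{i} = x;
end
```

Here hasPeaks(1) is true and hasPeaks(2) is false, so "a" has its two peaks replaced while "b" passes through unchanged. Either result can then be plotted with, for example, plot(1:numel(cleaned{1}), cleaned{1}).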

More Answers (1)

Image Analyst on 15 Oct 2020
There are several "outlier" functions like isoutlier(), rmoutliers(), filloutliers(). Try this:
a = [ 1 2 1 1 2 25 1 2 1 23] % 2 Abnormal very high Values
b = [3 2 1 1 2 3 1 2 1 2] % Normal Data
aFixed = rmoutliers(a) % Remove 25 and 23
bFixed = rmoutliers(b) % No change because b has no outliers.
You'll see
a =
1 2 1 1 2 25 1 2 1 23
b =
3 2 1 1 2 3 1 2 1 2
aFixed =
1 2 1 1 2 1 2 1
bFixed =
3 2 1 1 2 3 1 2 1 2
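If the cleaned data has to be plotted against an existing time vector, removing elements changes the length, so replacing the peaks may be more convenient. The sketch below uses isoutlier() to test whether any peaks are present and filloutliers() to replace them, applied to the two larger-valued vectors from the question. The variable names are hypothetical, and the 'center' fill method is just one possible choice. With default settings both functions flag values more than three scaled MADs from the median.

```matlab
% Sketch: detect peaks with isoutlier and replace them with filloutliers.
x1 = [100 102 97 500 111 107 98];       % contains one peak (500)
x2 = [100 102 97 110 111 107 98];       % no peaks

peakMask1 = isoutlier(x1);              % logical mask, true at the peak
peakMask2 = isoutlier(x2);              % all false for normal data

% 'center' fills each outlier with the center value of the detection
% method, which is the median for the default method.
x1Fixed = filloutliers(x1, 'center');
x2Fixed = filloutliers(x2, 'center');   % nothing flagged, so unchanged
```

Because the detection scales with the spread of each vector, no fixed threshold such as 12 is needed: any(peakMask1) is true, any(peakMask2) is false, and x1Fixed has the same length as x1 with the 500 replaced by the median, 102.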
  12 Comments
taimour sadiq on 18 Oct 2020
Dear Vladimir, your vision is very honourable and highly appreciated. Once again, more than thanks for your support. Together we will all make this forum and software better.
Image Analyst on 18 Oct 2020
Yes, thanks Vladimir. It's true that the main reward is having the appreciation of the people you help. There are other rewards though. Keep contributing and we hope to see a Rising Star or MVP icon next to your name eventually (if people Accept or Vote for your Answers). ✨✨✨✨
One other nice benefit of this forum is learning about all the new functions that are added with every version. I have to admit, I don't read or remember the release notes that come out twice a year with each new version, and so mostly I learn about new functions from this forum.
