compare with a plot the distribution of my data to the extreme value (Gumbel) distribution
20 views (last 30 days)
I am trying to make a plot which compare the distribution of my data to a Gumbel distribution. Basically, I am trying to do by hand what the probplot function does. Here is my code:
data=[2.35 2.75 2.75 2.82 2.85 2.9 2.9 3.05 3.2 3.4 3.41 3.9]*10000; %my data
h=probplot('extreme value',data) %this is the plot I want to obtain
[f,x] = ecdf(data);%Empirical cumulative distribution
parmhat = evfit(data)%Extreme value parameter estimates (type 1 = Gumbel)
% fitdist could have been used to
plot(x,f,'.b')%I would prefer to plot(data,F(data)) but data and x have different sizes, why?
plot(data,(data-mu)/sigma,'r') %and this line does not fit the data
evinv(0.01,mu,sigma)%100 year return value
I suspect something is wrong with the (data-mu)/sigma and that I should be plotting something else. Could you please have a look?
Ingrid on 10 Apr 2015
calling the help for ecdf this indicates that the function returns the empirical cumulative distribution function (cdf), f, evaluated at the points in x, using the data in the vector y. In your case data and x have different sizes because your data has points with duplicate values i.e. 2.75 occurs twize and 2.9 also. For a given x there can only be one probability so therefore the size of x is smaller than the size of data and you cannot use plot(data,f) and even with the same size would not make much sense given the explanation given above.
indeed there is a mistake in plotting (data-mu)/sigma, since the mu and sigma are not for a normal distribution in this case but for the Gumbel distribution so you should use the cumulative distribution equation for the Gumbel distribution
AI, Data Science, and Statistics Statistics and Machine Learning Toolbox Probability Distributions Exploration and Visualization