# How to plot confidence bounds for a theoretical cumulative distribution function?

Eric-Jan Scharlee on 5 Aug 2021
Commented: Jeff Miller on 6 Aug 2021
I understand how to plot upper and lower confidence bounds for an empirical cumulative distribution function using the ecdf function.
But how can I plot upper and lower confidence bounds for a theoretical cumulative distribution function, for example the Theoretical CDF in the plot shown below (copied from https://www.mathworks.com/help/stats/cdfplot.html)?
Paul on 6 Aug 2021
I don't know if there is a way to do what you want, but I'm far from an expert on such things. Having said that, my intuition is that fitting a distribution is an exercise in estimating the parameters of the distribution, and that's why fitdist only returns the CIs around the parameters, i.e., only those parameters are being estimated. In contrast, the ecdf() function estimates something at each value in the xdata, so there it seems reasonable to come up with a CI around each estimate, which is what the dotted curves are in that plot. I'll be interested to see other answers to your question.
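
To make that contrast concrete, here is a minimal sketch, assuming the Statistics and Machine Learning Toolbox is available. The simulated standard-normal sample, its size, and the variable names are illustrative assumptions, not data from the question.

```matlab
% Assumed example data: 50 draws from a standard normal (not from the thread).
rng(0);
data = normrnd(0, 1, 50, 1);

% Fitting a distribution estimates parameters; paramci gives CIs on those.
pd = fitdist(data, 'Normal');
ci = paramci(pd);   % 2-by-2: row 1 = lower, row 2 = upper; columns = mu, sigma

% ecdf estimates the CDF at each data value and returns pointwise bounds.
[f, x, flo, fup] = ecdf(data);

stairs(x, f, 'b-'); hold on
stairs(x, flo, 'b:');
stairs(x, fup, 'b:');
hold off
xlabel('x'); ylabel('F(x)');
legend('Empirical CDF', 'Lower bound', 'Upper bound', 'Location', 'best');
```

Here `ci` describes uncertainty in the two fitted parameters, while `flo` and `fup` describe uncertainty in the empirical CDF at each observed value.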

Jeff Miller on 6 Aug 2021
For a given X value, the theoretical cumulative probability is p = F(X). Suppose you have a sample of N observations and you let k be the number of observations <= X. k is (by definition) binomial(N,p) with the known N and that theoretical p. Using that binomial distribution, you can get upper and lower confidence limits on the observed k (e.g., with a normal approximation to the binomial). Then divide those upper and lower limits on k by N and you will have upper and lower confidence limits on p for that X value.
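
A minimal sketch of this procedure, assuming a fully known theoretical distribution. The standard normal, N = 50, and the 95% level are illustrative assumptions. With the normal approximation, the limits on k are N*p ± z*sqrt(N*p*(1-p)), so dividing by N gives p ± z*sqrt(p*(1-p)/N).

```matlab
% Assumed setup (not from the thread): standard normal CDF, sample size 50.
N = 50;
alpha = 0.05;                      % 95% two-sided limits
x = linspace(-3, 3, 200);
p = normcdf(x, 0, 1);              % theoretical p = F(x) at each x

% Normal approximation to binomial(N, p), already divided by N.
z = norminv(1 - alpha/2);
se = sqrt(p .* (1 - p) / N);
lower = max(p - z .* se, 0);       % clip to the valid probability range
upper = min(p + z .* se, 1);

plot(x, p, 'k-', x, lower, 'r--', x, upper, 'r--');
xlabel('x'); ylabel('F(x)');
legend('Theoretical CDF', 'Lower limit', 'Upper limit', 'Location', 'best');
```

The normal approximation is rough where p is close to 0 or 1. Exact binomial quantiles, for example `binoinv([alpha/2, 1 - alpha/2], N, p0) / N` for a single value `p0`, are one alternative in the tails.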
Jeff Miller on 6 Aug 2021
Thanks for the clarification, @Paul. I misunderstood the original question as pertaining to a known theoretical distribution (i.e., with known parameter values). No, the process I am describing does not apply to fitted distributions.
@Eric-Jan Scharlee As Paul says, CIs around parameter estimates are standard, but CIs around a fitted CDF are not. I am not even sure how that would be defined. I suppose you could generate a range of CDFs with different combinations of parameter values within the parameter CIs and take the extremes of those (at each X) as some kind of theoretical CI, but that's really ad hoc. To me, it seems better to focus on the CIs from the ecdf.
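
The ad hoc envelope described above could be sketched as follows. The simulated data, the normal family, and the 15-by-15 parameter grid are all assumptions chosen for illustration, and the resulting band has no stated coverage probability.

```matlab
% Assumed example data: 50 draws from a standard normal (not from the thread).
rng(0);
data = normrnd(0, 1, 50, 1);
pd = fitdist(data, 'Normal');
ci = paramci(pd);                  % rows: lower/upper; columns: mu, sigma

x = linspace(min(data) - 1, max(data) + 1, 200);
muGrid = linspace(ci(1,1), ci(2,1), 15);
sigmaGrid = linspace(ci(1,2), ci(2,2), 15);

% Track the pointwise extremes over all parameter combinations in the grid.
lower = ones(size(x));
upper = zeros(size(x));
for mu = muGrid
    for sigma = sigmaGrid
        F = normcdf(x, mu, sigma);
        lower = min(lower, F);
        upper = max(upper, F);
    end
end

plot(x, cdf(pd, x), 'k-', x, lower, 'r--', x, upper, 'r--');
xlabel('x'); ylabel('F(x)');
legend('Fitted CDF', 'Envelope (lower)', 'Envelope (upper)', 'Location', 'best');
```

Because the individual parameter CIs are treated as a box, the envelope ignores any dependence between the parameter estimates, which is part of why it is only a rough visual aid.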