Compare two CDF distributions

Question

MEC on 16 Mar 2023


https://nl.mathworks.com/matlabcentral/answers/1930025-compare-two-cdf-distributions

Commented: Jeff on 16 Mar 2023

I am trying to compare two CDF distributions that are generated from two datasets of elevation. One dataset is observed elevations from a DEM (HeightDis.txt), the other is predicted elevations from a model (ModelHeight.txt). I want to generate a goodness of fit for how well the model is matching observed elevations.

I tried to use kstest2, but it requires vectors. My two distributions are two-column matrices: the first column is the elevation value, the second is the probability. The two distributions have different values in both columns. So my question is: how do I convert these two distributions into a format that can be used in kstest2 without compromising the data? I feel that this is an obvious problem, but I have not found a solution.

3 Comments

MEC on 16 Mar 2023

Apologies. Digital Elevation Model.

Jeff on 16 Mar 2023

Does the model have any free parameters that you are estimating from these observed data?


Answer 1

Star Strider on 16 Mar 2023


https://nl.mathworks.com/matlabcentral/answers/1930025-compare-two-cdf-distributions#answer_1194570


I am not sure that either of those tests would be appropriate for these data.

% Read the tabulated CDFs: column 1 = elevation, column 2 = cumulative probability
A1 = readmatrix('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1326675/ModelHeight.txt');   % model
A2 = readmatrix('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1326680/HeightDis.txt');     % observed

% Plot both CDFs on the same axes
figure
plot(A1(:,1), A1(:,2), '.', 'DisplayName','Model')
hold on
plot(A2(:,1), A2(:,2), '.', 'DisplayName','Observed')
hold off
grid
legend('Location','best')

% Estimate each PDF as the numerical derivative of its CDF, dF/dx
pdf1 = gradient(A1(:,2)) ./ gradient(A1(:,1));
pdf2 = gradient(A2(:,2)) ./ gradient(A2(:,1));

% Plot the estimated PDFs on the same axes
figure
plot(A1(:,1), pdf1, '.-', 'DisplayName','Model')
hold on
plot(A2(:,1), pdf2, '.-', 'DisplayName','Observed')
hold off
grid
legend('Location','best')
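A quick sanity check on a derivative-based PDF estimate like this is to integrate it back with trapz: the area should come out close to the span of the CDF, F(end) - F(1). The sketch below uses a synthetic standard-normal CDF, which is an assumption for illustration and not the data from the thread.

```matlab
% Synthetic CDF for illustration only (standard normal), not the thread's data.
x = linspace(-4, 4, 201).';
F = 0.5 * (1 + erf(x / sqrt(2)));

% Same kind of estimate as in the answer: numerical derivative of the CDF.
pdfEst = gradient(F) ./ gradient(x);

% Integrate the estimated PDF back over x and compare with the CDF span.
area = trapz(x, pdfEst);
fprintf('area = %.6f, F(end) - F(1) = %.6f\n', area, F(end) - F(1));
```

If the elevation column contains repeated values, gradient(x) is zero at those points and the ratio becomes Inf or NaN, so it may be worth checking for duplicates before dividing.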

They do not appear to be normally distributed in any event. Assuming that they have the same underlying distribution (whatever it is), the ranksum test might be appropriate if these can be considered unpaired data. However, it would have to be run on the original data, not on the probability distributions.
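For reference, this is roughly what running such tests on raw samples could look like. The vectors obsElev and modelElev are hypothetical stand-ins generated with randn, since the thread only provides the tabulated CDFs; ranksum and kstest2 both require the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical raw elevation samples (NOT the tabulated CDFs from the thread).
rng(0);                                % reproducible synthetic data
obsElev   = 100 + 20 * randn(200, 1);  % stand-in for observed DEM elevations
modelElev = 105 + 20 * randn(180, 1);  % stand-in for model-predicted elevations

% Wilcoxon rank-sum test on the two unpaired samples.
[p, h, stats] = ranksum(obsElev, modelElev);
fprintf('ranksum: p = %.4g, h = %d, rank sum = %g\n', p, h, stats.ranksum);

% The two-sample KS test accepts the same kind of input: two vectors of raw samples.
[hKS, pKS, ksStat] = kstest2(obsElev, modelElev);
fprintf('kstest2: p = %.4g, h = %d, D = %.4f\n', pKS, hKS, ksStat);
```

Both calls take vectors of individual observations, which is why two-column elevation/probability tables cannot be passed in directly.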


2 Comments

MEC on 16 Mar 2023

Thank you for the comment. I wanted to use a KS test for easy comparison with another model output, which is also a KS test. But your point is a good one. I also wanted to avoid using the "data" that comes out of the model because it would require some more time-intensive coding that I wished to avoid. Plus it seemed this should, in theory, have been an easy thing to do, which clearly it is proving not to be.

Star Strider on 16 Mar 2023

My pleasure!

Based on the PDF plots, the data appear to not be normally distributed, so I doubt that it would be worthwhile to test for that, although if that is part of your analysis, then it could be appropriate to consider. If you want to compare the model to the data to see if the model explains the data, a completely different approach would be required. That the independent variables are not the same definitely complicates any analysis.
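Since the two tables do not share elevation values, one way to put them on a common footing is to interpolate both CDFs onto a shared elevation grid and take the largest vertical gap between them, which is the quantity a two-sample KS test is built on. The sketch below is only an illustration under stated assumptions: cdfModel and cdfObs are hypothetical tables (not the thread's files), each sorted by elevation with no repeated elevations, and each assumed to cover its whole CDF so that values outside its range can be set to 0 or 1.

```matlab
% Hypothetical tabulated CDFs standing in for the two files in the thread:
% column 1 = elevation, column 2 = cumulative probability.
xm = linspace(0, 10, 51).';
cdfModel = [xm, xm / 10];        % uniform CDF on [0, 10]
xo = linspace(0, 10, 81).';
cdfObs = [xo, (xo / 10).^2];     % a different CDF on the same range

% Common elevation grid: every elevation that appears in either table.
xq = unique([cdfModel(:,1); cdfObs(:,1)]);

% Linearly interpolate both CDFs onto the common grid. interp1 needs
% distinct elevations in each table; query points outside a table's range
% come back as NaN and are then set to 0 (below) or 1 (above), which
% assumes each table is sorted and spans its full CDF.
Fm = interp1(cdfModel(:,1), cdfModel(:,2), xq);
Fo = interp1(cdfObs(:,1),   cdfObs(:,2),   xq);
Fm(xq < cdfModel(1,1)) = 0;   Fm(xq > cdfModel(end,1)) = 1;
Fo(xq < cdfObs(1,1))   = 0;   Fo(xq > cdfObs(end,1))   = 1;

% KS-type distance: largest vertical gap between the two curves.
D = max(abs(Fm - Fo));
fprintf('D = %.4f\n', D);
```

This gives a distance only. The p-value of a KS test also depends on the number of observations behind each distribution, and the tabulated CDFs do not carry that information.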
