# Chi squared test to test if data is from same distribution

John on 8 Feb 2013
Hello,
I have recorded some discrete data with an unspecified distribution.
I have generated some discrete data from a model.
I am looking to check whether the generated data has the same distribution as the real data.
If the data were continuous, I would use a Q-Q plot, and a straight line would indicate that the distributions match.
As the data is discrete, I need another test.
I was thinking a chi-squared test would be suitable?
Would Matlab have such a function? I would be grateful if somebody could perhaps demonstrate an example?
kind regards

José-Luis on 8 Feb 2013
You could use a two-sample Kolmogorov-Smirnov test. This tests the hypothesis that the two samples come from the same distribution.
doc kstest2
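A minimal sketch of how `kstest2` might be called is below. The two vectors are made-up stand-ins (rounded exponential draws), not data from this thread, and the variable names are assumptions; it needs the Statistics Toolbox.

```matlab
% Hypothetical stand-in data: distances rounded to whole units.
rng(0);                                    % fix the seed so the run is repeatable
realDist  = round(exprnd(10, 2000, 1));    % placeholder for the recorded data
modelDist = round(exprnd(10, 2000, 1));    % placeholder for the model output

% Two-sample Kolmogorov-Smirnov test at the default 5% significance level.
% h = 0: the test does not reject "same distribution"; h = 1: it rejects it.
[h, p, ksStat] = kstest2(realDist, modelDist);
fprintf('h = %d, p = %.4f, KS statistic = %.4f\n', h, p, ksStat);
```

Replace `realDist` and `modelDist` with your own column vectors; the two samples do not need to be the same length.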
José-Luis on 8 Feb 2013
I am not sure I follow. It sounds like the KS test is what you are looking for. The documentation says that the samples are assumed to come from continuous distributions. It says nothing about the samples themselves, which are what you are comparing. Or maybe I am missing something.
John on 8 Feb 2013
Thanks Jose, I confused the two terms. That is what I wanted.
Cheers

Sean on 8 Feb 2013
How about anything here:
Or some of the ANOVA tests:
doc anova1
doc anova2
doc anovan
John on 8 Feb 2013
Edited: John on 8 Feb 2013
Thanks,
I have recorded the distances of thousands of car journeys (to the nearest mile). I also have a model that generates journey distances. I want to determine whether the journey distances produced by the model are from the same distribution as the real-world data. I'm not a stats expert either :( . I have looked at the docs and they refer to continuous data (mine is discrete), so I'm not sure if they are suitable?
José-Luis on 8 Feb 2013
The KS test is for discrete data. What you assume is that the distribution they come from is continuous. That's a different thing.
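For completeness, the chi-squared comparison asked about in the original question can be sketched with `crosstab`, which returns a chi-squared statistic and p-value for a contingency table of sample membership against observed value. This is only a sketch under assumptions: the data are made-up stand-ins, and the cutoff `cap` that pools rare long journeys into one bin is a hypothetical choice, not something from this thread. It needs the Statistics Toolbox.

```matlab
% Hypothetical stand-in data: journey distances to the nearest mile.
rng(1);                                    % fix the seed so the run is repeatable
realDist  = round(exprnd(10, 2000, 1));    % placeholder for the recorded journeys
modelDist = round(exprnd(10, 2000, 1));    % placeholder for the model output

% Pool sparse long journeys into a single top bin so that expected cell
% counts do not get too small (cap is an assumed, illustrative cutoff).
cap    = 30;
values = min([realDist; modelDist], cap);

% Label each observation with the sample it came from (1 = real, 2 = model).
group = [ones(numel(realDist), 1); 2 * ones(numel(modelDist), 1)];

% Contingency table (sample x distance) with a chi-squared test of
% independence, which here acts as a test that both samples share one
% distribution over the binned distances.
[tbl, chi2stat, p] = crosstab(group, values);
dof = (size(tbl, 1) - 1) * (size(tbl, 2) - 1);
fprintf('chi2 = %.2f, dof = %d, p = %.4f\n', chi2stat, dof, p);
```

A small p-value would suggest that the model and the recorded journeys differ in how distances are spread across the bins. The result depends on the binning, so it is worth checking that the conclusion holds for a few reasonable values of `cap`.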