How to check if data is normally distributed
I want to run an F-test on two samples to see if their variances are equal. Wikipedia says that the F-test is sensitive to non-normality of the samples (<http://en.wikipedia.org/wiki/F-test>). How can I check whether my samples are normally distributed?
I read some forum posts that said I can use kstest and lillietest. When should I use each? I get an answer h=0. Does that mean my data is normally distributed?
Tom Lane on 7 Aug 2012
The functions you mention return H=0 when a test cannot reject the hypothesis of a normal distribution. They can't prove that the distribution is normal, but they don't find much evidence against that hypothesis.
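On when to use which, here is a minimal sketch on made-up data. It assumes the Statistics Toolbox is installed; the sample `x`, its mean and spread, and the seed are all hypothetical. By default kstest compares the data with a standard normal distribution (mean 0, standard deviation 1), so it fits best when the parameters are known in advance. lillietest estimates the mean and variance from the data, which is usually what you want when asking "is this normal at all?".

```matlab
rng(0);                      % fix the seed so the sketch is repeatable
x = 5 + 2*randn(200,1);      % hypothetical sample: normal, mean 5, std 2

% kstest tests against N(0,1) by default, so this normal sample with
% mean 5 is rejected because its location and scale differ.
hRaw = kstest(x);

% lillietest estimates the mean and variance from x before testing.
% h = 0 means "could not reject normality", not "proved normal".
[hLil, pLil] = lillietest(x);

fprintf('kstest h = %d, lillietest h = %d (p = %.3f)\n', hRaw, hLil, pLil);
```

Standardizing x with its own sample mean and standard deviation and then calling kstest is not equivalent, because the standard Kolmogorov-Smirnov critical values do not account for the estimated parameters. That is the case lillietest is designed for.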
The VARTESTN function has an option that is robust to non-normal distributions.
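As a rough sketch of that option in recent releases of the Statistics Toolbox, the robust variant can be selected through the 'TestType' name-value pair (older releases used a different calling syntax, so check the documentation for your version). The samples `a` and `b` below are hypothetical data, and choosing 'LeveneAbsolute' over the other robust test types is an assumption made for illustration.

```matlab
rng(1);                          % repeatable hypothetical data
a = randn(50,1);                 % hypothetical sample 1
b = 3*randn(60,1);               % hypothetical sample 2, larger spread

% vartestn expects one data vector plus a grouping vector.
x = [a; b];
group = [ones(size(a)); 2*ones(size(b))];

% Levene's test is less sensitive to non-normality than the default
% Bartlett test; 'Display','off' suppresses the table and box plot.
pLevene = vartestn(x, group, 'TestType', 'LeveneAbsolute', 'Display', 'off');

% Classical two-sample F-test for comparison (assumes normality).
[hF, pF] = vartest2(a, b);

fprintf('Levene p = %.4g, F-test p = %.4g\n', pLevene, pF);
```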
More Answers (2)
Sean on 7 Aug 2012
You cannot tell from only 2 samples whether they are normally distributed. If you have a larger sample set and are only testing the samples in pairs, you could use the larger set to test against a particular distribution.
For example, a simple q-q plot:
data = randn(100); % generate a 100x100 matrix of normally distributed random values
ref1 = randn(100); % reference: 100x100 matrix, normally distributed
ref2 = rand(100);  % reference: 100x100 matrix, uniformly distributed
figure; qqplot(data(:), ref1(:)); % first plot: normal data vs. normal reference
figure; qqplot(data(:), ref2(:)); % second plot: normal data vs. uniform reference
The first plot should be a straight line (indicating that the data distribution matches the reference distribution). The second plot isn't a straight line, indicating that the distributions do not match.
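If you have no separate reference sample, qqplot with a single input compares the data with theoretical normal quantiles, and normplot draws a normal probability plot of the same data. A minimal sketch, assuming the Statistics Toolbox; the sample `x` is hypothetical:

```matlab
rng(2);                      % repeatable hypothetical data
x = 10 + 3*randn(500,1);     % hypothetical sample: normal, mean 10, std 3

figure;
subplot(1,2,1);
hq = qqplot(x);              % sample quantiles vs. standard normal quantiles
subplot(1,2,2);
hn = normplot(x);            % normal probability plot of the same sample
```

In both panels, points that stay close to the reference line are consistent with a normal distribution. Systematic curvature at the ends suggests heavy or light tails.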