# How to check if data is normally distributed

Hi all,

I want to run an F-test on two samples to see if their variances are equal. Wikipedia says that the F-test is sensitive to non-normality of the samples (<http://en.wikipedia.org/wiki/F-test>). How can I check whether my samples are normally distributed?

I read on some forums that I can use kstest and lillietest. When should I use each? I get an answer h=0. Does that mean my data is normally distributed?

Thanks. Nancy

### Accepted Answer

Tom Lane
on 7 Aug 2012

The functions you mention return H=0 when the test cannot reject the hypothesis of a normal distribution. That result can't prove that the distribution is normal; it only means the test did not find much evidence against that hypothesis.
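On the question of when to use each function, here is a minimal sketch, assuming the Statistics Toolbox is installed. By default, kstest compares the data against a standard normal distribution (mean 0, standard deviation 1), while lillietest tests for a normal distribution whose mean and variance are estimated from the data. The sample `x`, its mean and spread, and the seed are made up for illustration.

```matlab
rng(0);                          % fixed seed so the run is repeatable (illustrative choice)
x = 5 + 2*randn(200,1);          % hypothetical sample: normal, mean 5, std 2

% kstest on the raw data tests against N(0,1), so a normal sample
% with a different mean or scale is still rejected.
hRaw = kstest(x);

% Standardizing first lets kstest be used, but because the mean and std
% are estimated from the same data the test becomes conservative.
z = (x - mean(x)) / std(x);
[hKS, pKS] = kstest(z);

% lillietest accounts for the estimated parameters directly.
[hL, pL] = lillietest(x);

fprintf('kstest raw: h=%d, kstest standardized: h=%d (p=%.3f), lillietest: h=%d (p=%.3f)\n', ...
        hRaw, hKS, pKS, hL, pL);
```

If the mean and variance are not known in advance, which is the usual case, lillietest is probably the more appropriate of the two.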

The VARTESTN function has an option that is robust to non-normal distributions.

##### 2 Comments

Tom Lane
on 9 Aug 2012

Suppose you would normally do

x1 = randn(20,1); x2 = 1.5*randn(25,1);

[h,p] = vartest2(x1,x2)

Then you can do something like this instead:

grp = [ones(size(x1)); 2*ones(size(x2))];

vartestn([x1;x2], grp)

I believe the two-sample vartestn test is not identical to the vartest2 test, but the p-values are likely to be similar. Then you can add options to do a robust test using vartestn.
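A hedged sketch of what adding the robust option might look like, assuming a release of the Statistics Toolbox in which vartestn accepts the 'TestType' and 'Display' name-value pairs (older releases take these options as positional arguments instead). The data are the same made-up samples as above, and the seed is arbitrary.

```matlab
rng(1);                                   % arbitrary seed for repeatability
x1 = randn(20,1);
x2 = 1.5*randn(25,1);
grp = [ones(size(x1)); 2*ones(size(x2))]; % group labels: 1 for x1, 2 for x2

% Default test (Bartlett), which assumes normally distributed groups
pBartlett = vartestn([x1;x2], grp, 'Display', 'off');

% Levene's test on absolute deviations, less sensitive to non-normality
pLevene = vartestn([x1;x2], grp, 'TestType', 'LeveneAbsolute', 'Display', 'off');

% Brown-Forsythe variant, which uses deviations from the group medians
pBrownForsythe = vartestn([x1;x2], grp, 'TestType', 'BrownForsythe', 'Display', 'off');

fprintf('Bartlett p=%.3f, Levene p=%.3f, Brown-Forsythe p=%.3f\n', ...
        pBartlett, pLevene, pBrownForsythe);
```

A small p-value from any of these suggests the group variances differ. The Levene and Brown-Forsythe variants are the ones usually preferred when normality is in doubt.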

### More Answers (2)

Sean
on 7 Aug 2012

Hello Nancy,

You cannot tell from only 2 samples whether they are normally distributed or not. If you have a larger sample set and you are only testing them in pairs, then you could use the larger sample set to test for a particular distribution.

For example, a simple Q-Q plot:

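data = randn(100); % 100x100 matrix of normally distributed random values

ref1 = randn(100); % reference: 100x100 matrix, normally distributed

ref2 = rand(100); % reference: 100x100 matrix, uniformly distributed

x = sort(data(:)); % sorted data values (empirical quantiles)

y1 = sort(ref1(:)); % sorted normal reference values

y2 = sort(ref2(:)); % sorted uniform reference values

subplot(1,2,1); plot(x,y1); title('data vs. normal reference');

subplot(1,2,2); plot(x,y2); title('data vs. uniform reference');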

The first plot should be close to a straight line, indicating that the data distribution matches the reference distribution. The second plot isn't a straight line, indicating that the distributions do not match.
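The same idea is also available without building a reference sample by hand. This is a small sketch, assuming the Statistics Toolbox: qqplot with a single input plots the sample quantiles against theoretical normal quantiles. The sample names, sizes, and seed are made up for illustration.

```matlab
rng(2);                       % arbitrary seed for repeatability
xNorm = randn(500,1);         % hypothetical normally distributed sample
xUnif = rand(500,1);          % hypothetical uniformly distributed sample

figure;
subplot(1,2,1); qqplot(xNorm); title('Normal sample');   % points should follow the reference line
subplot(1,2,2); qqplot(xUnif); title('Uniform sample');  % points should bend away at the tails
```

normplot(xNorm) draws a similar normal probability plot if that layout is preferred.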

Sarutahiko
on 11 Dec 2013
