I have some general questions regarding the ksdensity function. I'm trying to compare the cumulative distribution functions of two samples. When I plot the data as histograms, the two samples don't appear to be that different.
When I use the ksdensity (cdf) function and test for a significant difference using a K-S test, significance seems to depend a lot on the bandwidth and bounding of my distributions.
My question is: what are the rules for choosing a bandwidth and the other settings? If significance depends on the settings I choose, I want to have good reasons for choosing them.
I am constraining the distributions between 0 and 500 because my maximum value is 460 and the data are all positive. However, the default bandwidth results in minor differences towards the upper end of the distributions, whereas a bandwidth of 1.5 does not.
I have attached the data: column 1 is the group and column 2 contains the measurements.
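For concreteness, the comparison described above could be sketched roughly as follows in MATLAB. This is an illustration under assumptions, not the exact script: the file name data.csv, the variable names, and the evaluation grid are all hypothetical, and the code assumes the group column holds exactly two distinct numeric labels.

```matlab
% Hypothetical file name; column 1 = group label, column 2 = measurement.
data   = readmatrix('data.csv');
groups = unique(data(:, 1));               % assumed: exactly two groups
x1 = data(data(:, 1) == groups(1), 2);
x2 = data(data(:, 1) == groups(2), 2);

% Evaluation grid inside the support (assumed; any grid within (0, 500) works).
pts = linspace(1, 499, 500);

% Smoothed CDF estimates bounded to [0, 500], default bandwidth.
F1 = ksdensity(x1, pts, 'Function', 'cdf', 'Support', [0 500]);
F2 = ksdensity(x2, pts, 'Function', 'cdf', 'Support', [0 500]);

% Same estimates with the bandwidth fixed at 1.5.
G1 = ksdensity(x1, pts, 'Function', 'cdf', 'Support', [0 500], 'Bandwidth', 1.5);
G2 = ksdensity(x2, pts, 'Function', 'cdf', 'Support', [0 500], 'Bandwidth', 1.5);

% Largest vertical gap between the two smoothed curves under each setting.
gapDefault = max(abs(F1 - F2));
gapFixed   = max(abs(G1 - G2));

% Two-sample K-S test; kstest2 takes the raw measurements as input.
[h, p, ksStat] = kstest2(x1, x2);

% Overlay the smoothed CDFs to see where the settings change the curves.
plot(pts, F1, pts, F2, pts, G1, '--', pts, G2, '--');
legend('group 1, default', 'group 2, default', ...
       'group 1, bw = 1.5', 'group 2, bw = 1.5', 'Location', 'southeast');
xlabel('measurement'); ylabel('estimated CDF');
```

Comparing gapDefault with gapFixed shows how much the smoothed curves move with the bandwidth setting, which is the sensitivity I am asking about.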
Thank you for any help,