Problem obtaining a regression equation between 2 data sets

I have 2 data sets. I thought it would be easy to obtain a regression equation with MATLAB, but I might be wrong, because I tried corrcoef and regress and am still unable to obtain an equation to use. I saw some third-party functions on the File Exchange, but I would like to use MATLAB's built-in functions.
Here is my summary so far:
When I use corrcoef I get a result that I don't know how to interpret.
I also have to call it this way because my data sets contain some NaNs:
R = corrcoef(Ahvaz_Dec.tmax_m, Ramhormoz_Dec.tmax_m,'rows','complete');
Then I searched again and found the regress function. When I use it with this code:
R = regress(Ahvaz_Dec.tmax_m, Ramhormoz_Dec.tmax_m);
it gives me a number, 0.9911, and I don't understand what it is or why it isn't similar to the corrcoef result!
I even tried polyfit using this code:
P=polyfit(Ahvaz_Dec.tmax_m, Ramhormoz_Dec.tmax_m,1);
but it gives me just two NaN values.
I also tried the code below:
model = fit(Ahvaz_Dec.tmax_m, Ramhormoz_Dec.tmax_m,'poly1')
figure
plot(model,Ahvaz_Dec.tmax_m,Ramhormoz_Dec.tmax_m)
but it gives me the following error:
Error using fit>iFit (line 232)
X, Y and WEIGHTS cannot have NaN values.
Error in fit (line 116)
[fitobj, goodness, output, convmsg] = iFit( xdatain, ydatain, fittypeobj, ...re
So I would like to ask if you could please guide me on how to obtain a linear regression equation between these 2 data sets.
  2 Comments
the cyclist on 26 Jan 2020
Can you upload the dataset in a MAT file, so that we can test it ourselves?
BN on 27 Jan 2020
Dear cyclist,
I have attached my two data sets, Ahvaz_Dec.mat and Ramhormoz_Dec.mat. I want an equation like Ahvaz_dec = 0.5Ramhormoz_dec+2.33, so that I can fill the NaN values in Ahvaz_Dec using the nearest station (Ramhormoz_Dec).
Thank you.


Accepted Answer

Adam Danz on 27 Jan 2020
Edited: Adam Danz on 27 Jan 2020
Polyfit is probably what you're looking for. From the documentation, "If either x or y contain NaN values and n < length(x), then all elements in p are NaN."
You have to remove pairs of data that contain a NaN value.
Try this
nanIdx = isnan(Ahvaz_Dec.tmax_m) | isnan(Ramhormoz_Dec.tmax_m);
P = polyfit(Ahvaz_Dec.tmax_m(~nanIdx), Ramhormoz_Dec.tmax_m(~nanIdx), 1);
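P will be a 1x2 vector containing [slope, yIntercept], which can be used to fill in the missing values (NaNs). Note that polyfit(x, y, 1) models y as a function of x, so pass Ramhormoz_Dec.tmax_m as the first argument if you want to predict Ahvaz from Ramhormoz.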
Alternatively you could apply fillmissing using the 'nearest' method.
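As a rough sketch of the polyfit route from fit to filled values: the short vectors below are made-up stand-ins for Ahvaz_Dec.tmax_m and Ramhormoz_Dec.tmax_m (not the attached data), and the variable names ok, fillIdx and ahvazFilled are only illustrative. Ramhormoz is used as the predictor here because the stated goal is an equation of the form Ahvaz = slope*Ramhormoz + intercept.

```matlab
% Made-up stand-ins for Ramhormoz_Dec.tmax_m and Ahvaz_Dec.tmax_m (assumption).
ramhormoz = [18.2; 19.5; NaN; 21.0; 20.3; 17.8];
ahvaz     = [19.0; NaN; 20.1; 22.1; 21.2; 18.5];

% Fit only on rows where both stations have a value.
ok = ~isnan(ahvaz) & ~isnan(ramhormoz);
P  = polyfit(ramhormoz(ok), ahvaz(ok), 1);   % P(1) = slope, P(2) = intercept

% Fill Ahvaz gaps only where Ramhormoz has a reading to predict from.
fillIdx = isnan(ahvaz) & ~isnan(ramhormoz);
ahvazFilled = ahvaz;
ahvazFilled(fillIdx) = polyval(P, ramhormoz(fillIdx));

fprintf('Ahvaz = %.4f * Ramhormoz + %.4f\n', P(1), P(2));
```

Rows where both stations are missing stay NaN, since there is nothing to predict from on those days.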
  2 Comments
BN on 27 Jan 2020
Thank you. When I try to remove pairs of data that contain a NaN value using this script:
nanIdx = isnan(Ahvaz_Dec.tmax_m | Ramhormoz_Dec.tmax_m);
P = polyfit(Ahvaz_Dec.tmax_m(~nanIdx), Ramhormoz_Dec.tmax_m(~nanIdx), 1);
I get this error:
Error using |
NaN's cannot be converted to logicals.
Adam Danz on 27 Jan 2020
Edited: Adam Danz on 27 Jan 2020
I had a typo in my answer :(
Now it's corrected :)
Also, see the summary of alternatives listed in the cyclist's answer.
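For anyone hitting the same error, here is a small sketch of the difference between the two forms, using made-up vectors a and b (not the attached data). The | operator needs logical inputs, so isnan has to be applied to each vector before combining them.

```matlab
% Made-up example vectors (assumption), each with one NaN.
a = [1; NaN; 3];
b = [4; 5; NaN];

% Correct: test each vector for NaN, then OR the logical results.
nanIdx = isnan(a) | isnan(b);

% Incorrect: a | b is evaluated first and fails on the NaN entries.
try
    bad = isnan(a | b);
catch err
    disp(err.message)
end
```

With these vectors, nanIdx flags the second and third rows, which are the pairs where either value is missing.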


More Answers (1)

the cyclist on 27 Jan 2020
Edited: the cyclist on 27 Jan 2020
All these methods will give the same coefficient for the slope of y on x if you standardize the variables first:
rng default
N = 10;
x = randn(N,1);
y = randn(N,1);
xscale = (x-mean(x))./std(x);
yscale = (y-mean(y))./std(y);
polyfit(xscale,yscale,1)
regress(yscale,[ones(N,1) xscale])
r = corrcoef(xscale,yscale)
regress and polyfit will give the same answer as each other on either scaled or unscaled data, provided regress is given a column of ones for the intercept.
corrcoef will give the same correlation coefficient on either scaled or unscaled data, but that coefficient will differ from the slope returned by regress and polyfit on unscaled data.
But, you do need to manage the NaNs, as Adam has pointed out.
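To make the relationship on unscaled data concrete, here is a sketch on synthetic data (the numbers 20, 3, 2 and 0.5 are arbitrary choices, and b0 and slopeFromR are illustrative names). It also shows what a call like regress(y, x) without a column of ones returns: a single coefficient for a line forced through the origin, which would explain a lone number such as the 0.9911 in the question.

```matlab
rng default
N = 50;
x = 20 + 3*randn(N,1);             % synthetic predictor (assumption)
y = 2 + 0.5*x + randn(N,1);        % synthetic response (assumption)

P  = polyfit(x, y, 1);             % [slope, intercept]
b  = regress(y, [ones(N,1) x]);    % [intercept; slope]
b0 = regress(y, x);                % one value: slope of a line through the origin

R = corrcoef(x, y);                % 2x2 matrix; the off-diagonal entry is r
r = R(1,2);
slopeFromR = r * std(y) / std(x);  % matches P(1) and b(2) up to rounding
```

On standardized variables both standard deviations are 1, so the slope and r coincide, which is the point of the scaled example above.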
