Is it possible to use 20 variables in a multi regression on MATLAB?

Accepted Answer

Sean de Wolski on 14 Dec 2016
  4 Comments
Sean de Wolski on 15 Dec 2016
Edited: Sean de Wolski on 15 Dec 2016
Kris, the whole point of fitlm is to do multilinear regression in MATLAB. If you run the command (or search "fitlm" in your favorite search engine), this should be clear. I would thus argue that this is more constructive than "Yes" because while still saying "Yes", it tells you how to get started.
If you want more detailed answers, ask more detailed questions!
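For anyone wanting a concrete starting point, here is a minimal sketch of a 20-predictor fit with fitlm. The data are synthetic and every name in it (nObs, nVars, betaTrue, bHat) is made up for illustration; it assumes the Statistics and Machine Learning Toolbox is installed and says nothing about whether a linear model suits the OP's actual problem.

```matlab
% Sketch only: multiple linear regression with 20 predictors via fitlm.
% Assumes the Statistics and Machine Learning Toolbox; all data are synthetic.
rng(0);                                   % reproducible random numbers
nObs  = 200;                              % number of observations (hypothetical)
nVars = 20;                               % number of predictor variables
X = randn(nObs, nVars);                   % one column per predictor
betaTrue = (1:nVars)';                    % made-up "true" slopes
y = 3 + X*betaTrue + 0.1*randn(nObs, 1);  % intercept 3 plus a little noise

mdl  = fitlm(X, y);                       % fits y ~ 1 + x1 + ... + x20
bHat = mdl.Coefficients.Estimate;         % intercept first, then the 20 slopes
disp(mdl.Rsquared.Ordinary)               % ordinary R-squared of the fit
```

Typing mdl at the prompt prints the full coefficient table with standard errors and p-values, which is usually the first thing worth looking at.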
dpb on 15 Dec 2016
"...this is more constructive than "Yes""
As my followup indicates, in general I agree. My comment was purposefully what it was, however, to illustrate clearly to the OP that "you get only as good as what you ask".
And, of course, the OP didn't say what kind of model is intended. While fitlm is likely a good starting point, if the OP doesn't have a linear model in mind then a linear routine won't help at all. There is simply not enough known about the problem space.

More Answers (1)

John D'Errico on 14 Dec 2016
Of course it is possible. In fact, 20 can be a rather small number in some cases, OR wildly too many.
It does depend on how much data you have.
It also depends on whether your data is sufficient to estimate 20 parameters.
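A quick way to see the "how much data" point is to check the rank of the predictor matrix. The sketch below uses synthetic data and hypothetical names (Xfew, Xmany); full column rank is only a necessary condition for estimating all 20 coefficients, not a guarantee that the estimates are any good.

```matlab
% Sketch only: can the data support 20 unknowns at all?
rng(0);                       % reproducible random numbers
nVars = 20;                   % number of unknown coefficients
Xfew  = randn(10,  nVars);    % only 10 observations for 20 unknowns
Xmany = randn(200, nVars);    % 200 observations for 20 unknowns

rankFew  = rank(Xfew);        % cannot exceed 10, so 20 coefficients are not identifiable
rankMany = rank(Xmany);       % full column rank for generic data
fprintf('rank with 10 rows: %d, rank with 200 rows: %d\n', rankFew, rankMany);
```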
  4 Comments
dpb on 14 Dec 2016
As noted, I didn't say anything about "good idea", only the direct answer to the question asked... :)
Without any other info it's impossible to say much of anything useful, other than that, as Walter points out, a 20th-order polynomial in x is highly likely to be problematic at best.
One way to get an idea, at least, is to compute the condition number of X'X, where X is the design matrix...if it's off the charts, "Houston, we have a problem!" right off the bat, and the model will need some reconsideration...
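A rough sketch of that check, comparing a 20th-order polynomial in a single x against 20 unrelated predictors. The range of x, the sample size, and the variable names are all assumptions chosen for illustration; bsxfun is used so the snippet also runs on releases without implicit expansion.

```matlab
% Sketch only: condition number of X'X for two very different 20-term models.
x = linspace(0, 10, 100)';              % single predictor on a hypothetical range
Xpoly = bsxfun(@power, x, 0:20);        % columns 1, x, x.^2, ..., x.^20
condPoly = cond(Xpoly' * Xpoly);        % astronomically large (may even report Inf)

rng(0);                                 % reproducible random numbers
Xrand = randn(100, 20);                 % 20 unrelated synthetic predictors
condRand = cond(Xrand' * Xrand);        % modest by comparison

fprintf('cond(X''X), polynomial: %g\n', condPoly);
fprintf('cond(X''X), random:     %g\n', condRand);
```

Note that in exact arithmetic cond(X'X) is the square of cond(X), so forming X'X explicitly makes an ill-conditioned problem look (and behave) worse. Centering and scaling x typically helps with the polynomial case.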
John D'Errico on 15 Dec 2016
Wildly too many depends on your problem. If your data supports 10,000 unknowns, then it is trivial to solve. (In fact, I have code that does this all of the time, with many thousands of unknowns. It is used happily by too many users to count.)
At the other end of the spectrum, if you have what are essentially two data points, then any more than two unknowns becomes wildly too many.
There is no magic number of unknowns, no line in the sand that tells you to go no further, no sign that tells you "beyond this point, there be dragons".
There are measures you can use. For example, the condition number of the system can help you to understand when you are getting into dangerous waters. But even there, you need to understand how the extent of ill-conditioning in your matrix will amplify noise in your problem, to the point where any signal in the predicted coefficients is lost.
Essentially, learn what cond tells you. Get used to using it. Even better, learn what svd tells you about the problem, but cond will tell you a lot from just one number.
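To make the link between cond, svd, and noise amplification concrete, here is a small sketch with two nearly collinear predictors. Everything in it is synthetic and hypothetical (the 1e-6 collinearity, the 1e-3 noise level, the variable names); it is meant only to show the mechanism, not to set any threshold.

```matlab
% Sketch only: what svd and cond say about a nearly collinear problem.
rng(0);                              % reproducible random numbers
n  = 100;
x1 = randn(n, 1);
x2 = x1 + 1e-6*randn(n, 1);          % nearly a copy of x1 (hypothetical)
X  = [x1, x2];

s = svd(X);                          % singular values, largest first
condFromSvd = s(1) / s(end);         % the same ratio that cond(X) reports

y      = x1 + x2;                    % exact coefficients are [1; 1]
yNoisy = y + 1e-3*randn(n, 1);       % small perturbation of the data
bClean = X \ y;                      % least-squares fit to clean data
bNoisy = X \ yNoisy;                 % least-squares fit to perturbed data
coefShift = norm(bNoisy - bClean);   % typically far larger than the 1e-3 noise

fprintf('cond(X) = %g, coefficient shift = %g\n', condFromSvd, coefShift);
```

The tiny second singular value is the direction in which the data carry almost no information, and any noise along that direction gets divided by it when the coefficients are computed.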
