Multiple Regression and Intercept

형현 on 25 May 2024
Commented: Star Strider on 27 May 2024
% Load the data from the Excel file
data = readtable('데이터(최종).xlsx', 'Sheet', 'Sheet5');
% Define the dependent variable
y = data.Arrive;
% Define the independent variables
X = [data.Price_m, data.Volme, data.Relative_y, data.Relative_m, ...
data.mine, data.debt, data.Quin, data.Cpi, data.Rate, data.Depo, ...
data.Bull, data.Sale, data.Move, data.Sub];
% Add a column of ones to the independent variables matrix for the intercept
X = [ones(size(X, 1), 1), X];
% Perform the multiple linear regression
[b, ~, ~, ~, stats] = regress(y, X);
% Display the results
disp('Regression Coefficients:');
disp(b);
disp('R-squared:');
disp(stats(1));
disp('F-statistic:');
disp(stats(2));
disp('p-value:');
disp(stats(3));
disp('Error Variance:');
disp(stats(4));
I want to run a multiple linear regression with the column named Arrive as the dependent variable. The results are below. Are they OK?
Regression Coefficients:
1.0e+06 *
4.1453
-0.0190
0.0040
-0.0960
-0.6115
-0.0022
-0.0140
0.0259
0.0070
-0.0602
-0.0196
-0.0003
-0.0000
0.0000
0.0000
R-squared:
0.3997
F-statistic:
4.5189
p-value:
3.5809e-06
Error Variance:
3.8687e+09

Answers (1)

Star Strider on 25 May 2024
I see nothing wrong with the code, and it conforms to the example in the regress documentation.
The only suggestion I have is to use table indexing to build the initial 'X' instead of listing each column:
data = readtable('데이터(최종).xlsx', 'Sheet', 'Sheet5')
data = 110x16 table
    Date         Arrive      Price_m  Volme   Relative_y  Relative_m  mine    debt    Quin  Cpi       Rate  Depo  Bull  Sale   Move   Sub
    ___________  __________  _______  ______  __________  __________  ______  ______  ____  ________  ____  ____  ____  _____  _____  __________
    01-Jan-2015  6.1513e+05  84.854   99.224  0.90087     1.0464      57.982  72.6    8.8   0.67762   2     25.7  57    12546  22145  9.4723e+06
    01-Feb-2015  6.6337e+05  85.05    99.845  0.89997     1.0553      57.783  73.078  8.8   -0.05917  2     25.7  68.7  10484  24046  1.5481e+07
    01-Mar-2015  7.7112e+05  85.322   102.01  0.89923     1.0714      57.604  73.543  8.8   0.009515  1.75  25.7  86.2  27303  10931  1.5779e+07
    01-Apr-2015  6.4945e+05  85.656   102.89  0.90282     1.0768      57.445  73.994  8.8   0.030657  1.75  25.7  81.8  34230  20550  1.605e+07
    01-May-2015  6.0572e+05  85.965   102.38  0.90459     1.083       57.304  74.428  8.8   0.28005   1.75  25.7  78.2  38583  15743  1.6232e+07
    01-Jun-2015  6.504e+05   86.288   102.58  0.90773     1.0863      57.182  74.844  8.9   0.020023  1.5   25.7  79    33416  30593  1.6469e+07
    01-Jul-2015  6.271e+05   86.579   101.88  0.9152      1.0744      57.078  75.241  9.6   0.18017   1.5   25.7  83.5  28688  25411  1.6677e+07
    01-Aug-2015  6.1878e+05  86.829   101.88  0.91788     1.0769      56.991  75.616  9.7   0.13988   1.5   25.7  79.2  27539  22182  1.6866e+07
    01-Sep-2015  5.5042e+05  87.085   102.37  0.91793     1.0752      56.921  75.967  9.7   -0.25942  1.5   25.7  79.1  16934  23636  1.7093e+07
    01-Oct-2015  6.5344e+05  87.299   103.25  0.92311     1.0807      56.867  76.293  9.7   0         1.5   25.6  83.5  67650  39265  1.7348e+07
    01-Nov-2015  6.502e+05   87.516   103.41  0.92408     1.0846      56.83   76.592  9.7   -0.18954  1.5   25.6  65.5  54509  22046  1.7536e+07
    01-Dec-2015  7.0015e+05  87.657   100.56  0.91532     1.065       56.807  76.861  9.7   0.29962   1.5   25.5  54.5  21638  25647  1.7673e+07
    01-Jan-2016  5.9513e+05  87.73    99.79   0.91401     1.0541      56.8    77.1    9.7   0.1704    1.5   25.4  47.8  9552   19836  1.7761e+07
    01-Feb-2016  7.0941e+05  87.758   98.834  0.91328     1.0468      56.807  77.312  9.7   0.42843   1.5   25.4  47.2  10983  29027  1.7954e+07
    01-Mar-2016  6.8613e+05  87.778   97.904  0.91169     1.036       56.828  77.505  9.7   -0.25826  1.5   25.4  46    22978  11587  1.8112e+07
    01-Apr-2016  5.6423e+05  87.787   97.601  0.9105      1.0316      56.863  77.681  9.7   0.18869   1.5   25.4  49.3  30154  24567  1.822e+07
% Define the dependent variable
y = data.Arrive;
% Build the design matrix: a column of ones for the intercept, then table columns 3:end as predictors
X = [ones(size(data{:,1})) data{:,3:end}];
% Perform the multiple linear regression
[b, ~, ~, ~, stats] = regress(y, X);
% Display the results
disp('Regression Coefficients:');
Regression Coefficients:
disp(b);
1.0e+06 *
4.1453
-0.0190
0.0040
-0.0960
-0.6115
-0.0022
-0.0140
0.0259
0.0070
-0.0602
-0.0196
-0.0003
-0.0000
0.0000
0.0000
disp('R-squared:');
R-squared:
disp(stats(1));
0.3997
disp('F-statistic:');
F-statistic:
disp(stats(2));
4.5189
disp('p-value:');
p-value:
disp(stats(3));
3.5809e-06
disp('Error Variance:');
Error Variance:
disp(stats(4));
3.8687e+09
This is slightly more efficient code, and the result is the same.
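One way to probe significance while staying with regress is to request its second output, the coefficient confidence intervals. The sketch below is only an illustration: it uses synthetic data (not the spreadsheet from the question), and the names Z and excludesZero are made up for the example. It assumes the Statistics and Machine Learning Toolbox is available, as the code above already does.

```matlab
% Synthetic stand-in data (NOT the poster's spreadsheet), so the sketch runs on its own.
rng(0);                                   % reproducible random numbers
n = 110;                                  % same number of rows as the table above
Z = randn(n, 3);                          % three made-up predictors
y = 5 + 2*Z(:,1) - 1.5*Z(:,3) + randn(n, 1);   % second predictor has no true effect

X = [ones(n, 1), Z];                      % intercept column first
[b, bint] = regress(y, X);                % bint: 95% confidence intervals by default

% A coefficient differs from zero at the 5% level when its interval excludes zero.
excludesZero = (bint(:,1) > 0) | (bint(:,2) < 0);
disp(table(b, bint(:,1), bint(:,2), excludesZero, ...
    'VariableNames', {'Estimate', 'Lower95', 'Upper95', 'ExcludesZero'}));
```

With the real data, the same two lines (the regress call and the excludesZero test) could be applied to the y and X built above.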
  2 Comments
형현 on 27 May 2024
Is there no problem with statistical significance, judging by R^2 or the regression coefficients?
Star Strider on 27 May 2024
There is a problem with statistical significance, because only four coefficients (including the Intercept term) are statistically significant in the usual sense of having p <= 0.05. I used fitlm to get those statistics:
data = readtable('데이터(최종).xlsx', 'Sheet', 'Sheet5')
VN = data.Properties.VariableNames;
mdl = fitlm(data{:,3:end}, data.Arrive, 'VarNames',{VN{3:end},VN{2}})
mdl =
Linear regression model:
    Arrive ~ 1 + Price_m + Volme + Relative_y + Relative_m + mine + debt + Quin + Cpi + Rate + Depo + Bull + Sale + Move + Sub

Estimated Coefficients:
                   Estimate       SE            tStat       pValue
                   ___________    __________    ________    _________
    (Intercept)    4.1453e+06     1.3912e+06    2.9797      0.0036636
    Price_m        -19030         8333.1        -2.2837     0.024619
    Volme          3965.4         2458.5        1.613       0.11007
    Relative_y     -95964         4.8213e+05    -0.19904    0.84265
    Relative_m     -6.1154e+05    3.8759e+05    -1.5778     0.11794
    mine           -2239          2986.8        -0.74964    0.45532
    debt           -14013         10099         -1.3876     0.16852
    Quin           25869          25464         1.0159      0.31227
    Cpi            6957.2         20007         0.34773     0.72881
    Rate           -60201         30164         -1.9958     0.048817
    Depo           -19627         8482.7        -2.3137     0.022838
    Bull           -265.79        754.17        -0.35243    0.7253
    Sale           -0.44722       0.71444       -0.62597    0.53284
    Move           0.81287        0.96829       0.83949     0.4033
    Sub            0.0019539      0.0098312     0.19874     0.84289

Number of observations: 110, Error degrees of freedom: 95
Root Mean Squared Error: 6.22e+04
R-squared: 0.4, Adjusted R-Squared: 0.311
F-statistic vs. constant model: 4.52, p-value = 3.58e-06
Significant_Independent_Variables = mdl.CoefficientNames(mdl.Coefficients.pValue <= 0.05)
Significant_Independent_Variables = 1x4 cell array
{'(Intercept)'} {'Price_m'} {'Rate'} {'Depo'}
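As a rough sketch of how a fitlm coefficient table can be inspected further, the example below fits a model to synthetic data (not the data from the question) with a formula and sorts the coefficients by p-value. The table tbl and its variable names x1, x2, x3 and yResp are hypothetical, chosen only for the illustration.

```matlab
% Synthetic stand-in table (NOT the poster's spreadsheet).
rng(1);                                   % reproducible random numbers
n = 110;
tbl = table(randn(n,1), randn(n,1), randn(n,1), ...
    'VariableNames', {'x1', 'x2', 'x3'});
tbl.yResp = 3 + 1.2*tbl.x1 - 0.8*tbl.x3 + randn(n, 1);   % x2 has no true effect

% Fit with a formula; fitlm adds the intercept term automatically.
mdl = fitlm(tbl, 'yResp ~ x1 + x2 + x3');

% Coefficient table sorted so the smallest p-values come first.
coefTable = sortrows(mdl.Coefficients, 'pValue');
disp(coefTable);

% Adjusted R-squared penalizes predictors that add little explanatory power.
adjR2 = mdl.Rsquared.Adjusted;
fprintf('R-squared: %.3f, adjusted: %.3f\n', mdl.Rsquared.Ordinary, adjR2);
```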
However, considering the F-statistic, the regression itself is highly significant.
These are your data. I defer to you to interpret them and the regression results. (I am not even certain what the variables are.)
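As a cross-check, the summary statistics reported above can be recomputed by hand from R-squared, the number of observations n and the number of predictors p (excluding the intercept). The formulas are the standard ones for ordinary least squares; the inputs are the rounded values printed above, so the results agree only to the displayed precision.

```latex
\begin{align*}
n &= 110, \qquad p = 14, \qquad n - p - 1 = 95, \qquad R^2 = 0.3997 \\
F &= \frac{R^2 / p}{(1 - R^2)/(n - p - 1)}
   = \frac{0.3997/14}{0.6003/95} \approx 4.52 \\
R^2_{\text{adj}} &= 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
   = 1 - 0.6003 \cdot \frac{109}{95} \approx 0.311 \\
\text{RMSE} &= \sqrt{\text{error variance}}
   = \sqrt{3.8687 \times 10^{9}} \approx 6.22 \times 10^{4}
\end{align*}
```

The gap between R-squared (0.40) and adjusted R-squared (0.31) reflects the penalty for carrying 14 predictors.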