Partial Least Squares regression - confidence interval of the predicted variable (response)
Gustavo Lunardon
on 20 Jan 2023
Commented: Gustavo Lunardon
on 25 Jan 2023
Hello all,
I would like to obtain confidence intervals for the predicted response of a PLS (Partial Least Squares) regression. Can someone help me with that? Here is my attempt:
% https://nl.mathworks.com/help/stats/partial-least-squares-regression-and-principal-components-regression.html
load spectra
X = NIR; % independent variables
y = octane; % dependent variables
PLS_comp = 3; % number of PLS components
[XL,yl,XS,YS,beta,PCTVAR,mse,stats] = plsregress(X,y,PLS_comp); % PLS regression
yfit = [ones(size(X,1),1) X]*beta; % Model fit
residuals = y - yfit; % Ordinary residuals vector
alpha_stat = 0.05; % Significance level
dgf = length(y) - PLS_comp - 1; % Degrees of freedom
RMSE_model = sqrt(sum(residuals.^2)/dgf); % Degrees-of-freedom-corrected root-mean-squared error (standard deviation estimator)
t_Student = tinv((1-alpha_stat/2),dgf); % t-value Student distribution
delta = t_Student*RMSE_model*sqrt(1+stats.T2); % CI boundaries
figure()
set(gcf,'color','white','position',[100 100 500 500])
errorbar(y,yfit,delta,'o')
hold on; grid minor;
hline = refline([1 0]);
hline.Color = 'k';
hline.LineStyle = ':';
xlabel('Measured')
ylabel('Predicted')
My questions are:
- Is there a better (or simpler) way to do this, perhaps with a standard MATLAB function? In case someone is wondering about the degrees of freedom, I tried to follow the guidelines of this paper: DOI 10.1016/j.chemolab.2009.11.003
- Is this approach correct? The confidence intervals look too large to be right.
- The T2 statistic in the stats struct is only returned for the training data. How do I compute it for a new spectrum (if the same approach is used)? As written, my code cannot give confidence intervals for new predictions.
Kind regards,
Gustavo
Accepted Answer
Torsten
on 20 Jan 2023
I did not look into your code in detail, but I think you could use the output structure "gof" from MATLAB's "fit" together with "confint" to compare with your statistical parameters.
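For the question about new spectra, here is a minimal sketch of one possible approach, not a verified implementation of the cited paper. It assumes the usual leverage-based prediction-interval formula from ordinary least squares applied to the PLS scores, which is only an approximation for PLS. It relies on two documented properties of plsregress: XS is orthonormal, and stats.W satisfies XS = X0*W, where X0 is the mean-centered X. The variable Xnew is a hypothetical stand-in for new spectra. Note that stats.T2 appears to be scaled by the score variances, so it is probably about (n-1) times the leverage, which could explain why sqrt(1+stats.T2) gives very wide intervals.

```matlab
load spectra                                   % example data set used in the question
X = NIR; y = octane;
ncomp = 3;                                     % number of PLS components
[~,~,XS,~,beta,~,~,stats] = plsregress(X,y,ncomp);

n     = size(X,1);
yfit  = [ones(n,1) X]*beta;                    % fitted response on the training data
dgf   = n - ncomp - 1;                         % degrees of freedom, as in the question
s     = sqrt(sum((y - yfit).^2)/dgf);          % residual standard deviation estimate
tval  = tinv(1 - 0.05/2, dgf);                 % two-sided 95% t quantile

% Leverage of the training samples. Because XS is orthonormal,
% inv(XS'*XS) is the identity, so the leverage is the row sum of squares.
h     = sum(XS.^2, 2);
delta = tval*s*sqrt(1 + 1/n + h);              % assumed OLS-style half-width

% New spectra: center with the TRAINING mean, then project with stats.W.
Xnew     = X(1:5,:);                           % hypothetical stand-in for new data
XSnew    = (Xnew - mean(X,1))*stats.W;         % scores of the new samples
hnew     = sum(XSnew.^2, 2);                   % leverage of the new samples
ynew     = [ones(size(Xnew,1),1) Xnew]*beta;   % predicted response
deltaNew = tval*s*sqrt(1 + 1/n + hnew);        % half-width for the new predictions
```

Because Xnew here is just the first five training rows, XSnew should reproduce XS(1:5,:) up to rounding, which is a quick sanity check before applying the same lines to genuinely new spectra.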