Need Predictor Importance in Random Forest Expressed as a Percentage

3 views (last 30 days)
Hi. I'm running a code to see the importance of demographics (Predictors) on my response (Complaints). I need to express the importance as percentage, as a scale of 0 to 1 (or 0% to 100%). This is the figure I am getting is attached as "RF Importance Chart". My predictors data is attached as "PredictorsOnly.xlsx" and my response data is attached as "TotalComplaintsRF.xlsx"
X = readtable('PredictorsOnly.xlsx','PreserveVariableNames',true)
Y = readtable('TotalComplaintsRF.xlsx','PreserveVariableNames',true)
t = templateTree('NumVariablesToSample','all',...
'PredictorSelection','interaction-curvature','Surrogate','on');
rng(1); % For reproducibility
Mdl = fitrensemble(X,Y,'Method','Bag','NumLearningCycles',200, ...
'Learners',t);
yHat = oobPredict(Mdl);
R2 = corr(Mdl.Y,yHat)^2
impOOB = oobPermutedPredictorImportance(Mdl);
figure
bar(impOOB)
title('Unbiased Predictor Importance Estimates')
xlabel('Predictor variable')
ylabel('Importance')
h = gca;
h.XTickLabel = Mdl.PredictorNames;
h.XTickLabelRotation = 45;
h.TickLabelInterpreter = 'none';

Accepted Answer

Pratyush Roy
Pratyush Roy on 9 Apr 2021
Hi,
oobPermutedPredictorImportance normalizes the predictor importance by the standard error (this is common practice in the field), therefore values are not strictly scaled between 0 and 1. However one can rescale predictor importance, for example:
imp(imp<0) = 0;
imp = imp./sum(imp);
Hope this helps!

More Answers (0)

Categories

Find more on Dimensionality Reduction and Feature Extraction in Help Center and File Exchange

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!