What is the difference between oobPredict and predict with ensemble of bagged decision trees?

Question

Faranak on 26 Aug 2024

Direct link to this question

https://nl.mathworks.com/matlabcentral/answers/2147934-what-is-the-difference-between-oobpredict-and-predict-with-ensemble-of-bagged-decision-trees

Commented: Faranak on 29 Aug 2024

1- I am using both functions to predict a response with a random forest, but the predict function gives a higher percentage of explained variance than oobPredict. Why is that? I think there is something fundamental that I have not yet fully grasped.

2- If these methods differ in the way they weigh trees, how can I make them homogeneous?

3- Can one use oobPredict in some way to make predictions with a new set of data?

Answer 1

Malay Agarwal on 26 Aug 2024

Direct link to this answer

https://nl.mathworks.com/matlabcentral/answers/2147934-what-is-the-difference-between-oobpredict-and-predict-with-ensemble-of-bagged-decision-trees#answer_1505259

Edited: Malay Agarwal on 26 Aug 2024

Hi @Faranak,

The "oobPredict" function is used to get a more realistic estimate of the model's performance. For each data sample, it only considers those trees for which the sample was out-of-bag during training, that is, the trees that have not seen the sample. Because those trees have never seen the sample, each prediction is a genuine out-of-sample prediction and tends to have a larger error, which lowers the percentage of explained variance.

On the other hand, the "predict" function uses all the trees to obtain a prediction for a sample. If the sample is from the training set, most of the trees will have seen it during training (with default settings, each bootstrap replica contains roughly 63% of the distinct training samples), so the model appears to account for more of the variance in the dataset.

This is similar to having a training set and a validation set when training a neural network (https://en.wikipedia.org/wiki/Training,_validation,_and_test_data_sets). The network will typically report a higher error and explain less of the variance on the validation set, since it is not explicitly trained on those samples. The out-of-bag samples act as the validation set, since only those trees which haven't seen the sample during training have a say in the final prediction.
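The gap can be reproduced on synthetic data. The following is only a minimal sketch, not your setup: the data, the number of trees (100), the train/test split and the helper "r2" are all made up for illustration, and it assumes the Statistics and Machine Learning Toolbox is installed.

```matlab
% Illustrative synthetic regression problem (all data here are made up)
rng(1);                                   % for reproducibility
n = 300;
X = rand(n, 3);
Y = sin(2*pi*X(:,1)) + 0.5*X(:,2) + 0.3*randn(n, 1);

% Hold out the last 100 rows to play the role of "new" observations
idxTest = (1:n)' > 200;
Xtrain = X(~idxTest, :);  Ytrain = Y(~idxTest);
Xtest  = X(idxTest, :);   Ytest  = Y(idxTest);

Mdl = TreeBagger(100, Xtrain, Ytrain, ...
    'Method', 'regression', 'OOBPrediction', 'on');

% Hypothetical helper: fraction of variance explained (R^2)
r2 = @(y, yhat) 1 - sum((y - yhat).^2) / sum((y - mean(y)).^2);

r2InSample = r2(Ytrain, predict(Mdl, Xtrain));  % all trees, data seen in training
r2Oob      = r2(Ytrain, oobPredict(Mdl));       % only trees that did not see each row
r2Test     = r2(Ytest,  predict(Mdl, Xtest));   % all trees, data never seen

fprintf('in-sample R^2: %.3f\nOOB R^2: %.3f\nheld-out R^2: %.3f\n', ...
    r2InSample, r2Oob, r2Test);
```

I would expect the in-sample value to be the highest and the out-of-bag value to sit closer to the held-out one, but the exact numbers depend on the data and the random seed.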

This is explained in the documentation of "oobPredict" (https://www.mathworks.com/help/stats/treebagger.oobpredict.html#bu0qyz1-2), albeit in a less direct manner:

"For each observation that is out of bag for at least one tree, oobPredict composes the weighted mean of the class posterior probabilities by selecting the trees in which the observation is out of bag. "

With the default settings I don't think the outputs can be made more homogeneous, since "oobPredict" will always choose a different set of trees to make a prediction for a sample as compared to the "predict" function. The "TreeWeights" name-value argument is unlikely to help, since it only defines how to weigh the trees in the overall calculation of the prediction and does not affect which trees take part in the prediction.
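That said, if the goal is to make "predict" mimic "oobPredict" on the training data, one thing that may be worth trying is the "UseInstanceForTree" name-value argument of "predict", together with the "OOBIndices" property of the model. The sketch below assumes both behave as described in the TreeBagger documentation. The data and the variable names are made up, and I have not verified that the two outputs agree exactly in every release.

```matlab
rng(1);
X = rand(200, 3);
Y = X(:,1) + 0.2*randn(200, 1);
Mdl = TreeBagger(50, X, Y, 'Method', 'regression', 'OOBPrediction', 'on');

% OOBIndices is an n-by-NumTrees logical matrix: true where an
% observation was out of bag for a given tree
useTree = Mdl.OOBIndices;

yOob    = oobPredict(Mdl);
yMasked = predict(Mdl, X, 'UseInstanceForTree', useTree);

% Rows that were in bag for every tree have no out-of-bag trees; skip them
hasOob  = any(useTree, 2);
maxDiff = max(abs(yOob(hasOob) - yMasked(hasOob)));
fprintf('largest difference: %g\n', maxDiff);
```

This only makes sense for the training observations, because "OOBIndices" has one row per training observation. It restricts "predict" to the out-of-bag trees; it does not make "oobPredict" behave like "predict".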

Coming to your last question, the "oobPredict" function does not support making predictions on new data. It exists only to evaluate the model's performance by giving a less biased estimate of its error. For new data, please use the "predict" function.
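As a rough illustration of that split in roles, here is a short sketch with made-up data ("Xnew" is a hypothetical set of new observations). "oobPredict" and "oobError" take only the trained model, while "predict" takes the new predictor matrix.

```matlab
rng(2);
Xtrain = rand(200, 3);
Ytrain = Xtrain(:,1).^2 + 0.1*randn(200, 1);
Xnew   = rand(50, 3);     % new observations, never used in training

Mdl = TreeBagger(100, Xtrain, Ytrain, ...
    'Method', 'regression', 'OOBPrediction', 'on');

% Predictions for new data: all trees vote
yNew = predict(Mdl, Xnew);

% Out-of-bag estimate of the mean squared error, computed from the
% training data stored in the model (no new data involved)
oobMse = oobError(Mdl, 'Mode', 'ensemble');
fprintf('OOB MSE estimate: %.4f\n', oobMse);
```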

Hope this helps!

1 Comment

Faranak on 29 Aug 2024

Thanks a lot Malay. Your answers made a lot of points clearer to me.
