Understanding the update equation in logistic regression/classifier
Following a tutorial, I tried implementing the steps of building a logistic classifier below:
%% logistic regression tutorial; https://machinelearningmastery.com/logistic-regression-tutorial-for-machine-learning/
% training data: two input features (columns 1-2) and a 0/1 class label (column 3)
temp = [2.7810836   2.550537003    0
        1.465489372 2.362125076    0
        3.396561688 4.400293529    0
        1.38807019  1.850220317    0
        3.06407232  3.005305973    0
        7.627531214 2.759262235    1
        5.332441248 2.088626775    1
        6.922596716 1.77106367     1
        8.675418651 -0.2420686549  1
        7.673756466 3.508563011    1];
X1 = temp(:,1);
X2 = temp(:,2);
Y  = temp(:,3);
% define the logistic (sigmoid) function as an anonymous function "log_trans"
log_trans = @(x)(1 ./ (1 + exp(-x)));
% initialize coefficients, learning rate, and bookkeeping
B0 = 0;
B1 = 0;
B2 = 0;
alpha = 0.3;                        % learning rate, per the tutorial
epoc = 1;                           % epoch counter
dataSize = size(Y,1);
Y_pred = zeros(dataSize,1);         % preallocate predicted labels
Acc = zeros(10,1);                  % preallocate per-epoch accuracy (10 epochs)
% stochastic gradient descent: one row at a time, 10 passes over the data
for i2 = 1:dataSize*10
    i1 = mod(i2-1, dataSize) + 1;       % cycle through the rows 1..dataSize
    x = B0*1 + B1*X1(i1) + B2*X2(i1);   % linear combination of the inputs
    prediction = log_trans(x);          % predicted probability of class 1
    % update each coefficient
    B0 = B0 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*1;
    B1 = B1 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*X1(i1);
    B2 = B2 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*X2(i1);
    % threshold the probability at 0.5 to get a 0/1 label
    if prediction > 0.5
        Y_pred(i1,1) = 1;
    else
        Y_pred(i1,1) = 0;
    end
    % at the end of each pass, record the training accuracy
    if i1 == dataSize
        Acc(epoc,1) = (dataSize - sum(abs(Y - Y_pred))) / dataSize;
        epoc = epoc + 1;
    end
end
It runs and produces the same final coefficient values the tutorial reports.
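For completeness, after the loop I also check the fitted coefficients by applying them to every row at once (my own addition, not part of the tutorial code):
% apply the learned coefficients to all rows and compare to the labels
final_prob = log_trans(B0 + B1*X1 + B2*X2);   % predicted probability for each row
final_labels = double(final_prob > 0.5);      % threshold at 0.5 to get 0/1 labels
training_accuracy = mean(final_labels == Y)   % fraction of rows classified correctly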
My question is regarding these lines:
B0 = B0 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*1;
B1 = B1 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*X1(i1);
B2 = B2 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*X2(i1);
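In other words, each line has the form B = B + alpha * (Y(i1) - prediction) * prediction * (1 - prediction) * input, where input is 1 for B0, X1(i1) for B1, and X2(i1) for B2.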
I understand that each iteration updates the previous coefficient values (B0, B1, B2), and that the update is scaled by alpha (set to 0.3 per the tutorial). What I cannot arrive at is a satisfyingly intuitive understanding of the remaining three terms: (Y(i1) - prediction), prediction, and (1 - prediction).
prediction is the output of the logistic curve (excuse my lack of formal language), so it ranges from 0 to 1, while Y is a column vector of 0/1 labels. I can at least see that the closer prediction is to Y(i1), the better the current coefficients are performing, and so the smaller the incremental adjustment. What I cannot intuit is the inclusion of the prediction and (1 - prediction) factors, and I would appreciate some help here.
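In case it helps, here is a quick numeric check I ran on the prediction*(1 - prediction) factor by itself (my own snippet, not from the tutorial): it matches the slope of the logistic curve at the current input, though I still do not see intuitively why that slope belongs in the update.
% numeric check: the slope of log_trans at x0 equals log_trans(x0)*(1 - log_trans(x0))
log_trans = @(x)(1 ./ (1 + exp(-x)));
x0 = 0.7;                          % arbitrary test point
h  = 1e-6;                         % small step for a central difference
numeric_slope  = (log_trans(x0 + h) - log_trans(x0 - h)) / (2*h);
analytic_slope = log_trans(x0) * (1 - log_trans(x0));
fprintf('numeric %.8f  analytic %.8f\n', numeric_slope, analytic_slope);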