# Nonlinear Logistic Regression

This example shows two ways of fitting a nonlinear logistic regression model. The first method uses maximum likelihood (ML) and the second method uses generalized least squares (GLS) via the function `fitnlm` from Statistics and Machine Learning Toolbox™.

### Problem Description

Logistic regression is a special type of regression in which the goal is to model the probability of something as a function of other variables. Consider a set of predictor vectors ${x}_{1},\dots ,{x}_{N}$ where $N$ is the number of observations and ${x}_{i}$ is a column vector containing the values of the $d$ predictors for the $i$ th observation. The response variable for ${x}_{i}$ is ${Z}_{i}$ where ${Z}_{i}$ represents a Binomial random variable with parameters $n$, the number of trials, and ${\mu }_{i}$, the probability of success for trial $i$. The normalized response variable is ${Y}_{i}={Z}_{i}/n$ - the proportion of successes in $n$ trials for observation $i$. Assume that responses ${Y}_{i}$ are independent for $i=1,\dots ,N$. For each $i$:

`$E\left({Y}_{i}\right)={\mu }_{i}$`

$Var\left({Y}_{i}\right)=\frac{{\mu }_{i}\left(1-{\mu }_{i}\right)}{n}$.

Consider modeling ${\mu }_{i}$ as a function of predictor variables ${x}_{i}$.

In linear logistic regression, you can use the function `fitglm` to model ${\mu }_{i}$ as a function of ${x}_{i}$ as follows:

`$\mathrm{log}\left(\frac{{\mu }_{i}}{1-{\mu }_{i}}\right)={x}_{i}^{T}\beta$`

with $\beta$ representing a set of coefficients multiplying the predictors in ${x}_{i}$. However, suppose you need a nonlinear function on the right-hand-side:

`$\mathrm{log}\left(\frac{{\mu }_{i}}{1-{\mu }_{i}}\right)=f\left({x}_{i},\beta \right).$`

There are functions in Statistics and Machine Learning Toolbox™ for fitting nonlinear regression models, but not for fitting nonlinear logistic regression models. This example shows how you can use toolbox functions to fit those models.

### Direct Maximum Likelihood (ML)

The ML approach maximizes the log likelihood of the observed data. The likelihood is easily computed using the Binomial probability (or density) function as computed by the `binopdf` function.

### Generalized Least Squares (GLS)

You can estimate a nonlinear logistic regression model using the function `fitnlm`. This might seem surprising at first since `fitnlm` does not accommodate Binomial distribution or any link functions. However, `fitnlm` can use Generalized Least Squares (GLS) for model estimation if you specify the mean and variance of the response. If GLS converges, then it solves the same set of nonlinear equations for estimating $\beta$ as solved by ML. You can also use GLS for quasi-likelihood estimation of generalized linear models. In other words, we should get the same or equivalent solutions from GLS and ML. To implement GLS estimation, provide the nonlinear function to fit, and the variance function for the Binomial distribution.

Mean or model function

The model function describes how ${\mu }_{i}$ changes with $\beta$. For `fitnlm`, the model function is:

`${\mu }_{i}=\frac{1}{1\phantom{\rule{0.16666666666666666em}{0ex}}\phantom{\rule{0.16666666666666666em}{0ex}}+\phantom{\rule{0.16666666666666666em}{0ex}}\phantom{\rule{0.16666666666666666em}{0ex}}\text{exp}\left\{-f\left({x}_{i},\beta \right)\right\}}$`

Weight function

`fitnlm` accepts observation weights as a function handle using the `'Weights'` name-value pair argument. When using this option, `fitnlm` assumes the following model:

`$E\left({Y}_{i}\right)={\mu }_{i}$`

`$Var\left({Y}_{i}\right)=\frac{{\sigma }^{2}}{w\left({\mu }_{i}\right)}$`

where responses ${Y}_{i}$ are assumed to be independent, and $w$ is a custom function handle that accepts ${\mu }_{i}$ and returns an observation weight. In other words, the weights are inversely proportional to the response variance. For the Binomial distribution used in the logistic regression model, create the weight function as follows:

`$w\left({\mu }_{i}\right)=\frac{1}{Var\left({y}_{i}\right)}=\frac{n}{{\mu }_{i}\left(1-{\mu }_{i}\right)}$`

`fitnlm` models the variance of the response ${Y}_{i}$ as ${\sigma }^{2}/w\left({\mu }_{i}\right)$ where ${\sigma }^{2}$ is an extra parameter that is present in GLS estimation, but absent in the logistic regression model. However, this typically does not affect the estimation of $\beta$, and it provides a "dispersion parameter" to check on the assumption that the ${Z}_{i}$ values have a Binomial distribution.

An advantage of using `fitnlm` over direct ML is that you can perform hypothesis tests and compute confidence intervals on the model coefficients.

### Generate Example Data

To illustrate the differences between ML and GLS fitting, generate some example data. Assume that ${x}_{i}$ is one dimensional and suppose the true function $f$ in the nonlinear logistic regression model is the Michaelis-Menten model parameterized by a $2×1$ vector $\beta$:

`$f\left({x}_{i},\beta \right)=\frac{{\beta }_{1}{x}_{i}}{{\beta }_{2}+{x}_{i}}.$`

`myf = @(beta,x) beta(1)*x./(beta(2) + x);`

Create a model function that specifies the relationship between ${\mu }_{i}$ and $\beta$.

`mymodelfun = @(beta,x) 1./(1 + exp(-myf(beta,x)));`

Create a vector of one dimensional predictors and the true coefficient vector $\beta$.

```rng(300,'twister'); x = linspace(-1,1,200)'; beta = [10;2];```

Compute a vector of ${\mu }_{i}$ values for each predictor.

`mu = mymodelfun(beta,x);`

Generate responses ${z}_{i}$ from a Binomial distribution with success probabilities ${\mu }_{i}$ and number of trials $n$.

```n = 50; z = binornd(n,mu);```

Normalize the responses.

`y = z./n;`

### ML Approach

The ML approach defines the negative log likelihood as a function of the $\beta$ vector, and then minimizes it with an optimization function such as `fminsearch`. Specify `beta0` as the starting value for $\beta$.

```mynegloglik = @(beta) -sum(log(binopdf(z,n,mymodelfun(beta,x)))); beta0 = [3;3]; opts = optimset('fminsearch'); opts.MaxFunEvals = Inf; opts.MaxIter = 10000; betaHatML = fminsearch(mynegloglik,beta0,opts)```
```betaHatML = 2×1 9.9259 1.9720 ```

The estimated coefficients in `betaHatML` are close to the true values of `[10;2]`.

### GLS Approach

The GLS approach creates a weight function for `fitnlm` previously described.

`wfun = @(xx) n./(xx.*(1-xx));`

Call `fitnlm` with custom mean and weight functions. Specify `beta0` as the starting value for $\beta$.

`nlm = fitnlm(x,y,mymodelfun,beta0,'Weights',wfun)`
```nlm = Nonlinear regression model: y ~ F(beta,x) Estimated Coefficients: Estimate SE tStat pValue ________ _______ ______ __________ b1 9.926 0.83135 11.94 4.193e-25 b2 1.972 0.16438 11.996 2.8182e-25 Number of observations: 200, Error degrees of freedom: 198 Root Mean Squared Error: 1.16 R-Squared: 0.995, Adjusted R-Squared 0.995 F-statistic vs. zero model: 1.88e+04, p-value = 2.04e-226 ```

Get an estimate of $\beta$ from the fitted `NonLinearModel` object `nlm`.

`betaHatGLS = nlm.Coefficients.Estimate`
```betaHatGLS = 2×1 9.9260 1.9720 ```

As in the ML method, the estimated coefficients in `betaHatGLS` are close to the true values of `[10;2]`. The small p-values for both ${\beta }_{1}$ and ${\beta }_{2}$ indicate that both coefficients are significantly different from $0$.

### Compare ML and GLS Approaches

Compare estimates of $\beta$.

`max(abs(betaHatML - betaHatGLS))`
```ans = 1.1460e-05 ```

Compare fitted values using ML and GLS

```yHatML = mymodelfun(betaHatML ,x); yHatGLS = mymodelfun(betaHatGLS,x); max(abs(yHatML - yHatGLS))```
```ans = 1.2746e-07 ```

ML and GLS approaches produce similar solutions.

### Plot fitted values using ML and GLS

```figure plot(x,y,'g','LineWidth',1) hold on plot(x,yHatML ,'b' ,'LineWidth',1) plot(x,yHatGLS,'m--','LineWidth',1) legend('Data','ML','GLS','Location','Best') xlabel('x') ylabel('y and fitted values') title('Data y along with ML and GLS fits.')``` ML and GLS produce similar fitted values.

### Plot estimated nonlinear function using ML and GLS

Plot true model for $f\left({x}_{i},\beta \right)$. Add plot for the initial estimate of $f\left({x}_{i},\beta \right)$ using $\beta ={\beta }_{0}$ and plots for ML and GLS based estimates of $f\left({x}_{i},\beta \right)$.

```figure plot(x,myf(beta,x),'r','LineWidth',1) hold on plot(x,myf(beta0,x),'k','LineWidth',1) plot(x,myf(betaHatML,x),'c--','LineWidth',1) plot(x,myf(betaHatGLS,x),'b-.','LineWidth',1) legend('True f','Initial f','Estimated f with ML','Estimated f with GLS','Location','Best') xlabel('x') ylabel('True and estimated f') title('Comparison of true f with estimated f using ML and GLS.')``` The estimated nonlinear function $f$ using both ML and GLS methods is close to the true nonlinear function $f$. You can use a similar technique to fit other nonlinear generalized linear models like nonlinear Poisson regression.