# logp

Log unconditional probability density of naive Bayes classification model for incremental learning

*Since R2021a*

## Syntax

`lp = logp(Mdl,X)`

## Description

`lp = logp(Mdl,X)` returns the log unconditional probability densities `lp` of the observations in the predictor data `X` using the naive Bayes classification model for incremental learning `Mdl`. You can use `lp` to identify outliers in the training data.

## Examples

### Detect Outliers in Streaming Data

Train a naive Bayes classification model by using `fitcnb`, convert it to an incremental learner, and then use the incremental model to detect outliers in streaming data.

**Load and Preprocess Data**

Load the human activity data set. Randomly shuffle the data.

```matlab
load humanactivity
rng(1); % For reproducibility
n = numel(actid);
idx = randsample(n,n);
X = feat(idx,:);
Y = actid(idx);
```

For details on the data set, enter `Description` at the command line.

**Train Naive Bayes Classification Model**

Fit a naive Bayes classification model to a random sample of about 25% of the data.

```matlab
idxtt = randsample([true false false false],n,true);
TTMdl = fitcnb(X(idxtt,:),Y(idxtt))
```

```
TTMdl = 
  ClassificationNaiveBayes
              ResponseName: 'Y'
     CategoricalPredictors: []
                ClassNames: [1 2 3 4 5]
            ScoreTransform: 'none'
           NumObservations: 6167
         DistributionNames: {1x60 cell}
    DistributionParameters: {5x60 cell}
```

`TTMdl` is a `ClassificationNaiveBayes` model object representing a traditionally trained model.

**Convert Trained Model**

Convert the traditionally trained model to a naive Bayes classification model for incremental learning.

```matlab
IncrementalMdl = incrementalLearner(TTMdl)
```

```
IncrementalMdl = 
  incrementalClassificationNaiveBayes
                    IsWarm: 1
                   Metrics: [1x2 table]
                ClassNames: [1 2 3 4 5]
            ScoreTransform: 'none'
         DistributionNames: {1x60 cell}
    DistributionParameters: {5x60 cell}
```

`IncrementalMdl` is an `incrementalClassificationNaiveBayes` object. `IncrementalMdl` represents a naive Bayes classification model for incremental learning; its parameter values are the same as the parameter values in `TTMdl`.

**Detect Outliers**

Determine an unconditional density threshold for outliers by using the traditionally trained model and training data. Outliers are observations in the streaming data that yield densities lower than the threshold.

```matlab
ttlp = logp(TTMdl,X(idxtt,:));
[~,lower] = isoutlier(ttlp)
```

```
lower = -336.0424
```
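For intuition about where a threshold such as `lower` comes from: by default, `isoutlier` flags values that lie more than three scaled median absolute deviations (MAD) from the median. The following sketch reproduces that lower threshold by hand on a small made-up vector. The vector `v` and the names `lowerManual` and `lowerBuiltin` are hypothetical and are not part of this example's data.

```matlab
% Hypothetical log densities; the last value is unusually low.
v = [-10 -12 -11 -9 -13 -10 -11 -80];

% Default isoutlier rule: median minus/plus three scaled MADs.
c = -1/(sqrt(2)*erfcinv(3/2));              % scaling factor, about 1.4826
scaledMAD = c*median(abs(v - median(v)));
lowerManual = median(v) - 3*scaledMAD;

% Compare with the lower threshold that isoutlier returns.
[tf,lowerBuiltin] = isoutlier(v);
```

Because the rule is based on the median rather than the mean, a few extreme densities in the training sample have little effect on the threshold itself.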

Detect these outliers in the rest of the data. Simulate a data stream by processing 1 observation at a time. At each iteration, call `logp` to compute the log unconditional probability density of the observation and store each value.

```matlab
% Preallocation
idxil = ~idxtt;
nil = sum(idxil);
numObsPerChunk = 1;
nchunk = floor(nil/numObsPerChunk);
lp = zeros(nchunk,1);
iso = false(nchunk,1);
Xil = X(idxil,:);
Yil = Y(idxil);

% Incremental processing
for j = 1:nchunk
    ibegin = min(nil,numObsPerChunk*(j-1) + 1);
    iend = min(nil,numObsPerChunk*j);
    idx = ibegin:iend;
    lp(j) = logp(IncrementalMdl,Xil(idx,:));
    iso(j) = lp(j) < lower;
end
```

Plot the log unconditional probability densities of the streaming data. Identify the outliers.

```matlab
figure;
h1 = plot(lp);
hold on
x = 1:nchunk;
h2 = plot(x(iso),lp(iso),'r*');
h3 = yline(lower,'g--');
xlim([0 nchunk]);
ylabel('Unconditional Density')
xlabel('Iteration')
legend([h1 h2 h3],["Log unconditional probabilities" "Outliers" "Threshold"])
hold off
```

## Input Arguments

`Mdl` - Naive Bayes classification model for incremental learning

`incrementalClassificationNaiveBayes` model object

Naive Bayes classification model for incremental learning, specified as an `incrementalClassificationNaiveBayes` model object. You can create `Mdl` directly or by converting a supported, traditionally trained machine learning model using the `incrementalLearner` function. For more details, see the corresponding reference page.

You must configure `Mdl` to compute the log unconditional probability densities on a batch of observations.

- If `Mdl` is a converted, traditionally trained model, you can compute the log unconditional probability densities without any modifications.
- Otherwise, `Mdl.DistributionParameters` must be a cell matrix with `Mdl.NumPredictors` > 0 columns and at least one row, where each row corresponds to a class name in `Mdl.ClassNames`.
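As an illustration of the case where `Mdl` is not converted from a traditionally trained model, the sketch below creates an incremental model directly and fits it to data before calling `logp`. It assumes the Fisher iris data set (`fisheriris`) is available and uses a `MaxNumClasses` value of 3; both choices are assumptions made for this illustration and do not come from the text above.

```matlab
load fisheriris                 % meas (150-by-4), species (150-by-1 cell)

% A directly created model starts without fitted distribution parameters.
Mdl = incrementalClassificationNaiveBayes('MaxNumClasses',3);

% Fitting estimates Mdl.DistributionParameters (one row per class).
Mdl = fit(Mdl,meas,species);

% After fitting, logp can evaluate a batch of observations.
lp = logp(Mdl,meas(1:5,:));
```

Here `lp` has one element per row of the batch passed to `logp`.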

`X` - Batch of predictor data

floating-point matrix

Batch of predictor data with which to compute the log unconditional probability densities, specified as an *n*-by-`Mdl.NumPredictors` floating-point matrix.

For each *j* = 1 through *n*, if `X(j,:)` contains at least one `NaN`, `lp(j)` is `NaN`.

**Data Types:** `single` | `double`

## Output Arguments

`lp` - Log unconditional probability densities

floating-point vector

Log unconditional probability densities, returned as an *n*-by-1 floating-point vector. `lp(j)` is the log unconditional probability density of the predictors evaluated at `X(j,:)`.

**Data Types:** `single` | `double`

## More About

### Unconditional Probability Density

The *unconditional probability density* of the predictors is the joint density of the predictors and the class, marginalized over the classes.

In other words, the unconditional probability density is

$$P(X_1,\ldots,X_P)=\sum_{k=1}^{K}P(X_1,\ldots,X_P,Y=k)=\sum_{k=1}^{K}P(X_1,\ldots,X_P\mid Y=k)\,\pi(Y=k),$$

where *π*(*Y* = *k*) is the class prior probability. The conditional distribution of the data given the class, *P*(*X*<sub>1</sub>,...,*X*<sub>*P*</sub> | *Y* = *k*), and the class prior probability distributions are training options (that is, you specify them when training the classifier).
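To make the marginalization concrete, the following sketch evaluates the log unconditional density of one observation under two classes whose predictors are normally distributed and conditionally independent given the class (the naive Bayes assumption). All values (`mu`, `sigma`, `prior`, `x`) are made up for illustration. The log-sum-exp step is one common way to carry out the sum stably in log space; it is not claimed to be how `logp` is implemented.

```matlab
% Hypothetical model: K = 2 classes, P = 2 normally distributed predictors.
mu    = [0 0; 3 3];      % mu(k,p): mean of predictor p in class k
sigma = [1 1; 2 2];      % sigma(k,p): standard deviation of predictor p in class k
prior = [0.7; 0.3];      % pi(Y = k)
x     = [0.5 1.0];       % one observation (1-by-P)

% log P(x | Y = k): sum of the per-predictor normal log densities.
logCond = sum(-0.5*((x - mu)./sigma).^2 - log(sigma) - 0.5*log(2*pi), 2);

% log P(x, Y = k) = log P(x | Y = k) + log pi(Y = k).
logJoint = logCond + log(prior);

% Marginalize over the classes with log-sum-exp for numerical stability.
m  = max(logJoint);
lp = m + log(sum(exp(logJoint - m)));
```

Subtracting the maximum before exponentiating keeps the computation from underflowing when the joint densities are very small, which is typical when there are many predictors.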

### Prior Probability

The *prior probability* of a class is the assumed relative frequency with which observations from that class occur in a population.

## Version History

**Introduced in R2021a**

