# indexcrossentropy

## Syntax

```
loss = indexcrossentropy(Y,targets)
loss = indexcrossentropy(___,Name=Value)
```
## Description

The index cross-entropy operation computes the cross-entropy loss between network predictions and targets specified as integer class indices for single-label classification tasks.

Index cross-entropy loss, also known as *sparse cross-entropy loss*, is a
more memory and computationally efficient alternative to the standard cross-entropy
loss algorithm. It does not require binary or one-hot encoded targets. Instead, the
function requires targets specified as integer class indices. Index cross-entropy
loss is particularly well-suited to targets that span many classes, where one-hot
encoded data presents unnecessary memory overhead.

`loss = indexcrossentropy(Y,targets)` calculates the categorical cross-entropy loss between the formatted predictions `Y` and the integer class indices `targets` for single-label classification tasks.

For unformatted input data, use the `DataFormat` argument.

`loss = indexcrossentropy(___,Name=Value)` specifies options using one or more name-value arguments in addition to any combination of the input arguments from previous syntaxes. For example, `DataFormat="BC"` specifies that the first and second dimensions of the input data correspond to the batch and channel dimensions, respectively.

## Examples

### Index Cross-Entropy Loss for Single-Label Classification

Create an array of prediction scores for seven observations over five classes.

```
numClasses = 5;
numObservations = 7;
Y = rand(numClasses,numObservations);
Y = dlarray(Y,"CB");
Y = softmax(Y)
```

```
Y = 
  5(C) × 7(B) dlarray

    0.2205    0.1175    0.1140    0.1153    0.1963    0.2416    0.3104
    0.2415    0.1408    0.2571    0.1526    0.1056    0.2381    0.1582
    0.1109    0.1842    0.2537    0.2500    0.2381    0.1677    0.2021
    0.2434    0.2777    0.1583    0.2210    0.2592    0.2182    0.1605
    0.1837    0.2798    0.2169    0.2612    0.2008    0.1344    0.1688
```

Create an array of targets specified as class indices.

```
T = randi(numClasses,[1 numObservations])
```

```
T = 1×7

     5     4     2     5     1     3     2
```

Compute the index cross-entropy loss between the predictions and the targets.

```
loss = indexcrossentropy(Y,T)
```

```
loss = 
  1×1 dlarray

    1.5620
```

### Weighted Index Cross-Entropy Loss

Create an array of prediction scores for seven observations over five classes.

```
numClasses = 5;
numObservations = 7;
Y = rand(numClasses,numObservations);
Y = dlarray(Y,"CB");
Y = softmax(Y)
```

```
Y = 
  5(C) × 7(B) dlarray

    0.2205    0.1175    0.1140    0.1153    0.1963    0.2416    0.3104
    0.2415    0.1408    0.2571    0.1526    0.1056    0.2381    0.1582
    0.1109    0.1842    0.2537    0.2500    0.2381    0.1677    0.2021
    0.2434    0.2777    0.1583    0.2210    0.2592    0.2182    0.1605
    0.1837    0.2798    0.2169    0.2612    0.2008    0.1344    0.1688
```

Create an array of targets specified as class indices.

```
T = randi(numClasses,[1 numObservations])
```

```
T = 1×7

     5     4     2     5     1     3     2
```

Compute the weighted cross-entropy loss between the predictions and the targets using a vector of class weights. Specify a weights format of `"UC"` (unspecified, channel) using the `WeightsFormat` argument.

```
weights = rand(1,numClasses)
```

```
weights = 1×5

    0.7655    0.7952    0.1869    0.4898    0.4456
```

```
loss = indexcrossentropy(Y,T,weights,WeightsFormat="UC")
```

```
loss = 
  1×1 dlarray

    0.8725
```

## Input Arguments

`Y` — Predictions  
`dlarray` object | numeric array

Predictions, specified as a formatted or unformatted `dlarray` object, or a numeric array. When `Y` is not a formatted `dlarray`, you must specify the dimension format using the `DataFormat` argument.

If `Y` is a numeric array, `targets` must be a `dlarray` object.

`targets` — Target classification labels  
`dlarray` object | numeric array

Target classification labels, specified as a formatted or unformatted `dlarray` object, or a numeric array.

Specify the targets as an array containing integer class indices with the same size and format as `Y`, excluding the channel dimension. Each element of `targets` must be a positive integer less than or equal to the size of the channel dimension of `Y` (the number of classes), or equal to the `MaskIndex` argument value.

If `targets` and `Y` are formatted `dlarray` objects, then the format of `targets` must be the same as the format of `Y`, excluding the `"C"` (channel) dimension. If `targets` is a formatted `dlarray` object and `Y` is not a formatted `dlarray` object, then the format of `targets` must be the same as the `DataFormat` argument value, excluding the `"C"` (channel) dimension.

If `targets` is an unformatted `dlarray` or a numeric array, then the function applies the format of `Y` or the value of `DataFormat` to `targets`.

**Tip**

Formatted `dlarray` objects automatically permute the dimensions of the underlying data to the order `"S"` (spatial), `"C"` (channel), `"B"` (batch), `"T"` (time), then `"U"` (unspecified). To ensure that the dimensions of `Y` and `targets` are consistent, when `Y` is a formatted `dlarray`, also specify `targets` as a formatted `dlarray`.

`weights` — Weights  
`dlarray` object | numeric array

Weights, specified as a `dlarray` object or a numeric array.

To specify class weights, specify a vector with a `"C"` (channel) dimension of size matching the `"C"` (channel) dimension of `Y` and a singleton `"U"` (unspecified) dimension. Specify the dimensions of the class weights by using a formatted `dlarray` object or by using the `WeightsFormat` argument.

To specify observation weights, specify a vector with a `"B"` (batch) dimension of size matching the `"B"` (batch) dimension of `Y`. Specify the `"B"` (batch) dimension of the weights by using a formatted `dlarray` object or by using the `WeightsFormat` argument.

To specify weights for each element of the input independently, specify the weights as an array of the same size as `Y`. In this case, if `weights` is not a formatted `dlarray` object, then the function uses the same format as `Y`. Alternatively, specify the weights format using the `WeightsFormat` argument.

### Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

**Example:** `indexcrossentropy(Y,T,DataFormat="BC")` specifies that the first and second dimensions of the input data correspond to the batch and channel dimensions, respectively.

`MaskIndex` — Masked value index  
`0` (default) | numeric scalar

Masked value index, specified as a numeric scalar.

The function excludes elements of the input data from loss computation when the target elements match the mask index.
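For example, when classifying padded sequences, time steps whose target equals the mask index do not contribute to the loss. This minimal sketch uses illustrative sizes and random data, and assumes the padded time steps are marked with `0` (the default mask index):

```
% Predictions for 2 sequences of 4 time steps over 3 classes.
numClasses = 3;
Y = softmax(dlarray(rand(numClasses,2,4),"CBT"));

% Targets with 0 marking padded time steps.
T = dlarray([1 2 3 0; 2 1 0 0],"BT");

% Time steps where the target equals MaskIndex are excluded from the loss.
loss = indexcrossentropy(Y,T,MaskIndex=0);
```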

**Data Types:** `single` | `double` | `int8` | `int16` | `int32` | `int64` | `uint8` | `uint16` | `uint32` | `uint64`

`Reduction` — Loss value array reduction mode  
`"sum"` (default) | `"none"`

Loss value array reduction mode, specified as `"sum"` or `"none"`.

If the `Reduction` argument is `"sum"`, then the function sums all elements in the array of loss values. In this case, the output `loss` is a scalar.

If the `Reduction` argument is `"none"`, then the function does not reduce the array of loss values. In this case, the output `loss` is an unformatted `dlarray` object of the same size as `Y`.
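As a minimal sketch with arbitrary random data, `Reduction="none"` returns the unreduced loss array, which you can then inspect or reduce with custom logic:

```
numClasses = 5;
numObservations = 7;
Y = softmax(dlarray(rand(numClasses,numObservations),"CB"));
T = randi(numClasses,[1 numObservations]);

% Unreduced loss array with the same size as Y.
elementLoss = indexcrossentropy(Y,T,Reduction="none");
size(elementLoss)
```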

`NormalizationFactor` — Divisor for normalizing reduced loss  
`"batch-size"` (default) | `"all-elements"` | `"target-included"` | `"none"`

Divisor for normalizing the reduced loss, specified as one of these options:

- `"batch-size"` — Normalize the loss by dividing it by the number of observations in `Y`.
- `"all-elements"` — Normalize the loss by dividing it by the number of elements of `Y`.
- `"target-included"` — Normalize the loss by dividing the loss values by the product of the number of observations and the number of elements that are not excluded according to the `MaskIndex` argument.
- `"none"` — Do not normalize the loss.

If `Reduction` is `"none"`, then this option has no effect.

`DataFormat` — Description of data dimensions  
character vector | string scalar

Description of the data dimensions, specified as a character vector or string scalar.

A data format is a string of characters, where each character describes the type of the corresponding data dimension.

The characters are:

- `"S"` — Spatial
- `"C"` — Channel
- `"B"` — Batch
- `"T"` — Time
- `"U"` — Unspecified

For example, consider an array containing a batch of sequences where the first, second, and third dimensions correspond to channels, observations, and time steps, respectively. You can specify that this array has the format `"CBT"` (channel, batch, time).

You can specify multiple dimensions labeled `"S"` or `"U"`. You can use the labels `"C"`, `"B"`, and `"T"` at most once each. The software ignores singleton trailing `"U"` dimensions after the second dimension.

If the input data is not a formatted `dlarray` object, then you must specify the `DataFormat` option.

For more information, see Deep Learning Data Formats.

**Data Types:** `char` | `string`
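For example, this sketch passes unformatted data whose first dimension is the batch and whose second dimension is the channel; it assumes `softmax` also accepts the same `DataFormat` argument for the unformatted input:

```
numClasses = 5;
numObservations = 7;

% Unformatted predictions: batch-by-channel.
Y = softmax(dlarray(rand(numObservations,numClasses)),DataFormat="BC");
T = randi(numClasses,[numObservations 1]);

loss = indexcrossentropy(Y,T,DataFormat="BC");
```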

`WeightsFormat` — Description of dimensions of weights  
character vector | string scalar

Description of the dimensions of the weights, specified as a character vector or string scalar.

A data format is a string of characters, where each character describes the type of the corresponding data dimension.

The characters are:

- `"S"` — Spatial
- `"C"` — Channel
- `"B"` — Batch
- `"T"` — Time
- `"U"` — Unspecified

For example, consider an array containing a batch of sequences where the first, second, and third dimensions correspond to channels, observations, and time steps, respectively. You can specify that this array has the format `"CBT"` (channel, batch, time).

You can specify multiple dimensions labeled `"S"` or `"U"`. You can use the labels `"C"`, `"B"`, and `"T"` at most once each. The software ignores singleton trailing `"U"` dimensions after the second dimension.

If `weights` is a numeric vector and `Y` has two or more nonsingleton dimensions, then you must specify the `WeightsFormat` option.

If `weights` is not a vector, or if `weights` and `Y` are both vectors, then the default value of `WeightsFormat` is the same as the format of `Y`.

For more information, see Deep Learning Data Formats.

**Data Types:** `char` | `string`

## Output Arguments

`loss` — Index cross-entropy loss  
unformatted `dlarray` object

Index cross-entropy loss, returned as an unformatted `dlarray` object with the same underlying data type as the input `Y`.

If the `Reduction` argument is `"sum"`, then the function sums all elements in the array of loss values. In this case, the output `loss` is a scalar.

If the `Reduction` argument is `"none"`, then the function does not reduce the array of loss values. In this case, the output `loss` is an unformatted `dlarray` object of the same size as `Y`.

## Algorithms

### Index Cross-Entropy Loss

Index cross-entropy loss, also known as *sparse cross-entropy loss*, is a
more memory and computationally efficient alternative to the standard cross-entropy
loss algorithm. It does not require binary or one-hot encoded targets. Instead, the
function requires targets specified as integer class indices. Index cross-entropy
loss is particularly well-suited to targets that span many classes, where one-hot
encoded data presents unnecessary memory overhead.

In particular, for each prediction in the input, the standard cross-entropy loss
function requires targets specified as 1-by-*K* vectors, each containing
only one nonzero element. To avoid the dense encoding of the zero and nonzero elements, the
index cross-entropy function requires targets specified as scalars that represent the
indices of the nonzero elements.
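To make the difference in encoding concrete, this illustrative sketch compares the storage for one observation with 10,000 classes:

```
numClasses = 10000;

% Standard cross-entropy: a one-hot vector with numClasses elements.
tOneHot = zeros(numClasses,1);
tOneHot(42) = 1;        % 10,000 stored values, only one nonzero

% Index cross-entropy: a single integer class index.
tIndex = 42;            % 1 stored value
```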

For single-label classification, the standard cross-entropy function uses the formula

$$\text{loss}=-\frac{1}{N}\sum_{n=1}^{N}\sum_{i=1}^{K}T_{n,i}\,\ln Y_{n,i},$$

where *T* is an array of one-hot encoded targets,
*Y* is an array of predictions, and *N* and
*K* are the numbers of observations and classes, respectively.

For single-label classification, the index cross-entropy loss function uses the formula

$$\text{loss}=-\frac{1}{N}\sum_{n=1}^{N}\ln Y_{n,T_{n}},$$

where *T* is an array of targets, specified as class
indices.
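The two formulations compute the same value. One way to check this (a sketch that assumes the integer targets are one-hot encoded along the first, channel dimension using `onehotencode`) is:

```
numClasses = 5;
numObservations = 7;
Y = softmax(dlarray(rand(numClasses,numObservations),"CB"));
idx = randi(numClasses,[1 numObservations]);

% Index cross-entropy with integer class indices.
lossIndex = indexcrossentropy(Y,idx);

% Standard cross-entropy with the same targets, one-hot encoded
% along the first (channel) dimension.
T = onehotencode(categorical(idx),1);
lossStandard = crossentropy(Y,dlarray(T,"CB"));

% The two losses agree up to floating-point round-off.
abs(lossIndex - lossStandard)
```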

This table shows the index cross-entropy loss formulas for different tasks.

| Task | Description | Loss |
| --- | --- | --- |
| Single-label classification | Index cross-entropy loss for mutually exclusive classes. This is useful when observations must have only a single label. | $\text{loss}=-\frac{1}{N}\sum_{n=1}^{N}\ln Y_{n,T_{n}}$, where $T_{n}$ is the class index of observation $n$. |
| Single-label classification with weighted classes | Index cross-entropy loss with class weights. This is useful for datasets with imbalanced classes. | $\text{loss}=-\frac{1}{N}\sum_{n=1}^{N}w_{T_{n}}\ln Y_{n,T_{n}}$, where $w_{i}$ is the weight for class $i$. |
| Sequence-to-sequence classification | Index cross-entropy loss with masked time steps. This is useful for ignoring loss values that correspond to padded data. | $\text{loss}=-\frac{1}{N}\sum_{n=1}^{N}\sum_{t=1}^{S}\left[T_{n,t}\ne m\right]\ln Y_{n,t,T_{n,t}}$, where $m$ is the `MaskIndex` value, $S$ is the number of time steps, and $\left[T_{n,t}\ne m\right]$ is 1 when $T_{n,t}\ne m$ and 0 otherwise. |

### Deep Learning Array Formats

Most deep learning networks and functions operate on different dimensions of the input data in different ways.

For example, an LSTM operation iterates over the time dimension of the input data, and a batch normalization operation normalizes over the batch dimension of the input data.

To provide input data with labeled dimensions or input data with additional layout information, you can use *data formats*.

A data format is a string of characters, where each character describes the type of the corresponding data dimension.

The characters are:

- `"S"` — Spatial
- `"C"` — Channel
- `"B"` — Batch
- `"T"` — Time
- `"U"` — Unspecified

For example, consider an array containing a batch of sequences where the first, second, and third dimensions correspond to channels, observations, and time steps, respectively. You can specify that this array has the format `"CBT"` (channel, batch, time).

To create formatted input data, create a `dlarray` object and specify the format using the second argument.

To provide additional layout information with unformatted data, specify the formats using the `DataFormat` and `WeightsFormat` arguments.

For more information, see Deep Learning Data Formats.

## Extended Capabilities

### GPU Arrays

Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

The `indexcrossentropy` function supports GPU array input with these usage notes and limitations:

- When at least one of these input arguments is a `gpuArray` or a `dlarray` with underlying data of type `gpuArray`, this function runs on the GPU:
  - `Y`
  - `targets`
  - `weights`
  - `MaskIndex`
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).

## Version History

**Introduced in R2024b**

## See Also

`dlarray` | `dlgradient` | `dlfeval` | `crossentropy` | `softmax` | `sigmoid` | `huber` | `l1loss` | `l2loss`
