Normalize activations across groups of channels

The group normalization operation divides the channels of the input data into groups and normalizes the activations across each group. To speed up training of convolutional neural networks and reduce the sensitivity to network initialization, use group normalization between convolution and nonlinear operations such as `relu`. You can perform instance normalization and layer normalization by setting the appropriate number of groups.
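As a minimal sketch of the last point, assuming Deep Learning Toolbox is available: using one group per channel corresponds to instance normalization, and using a single group corresponds to layer normalization. The array sizes and variable names below are illustrative only.

```matlab
channels = 6;                                  % illustrative channel count
dlX = dlarray(randn(4,4,channels,16),'SSCB');  % spatial, spatial, channel, batch
offset = zeros(channels,1);                    % one offset per channel
scaleFactor = ones(channels,1);                % one scale factor per channel

% One group per channel: each channel is normalized on its own
% (instance normalization).
dlYInstance = groupnorm(dlX,channels,offset,scaleFactor);

% A single group: all channels are normalized together
% (layer normalization).
dlYLayer = groupnorm(dlX,1,offset,scaleFactor);
```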

**Note**

This function applies the group normalization operation to `dlarray` data. If you want to apply group normalization within a `layerGraph` object or `Layer` array, use `groupNormalizationLayer`.

`dlY = groupnorm(dlX,numGroups,offset,scaleFactor)` normalizes each observation in `dlX` across groups of channels specified by `numGroups`, then applies a scale factor and offset to each channel.
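A minimal sketch of this syntax, assuming Deep Learning Toolbox is available. The sizes are illustrative, and the channel count is chosen so that it divides evenly into `numGroups` groups.

```matlab
% Illustrative input: 4-by-4 spatial, 6 channels, 16 observations.
height = 4; width = 4; channels = 6; numObservations = 16;
X = randn(height,width,channels,numObservations);
dlX = dlarray(X,'SSCB');           % label dimensions: spatial, spatial, channel, batch

numGroups = 3;                     % 6 channels split into 3 groups of 2
offset = zeros(channels,1);        % one offset per channel
scaleFactor = ones(channels,1);    % one scale factor per channel

dlY = groupnorm(dlX,numGroups,offset,scaleFactor);
```

With a zero offset and unit scale factor, `dlY` holds the normalized activations themselves.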

The normalized activation is calculated using the following formula:

$${\widehat{x}}_{i}=\frac{{x}_{i}-{\mu}_{g}}{\sqrt{{\sigma}_{g}^{2}+\epsilon}}$$

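where $x_i$ is the input activation, $\mu_g$ and $\sigma_g^2$ are the per-group mean and variance, respectively, and $\epsilon$ is the variance offset. The mean and variance are calculated per observation over all `'S'` (spatial), `'T'` (time), and `'U'` (unspecified) dimensions in `dlX` for each group of channels. The normalized activation is offset and scaled according to the following formula: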

$${y}_{i}=\gamma {\widehat{x}}_{i}+\beta .$$

The offset *β* and scale factor *γ* are specified with the `offset` and `scaleFactor` arguments.
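The two formulas can be traced by hand on an ordinary numeric array. The following is a sketch in plain MATLAB, not the toolbox implementation: it assumes an array ordered as height, width, channel, observation, groups made of consecutive channels, and a variance normalized by the number of elements. The variable names and the `epsilon` value are chosen for illustration.

```matlab
rng(0);                            % reproducible random input
X = randn(4,4,6,2);                % height, width, channels, observations
numGroups = 3;
epsilon = 1e-5;                    % variance offset (illustrative value)
gamma = ones(1,1,6);               % per-channel scale factor
beta = zeros(1,1,6);               % per-channel offset

[h,w,c,n] = size(X);

% Collect the spatial positions and channels of each group into the first
% dimension. Consecutive channels are contiguous in memory, so a reshape
% is enough.
Xg = reshape(X,h*w*(c/numGroups),numGroups,n);

mu = mean(Xg,1);                   % mean per group and observation
sigmaSq = var(Xg,1,1);             % variance per group, normalized by N

% First formula: normalize within each group.
Xhat = (Xg - mu) ./ sqrt(sigmaSq + epsilon);
Xhat = reshape(Xhat,h,w,c,n);

% Second formula: scale and offset each channel.
Y = gamma .* Xhat + beta;
```

With unit scale and zero offset, each group of `Y` has a mean close to zero and a variance slightly below one, because `epsilon` enlarges the denominator.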

The input `dlX` is a formatted `dlarray` with dimension labels. The output `dlY` is a formatted `dlarray` with the same dimension labels as `dlX`.

`dlY = groupnorm(___,'DataFormat',FMT)` also specifies the dimension format `FMT` when `dlX` is not a formatted `dlarray`, in addition to the input arguments in previous syntaxes. The output `dlY` is an unformatted `dlarray` with the same dimension order as `dlX`.
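A minimal sketch of the `'DataFormat'` syntax, assuming Deep Learning Toolbox is available. The input is created without dimension labels, so the labels are passed in the call instead. Sizes are illustrative.

```matlab
channels = 6;
dlX = dlarray(randn(4,4,channels,16));   % unformatted: no dimension labels
offset = zeros(channels,1);
scaleFactor = ones(channels,1);

% Describe the layout of dlX: spatial, spatial, channel, batch.
dlY = groupnorm(dlX,3,offset,scaleFactor,'DataFormat','SSCB');
```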

`dlY = groupnorm(___,Name,Value)` specifies options using one or more name-value pair arguments in addition to the input arguments in previous syntaxes. For example, `'Epsilon',3e-5` sets the variance offset.
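A minimal sketch of the name-value syntax, assuming Deep Learning Toolbox is available and using the `'Epsilon'` value from the example above. A larger variance offset keeps the denominator of the normalization away from zero when the variance within a group is very small.

```matlab
channels = 6;
dlX = dlarray(randn(4,4,channels,16),'SSCB');
offset = zeros(channels,1);
scaleFactor = ones(channels,1);

% Same call as the basic syntax, with the variance offset set explicitly.
dlY = groupnorm(dlX,3,offset,scaleFactor,'Epsilon',3e-5);
```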

`batchnorm` | `dlarray` | `dlconv` | `dlfeval` | `dlgradient` | `fullyconnect` | `relu`