Normalize data across grouped subsets of channels for each observation independently
The group normalization operation normalizes the input data
across grouped subsets of channels for each observation independently. To speed up training of
convolutional neural networks and reduce the sensitivity to network initialization, use group
normalization between convolution and nonlinear operations.
After normalization, the operation shifts the input by a learnable offset β and scales it by a learnable scale factor γ.
The groupnorm function applies the group normalization operation to dlarray data.
Using dlarray objects makes working with high-dimensional data easier by allowing you to label the dimensions. For example, you can label
which dimensions correspond to spatial, time, channel, and batch dimensions using the
"S", "T", "C", and "B" labels, respectively. For unspecified and other dimensions, use the
"U" label. For dlarray object functions that operate
over particular dimensions, you can specify the dimension labels by formatting the
dlarray object directly, or by using the DataFormat option.
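As a minimal sketch of dimension labeling, the following creates a formatted dlarray and queries its labels. The array sizes are arbitrary values chosen for illustration.

```matlab
% Label a 4-D array as spatial, spatial, channel, batch.
X = dlarray(rand(4,4,6,2),'SSCB');

% Query the labels and locate the channel dimension.
fmt = dims(X)                      % format of X as a character vector
channelDim = finddim(X,'C');       % index of the dimension labeled 'C'
numChannels = size(X,channelDim)   % number of channels
```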
Y = groupnorm(X,numGroups,offset,scaleFactor) applies the group normalization operation to the input data
X using the
specified number of groups and transforms it using the specified offset and scale factor.
The function normalizes over grouped subsets of the
'C' (channel) dimension and the
'S' (spatial), 'T' (time), and
'U' (unspecified) dimensions of
X for each
observation in the
'B' (batch) dimension, independently.
For unformatted input data, use the 'DataFormat' option.
Y = groupnorm(X,numGroups,offset,scaleFactor,'DataFormat',FMT) applies the group normalization operation to the unformatted
dlarray object
X with format specified by
FMT. The output
Y is an unformatted
dlarray object with dimensions
in the same order as
X. For example,
'DataFormat','SSCB' specifies data for 2-D image input with format
'SSCB' (spatial, spatial, channel, batch).
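The DataFormat syntax might be used as in the following sketch. The input sizes and the choice of three groups are illustrative assumptions, not values required by the function.

```matlab
channels = 6;
X = dlarray(rand(4,4,channels,2));   % unformatted dlarray, no dimension labels
offset = zeros(channels,1);
scaleFactor = ones(channels,1);

% Describe the dimension order with the DataFormat option.
Y = groupnorm(X,3,offset,scaleFactor,'DataFormat','SSCB');

isUnformatted = isempty(dims(Y))     % output carries no dimension labels
sameSize = isequal(size(Y),size(X))  % dimension order and sizes are preserved
```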
Use groupnorm to normalize input data across channel groups.
Create the input data as a single observation of random values with a height and width of four, and six channels.
height = 4;
width = 4;
channels = 6;
observations = 1;
X = rand(height,width,channels,observations);
X = dlarray(X,'SSCB');
Create the learnable parameters.
offset = zeros(channels,1);
scaleFactor = ones(channels,1);
Compute the group normalization. Divide the input into three groups of two channels each.
numGroups = 3;
Y = groupnorm(X,numGroups,offset,scaleFactor);
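One way to sanity-check the result is to confirm that each group of the output has approximately zero mean and unit variance. This sketch repeats the setup so that it runs on its own, and it assumes that each group consists of consecutive channels (channels 1 and 2, then 3 and 4, then 5 and 6).

```matlab
height = 4; width = 4; channels = 6; observations = 1;
X = dlarray(rand(height,width,channels,observations),'SSCB');
offset = zeros(channels,1);
scaleFactor = ones(channels,1);
numGroups = 3;
Y = groupnorm(X,numGroups,offset,scaleFactor);

% Collect statistics per group (assumes groups of consecutive channels).
Ydata = extractdata(Y);
channelsPerGroup = channels/numGroups;
groupMean = zeros(numGroups,1);
groupVar = zeros(numGroups,1);
for g = 1:numGroups
    idx = (g-1)*channelsPerGroup + (1:channelsPerGroup);
    vals = Ydata(:,:,idx,1);
    groupMean(g) = mean(vals(:));
    groupVar(g) = var(vals(:),1);   % population variance
end
```

With zero offset and unit scale factor, groupMean is close to zero and groupVar is slightly below one because of the variance offset.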
X — Input data
dlarray | numeric array
Input data, specified as a formatted
dlarray, an unformatted
dlarray, or a numeric array.
X must have a
'C' (channel) dimension.
numGroups — Number of channel groups
positive integer |
Number of channel groups to normalize across, specified as a positive integer or as an option that selects one of the special groupings in this table.
|positive integer|Divide the incoming channels into the specified number of groups. The specified number of groups must divide the number of channels of the input data exactly.|
|option for a single group|Group all incoming channels into a single group. The input data is normalized across all channels. This operation is also known as layer normalization.|
|option for one group per channel|Treat all incoming channels as separate groups. This operation is also known as instance normalization.|
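The following sketch uses only positive integer values of numGroups to illustrate the divisibility rule and the two limiting cases: one group covering all channels, and one group per channel. The input sizes are arbitrary.

```matlab
channels = 6;
X = dlarray(rand(4,4,channels,2),'SSCB');
offset = zeros(channels,1);
scaleFactor = ones(channels,1);

% The number of groups must divide the number of channels exactly.
candidates = 1:channels;
validNumGroups = candidates(mod(channels,candidates) == 0)

Yall = groupnorm(X,1,offset,scaleFactor);          % all channels in one group
Yeach = groupnorm(X,channels,offset,scaleFactor);  % each channel is its own group

% Means over the normalized subsets, one value per observation or per channel.
allMean = squeeze(mean(extractdata(Yall),[1 2 3]));
perChannelMean = squeeze(mean(extractdata(Yeach),[1 2]));
```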
offset — Offset
dlarray | numeric array
Offset β, specified as a formatted
dlarray, an unformatted
dlarray, or a numeric array with one nonsingleton dimension
with size matching the size of the
'C' (channel) dimension of the
input X.
If offset is a formatted
dlarray object, then
the nonsingleton dimension must have label
'C' (channel).
scaleFactor — Scale factor
dlarray | numeric array
Scale factor γ, specified as a formatted
dlarray, an unformatted
dlarray, or a numeric array with one nonsingleton
dimension with size matching the size of the
'C' (channel) dimension
of the input
X.
If scaleFactor is a formatted
dlarray object,
then the nonsingleton dimension must have label
'C' (channel).
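To see how the offset and scale factor act on each channel, this sketch compares the output for per-channel parameters against the normalized output computed with zero offset and unit scale factor. The parameter values are arbitrary and chosen only for illustration.

```matlab
channels = 6;
X = dlarray(rand(4,4,channels,2),'SSCB');

% Baseline: normalized activations with no shift and no scaling.
Ybase = groupnorm(X,3,zeros(channels,1),ones(channels,1));

% Per-channel offset and scale factor, one value per channel.
offset = (1:channels)';
scaleFactor = 0.5*(1:channels)';
Y = groupnorm(X,3,offset,scaleFactor);

% Expected result: scale and shift each channel of the baseline.
expected = reshape(scaleFactor,1,1,[]).*extractdata(Ybase) + reshape(offset,1,1,[]);
maxDiff = max(abs(extractdata(Y) - expected),[],'all')
```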
Specify optional pairs of arguments as name-value pairs, where Name is
the argument name and
Value is the corresponding value.
Name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose
Name in quotes.
Example: 'Epsilon',3e-5 sets the variance offset to 3e-5.
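A short sketch of the name-value syntax, using the comma-separated form that works in all releases. The input sizes and group count are illustrative.

```matlab
channels = 6;
X = dlarray(rand(4,4,channels,2),'SSCB');
offset = zeros(channels,1);
scaleFactor = ones(channels,1);

Ydefault = groupnorm(X,3,offset,scaleFactor);             % default variance offset
Yeps = groupnorm(X,3,offset,scaleFactor,'Epsilon',3e-5);  % larger variance offset

% A larger variance offset changes the normalized values slightly.
maxChange = max(abs(extractdata(Yeps) - extractdata(Ydefault)),[],'all')
```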
DataFormat — Dimension order of unformatted data
character vector | string scalar
Dimension order of unformatted input data, specified as a character vector or string
scalar FMT that provides a label for each dimension of the data.
When you specify the format of a
dlarray object, each character provides a
label for each dimension of the data and must be one of the following:
"S": Spatial
"C": Channel
"B": Batch (for example, samples and observations)
"T": Time (for example, time steps of sequences)
"U": Unspecified
You can specify multiple dimensions labeled
"S" or "U". You can use the labels
"C", "B", and "T" at most once.
You must specify
DataFormat when the input data is not a
formatted dlarray.
Epsilon — Variance offset
1e-5 (default) | numeric scalar
Variance offset for preventing divide-by-zero errors, specified as the comma-separated pair
'Epsilon' and a numeric scalar greater than or equal to
Y — Normalized data
Normalized data, returned as a
dlarray. The output
Y has the same underlying data type as the input
X.
If the input data
X is a formatted
dlarray, then
Y has the same dimension labels as
X. If the
input data is not a formatted
dlarray, then
Y is an unformatted
dlarray with the same dimension order as the input
data.
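The output properties can be checked with a sketch like the following. It uses single-precision data for the input and the parameters, which is an illustrative choice.

```matlab
channels = 6;
X = dlarray(rand(4,4,channels,2,'single'),'SSCB');
offset = zeros(channels,1,'single');
scaleFactor = ones(channels,1,'single');

Y = groupnorm(X,3,offset,scaleFactor);

sameLabels = strcmp(dims(Y),dims(X))                            % labels carry over
sameType = strcmp(class(extractdata(Y)),class(extractdata(X)))  % underlying type carries over
```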
The group normalization operation normalizes the elements $x_i$ of the input by first calculating the mean $\mu_G$ and variance $\sigma_G^2$ over spatial, time, and grouped subsets of the channel dimensions for each observation independently. Then, it calculates the normalized activations as

$$\hat{x}_i = \frac{x_i - \mu_G}{\sqrt{\sigma_G^2 + \epsilon}},$$

where $\epsilon$ is a constant that improves numerical stability when the variance is very small. To allow for the possibility that inputs with zero mean and unit variance are not optimal for the operations that follow group normalization, the group normalization operation further shifts and scales the activations using the transformation

$$y_i = \gamma \hat{x}_i + \beta,$$

where the offset β and scale factor γ are learnable parameters that are updated during network training.
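The calculation described above can be sketched with plain numeric arrays. This is an illustration rather than the function's actual implementation. It assumes height-by-width-by-channel-by-observation data, groups made of consecutive channels, and the population variance. The names xg, xhat, and y are hypothetical.

```matlab
H = 4; W = 4; C = 6; N = 2;     % illustrative sizes
numGroups = 3;
epsilon = 1e-5;                 % variance offset
x = rand(H,W,C,N);
offset = (1:C)';                % beta, one value per channel
scaleFactor = 0.5*(1:C)';       % gamma, one value per channel

% Split the channel dimension into groups of consecutive channels.
xg = reshape(x,H,W,C/numGroups,numGroups,N);

% Mean and variance over spatial dimensions and channels within each group.
mu = mean(xg,[1 2 3]);
sigmaSq = var(xg,1,[1 2 3]);

% Normalize, restore the original shape, then scale and shift per channel.
xhat = reshape((xg - mu)./sqrt(sigmaSq + epsilon),H,W,C,N);
y = reshape(scaleFactor,1,1,C).*xhat + reshape(offset,1,1,C);
```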
 Wu, Yuxin, and Kaiming He. “Group Normalization.” Preprint submitted June 11, 2018. https://arxiv.org/abs/1803.08494.
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
Usage notes and limitations:
When at least one of the following input arguments is a
dlarray with underlying data of type
gpuArray, this function runs on the GPU:
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
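A hedged sketch of GPU use: the input data is moved to the GPU only when a supported device is available, so the same code also runs on the CPU. The sizes and group count are illustrative.

```matlab
channels = 6;
data = rand(4,4,channels,2,'single');
if canUseGPU
    data = gpuArray(data);   % requires Parallel Computing Toolbox
end
X = dlarray(data,'SSCB');

Y = groupnorm(X,3,zeros(channels,1,'single'),ones(channels,1,'single'));

% Check where the result lives, and bring it back to host memory if needed.
Ydata = extractdata(Y);
ranOnGPU = isa(Ydata,'gpuArray')
if ranOnGPU
    Ydata = gather(Ydata);
end
```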