Group normalization layer
A group normalization layer normalizes a mini-batch of data across grouped subsets of channels for each observation independently. To speed up training of convolutional neural networks and reduce the sensitivity to network initialization, use group normalization layers between convolutional layers and nonlinearities, such as ReLU layers.
After normalization, the layer scales the input by a learnable scale factor γ and shifts it by a learnable offset β.
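For example, a group normalization layer can sit between a convolution layer and a ReLU layer in a layer array. The sketch below assumes a small image classification network whose sizes (28-by-28-by-3 input, 24 filters, 3 groups) are illustrative only; the only requirement is that the number of incoming channels is divisible by the number of groups.

    % Minimal sketch: place a group normalization layer between a
    % convolution layer and its ReLU nonlinearity. The 24 convolution
    % filters are evenly divisible by the 3 normalization groups.
    layers = [
        imageInputLayer([28 28 3])
        convolution2dLayer(5,24)
        groupNormalizationLayer(3)
        reluLayer
        fullyConnectedLayer(10)
        softmaxLayer
        classificationLayer];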
layer = groupNormalizationLayer(numGroups,Name,Value) creates a group normalization layer and sets the optional 'Epsilon', Parameters and Initialization, Learn Rate and Regularization, and Name properties using one or more name-value pair arguments. You can specify multiple name-value pair arguments. Enclose each property name in quotes.
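For instance, the following sketch sets the 'Name' and 'Epsilon' properties when creating the layer; the group count and property values shown are illustrative, not defaults.

    % Create a group normalization layer with 3 groups, a custom name, and
    % a larger numerical-stability constant. Values are illustrative.
    layer = groupNormalizationLayer(3,'Name','groupnorm1','Epsilon',1e-4);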
The group normalization operation normalizes the elements x_i of the input by first calculating the mean μ_G and variance σ_G^2 over the spatial, time, and grouped subsets of the channel dimensions for each observation independently. Then, it calculates the normalized activations as

    x̂_i = (x_i − μ_G) / sqrt(σ_G^2 + ϵ),

where ϵ is a constant that improves numerical stability when the variance is very small. To allow for the possibility that inputs with zero mean and unit variance are not optimal for the operations that follow group normalization, the group normalization operation further shifts and scales the activations using the transformation

    y_i = γ x̂_i + β,
where the offset β and scale factor γ are learnable parameters that are updated during network training.
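As a concrete illustration, the following sketch applies this operation manually to one observation. The array sizes, group count, and initial values of γ and β are assumptions for the example only, and consecutive channels are grouped together in this sketch.

    % Manual sketch of group normalization for a single observation X of
    % size height-by-width-by-channels, where the channel count is
    % divisible by the number of groups. Sizes are illustrative.
    X = rand(4,4,6,'single');        % 4-by-4 spatial, 6 channels
    numGroups = 2;
    epsilon = 1e-5;
    gamma = ones(1,1,6,'single');    % learnable per-channel scale factor
    beta  = zeros(1,1,6,'single');   % learnable per-channel offset

    [H,W,C] = size(X);
    Xg = reshape(X,H,W,C/numGroups,numGroups);   % split channels into groups
    muG     = mean(Xg,[1 2 3]);                  % mean over each group
    sigmaG2 = var(Xg,1,[1 2 3]);                 % variance over each group
    Xhat = (Xg - muG)./sqrt(sigmaG2 + epsilon);  % normalize
    Y = reshape(Xhat,H,W,C).*gamma + beta;       % scale by gamma, shift by beta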
[1] Wu, Yuxin, and Kaiming He. “Group Normalization.” ArXiv:1803.08494 [Cs], June 11, 2018. http://arxiv.org/abs/1803.08494.
See Also
batchNormalizationLayer | convolution2dLayer | fullyConnectedLayer | instanceNormalizationLayer | layerNormalizationLayer | reluLayer | trainingOptions | trainNetwork