
unetLayers

Create U-Net layers for semantic segmentation

Description


lgraph = unetLayers(imageSize,numClasses) returns a U-Net network. unetLayers includes a pixel classification layer in the network to predict the categorical label for every pixel in an input image.

Use unetLayers to create the U-Net network architecture. You must train the network using the Deep Learning Toolbox™ function trainNetwork (Deep Learning Toolbox).

[lgraph,outputSize] = unetLayers(imageSize,numClasses) also returns the size of the output from the U-Net network.

___ = unetLayers(imageSize,numClasses,Name,Value) specifies options using one or more name-value pair arguments. Enclose each property name in quotes. For example, unetLayers(imageSize,numClasses,'NumFirstEncoderFilters',64) additionally sets the number of output channels to 64 for the first encoder stage.
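A minimal sketch of the full syntax is shown below; the image size, class count, and option values are illustrative choices rather than recommended settings.

imageSize = [256 256 3];
numClasses = 4;
[lgraph,outputSize] = unetLayers(imageSize,numClasses, ...
    'EncoderDepth',4,'NumFirstEncoderFilters',64);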

Examples


Create a U-Net network with an encoder-decoder depth of 3.

imageSize = [480 640 3];
numClasses = 5;
encoderDepth = 3;
lgraph = unetLayers(imageSize,numClasses,'EncoderDepth',encoderDepth)
lgraph = 
  LayerGraph with properties:

     InputNames: {'ImageInputLayer'}
    OutputNames: {'Segmentation-Layer'}
         Layers: [46x1 nnet.cnn.layer.Layer]
    Connections: [48x2 table]

Display the network.

plot(lgraph)

The figure shows a plot of the U-Net layer graph.

Load training images and pixel labels into the workspace.

dataSetDir = fullfile(toolboxdir('vision'),'visiondata','triangleImages');
imageDir = fullfile(dataSetDir,'trainingImages');
labelDir = fullfile(dataSetDir,'trainingLabels');

Create an imageDatastore object to store the training images.

imds = imageDatastore(imageDir);

Define the class names and their associated label IDs.

classNames = ["triangle","background"];
labelIDs   = [255 0];

Create a pixelLabelDatastore object to store the ground truth pixel labels for the training images.

pxds = pixelLabelDatastore(labelDir,classNames,labelIDs);

Create the U-Net network.

imageSize = [32 32];
numClasses = 2;
lgraph = unetLayers(imageSize, numClasses)
lgraph = 
  LayerGraph with properties:

         Layers: [58×1 nnet.cnn.layer.Layer]
    Connections: [61×2 table]
     InputNames: {'ImageInputLayer'}
    OutputNames: {'Segmentation-Layer'}

Create a datastore for training the network.

ds = combine(imds,pxds);

Set training options.

options = trainingOptions('sgdm', ...
    'InitialLearnRate',1e-3, ...
    'MaxEpochs',20, ...
    'VerboseFrequency',10);

Train the network.

net = trainNetwork(ds,lgraph,options)
Training on single CPU.
Initializing input data normalization.
|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |   Accuracy   |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:04 |       75.57% |       2.4341 |          0.0010 |
|      10 |          10 |       00:00:36 |       96.02% |       0.4517 |          0.0010 |
|      20 |          20 |       00:01:13 |       97.62% |       0.2324 |          0.0010 |
|========================================================================================|
net = 
  DAGNetwork with properties:

         Layers: [58×1 nnet.cnn.layer.Layer]
    Connections: [61×2 table]
     InputNames: {'ImageInputLayer'}
    OutputNames: {'Segmentation-Layer'}

Input Arguments


Network input image size, specified as a:

  • 2-element vector in the form [height, width].

  • 3-element vector in the form [height, width, depth]. depth is the number of image channels. Set depth to 3 for RGB images, to 1 for grayscale images, or to the number of channels for multispectral and hyperspectral images.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
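For example, assuming illustrative sizes, the accepted forms look like this:

imageSize = [256 256];      % two-element form: [height width]
imageSize = [256 256 3];    % three-element form: RGB image
imageSize = [256 256 6];    % three-element form: six-channel multispectral image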

Number of classes in the semantic segmentation, specified as an integer greater than 1.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Name-Value Arguments

Example: 'EncoderDepth',3

Encoder depth, specified as a positive integer. U-Net is composed of an encoder subnetwork and a corresponding decoder subnetwork. The depth of these networks determines the number of times the input image is downsampled or upsampled during processing. The encoder network downsamples the input image by a factor of 2^D, where D is the value of EncoderDepth. The decoder network upsamples the encoder network output by a factor of 2^D.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
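As a minimal sketch of this relationship, the check below uses the 480-by-640 image size from the example above; it is an illustration rather than part of the unetLayers API.

D = 3;                            % encoder depth
downsampleFactor = 2^D;           % the encoder reduces the spatial size by this factor
rem([480 640],downsampleFactor)   % [0 0], so 480-by-640 works with 'same' padding at this depth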

Note

NumOutputChannels is renamed to NumFirstEncoderFilters and will not be supported in a future release. Use NumFirstEncoderFilters instead.

Number of output channels for the first encoder stage, specified as a positive integer or vector of positive integers. In each subsequent encoder stage, the number of output channels doubles. unetLayers sets the number of output channels in each decoder stage to match the number in the corresponding encoder stage.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
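A minimal sketch of the doubling pattern follows; the filter count and depth are illustrative values.

numFirstEncoderFilters = 64;
encoderDepth = 4;
numFirstEncoderFilters * 2.^(0:encoderDepth-1)   % output channels per encoder stage: 64 128 256 512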


Convolutional layer filter size, specified as a positive odd integer or a 2-element row vector of positive odd integers. Typical values are in the range [3, 7].

FilterSize               Description
scalar                   The filter is square.
2-element row vector     The filter has the size [height width].

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
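A minimal sketch of both forms follows; the image size and filter sizes are illustrative.

lgraph = unetLayers([96 96 3],2,'FilterSize',5);        % 5-by-5 square filters
lgraph = unetLayers([96 96 3],2,'FilterSize',[3 7]);    % 3-by-7 rectangular filters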

Type of padding, specified as 'same' or 'valid'. The type of padding specifies the padding style for the convolution2dLayer (Deep Learning Toolbox) layers in the encoder and decoder subnetworks. The spatial size of the output feature map depends on the type of padding. If you specify the type of padding as:

  • 'same' — Zero padding is applied to the inputs to convolution layers such that the output and input feature maps are the same size.

  • 'valid' — Zero padding is not applied to the inputs to convolution layers. The convolution layer returns only values of the convolution that are computed without zero padding. The output feature map is smaller than the input feature map.

Note

To ensure that the height and width of the inputs to max-pooling layers are even, choose the network input image size so that it conforms to one of these criteria:

  • If you specify 'ConvolutionPadding' as 'same', then the height and width of the input image must be a multiple of 2^D.

  • If you specify 'ConvolutionPadding' as 'valid', then the height and width of the input image must be chosen such that height − ∑_{i=1}^{D} 2^i (f_h − 1) and width − ∑_{i=1}^{D} 2^i (f_w − 1) are multiples of 2^D,

    where f_h and f_w are the height and width of the two-dimensional convolution kernel, respectively, and D is the encoder depth. The sketch after this argument description checks this criterion for an example input size.

Data Types: char | string
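A minimal sketch of the 'valid' padding criterion follows, assuming a 3-by-3 convolution kernel and an encoder depth of 4; the candidate input size is illustrative.

D  = 4;                          % encoder depth
fh = 3;  fw = 3;                 % kernel height and width
inputHW = [572 572];             % candidate input height and width
shrinkH = sum(2.^(1:D))*(fh-1);  % height reduction term in the criterion
shrinkW = sum(2.^(1:D))*(fw-1);  % width reduction term in the criterion
rem(inputHW - [shrinkH shrinkW],2^D)   % [0 0] means the candidate size meets the criterion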

Output Arguments


Layers that represent the U-Net network architecture, returned as a layerGraph (Deep Learning Toolbox) object.

Network output image size, returned as a three-element vector of the form [height, width, channels]. channels is the number of output channels, which equals the number of classes specified at the input. The height and width of the output image from the network depend on the type of convolution padding.

  • If you specify 'ConvolutionPadding' as 'same', then the height and width of the network output image are the same as that of the network input image.

  • If you specify 'ConvolutionPadding' as 'valid', then the height and width of the network output image are less than that of the network input image.

Data Types: double
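A minimal sketch of querying outputSize with 'valid' padding follows; the input size is illustrative, and the exact output values are not asserted here.

[lgraph,outputSize] = unetLayers([572 572 3],2,'ConvolutionPadding','valid');
outputSize   % [height width numClasses], with height and width smaller than the 572-by-572 input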

More About


U-Net Architecture

  • The U-Net architecture consists of an encoder subnetwork and decoder subnetwork that are connected by a bridge section.

  • The encoder and decoder subnetworks in the U-Net architecture consist of multiple stages. EncoderDepth, which specifies the depth of the encoder and decoder subnetworks, sets the number of stages (see the sketch after this list).

  • The stages within the U-Net encoder subnetwork consist of two sets of convolutional and ReLU layers, followed by a 2-by-2 max pooling layer. The decoder subnetwork consists of a transposed convolution layer for upsampling, followed by two sets of convolutional and ReLU layers.

  • The bridge section consists of two sets of convolution and ReLU layers.

  • The bias term of all convolutional layers is initialized to zero.

  • Convolution layer weights in the encoder and decoder subnetworks are initialized using the 'He' weight initialization method [2].
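One way to see these stages is to list the layer names of a generated network, as in the sketch below. The small image size and encoder depth are illustrative, and the exact layer names (for example, names along the lines of 'Encoder-Stage-1-Conv-1') are an assumption about the generated naming scheme rather than values documented here.

lgraph = unetLayers([64 64 3],2,'EncoderDepth',2);
disp({lgraph.Layers.Name}')   % encoder, bridge, and decoder layer names in order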

Tips

  • Use 'same' padding in convolution layers to maintain the same data size from input to output and enable the use of a broad set of input image sizes.

  • Use patch-based approaches for seamless segmentation of large images. You can extract image patches by using the randomPatchExtractionDatastore function in Image Processing Toolbox™ (see the sketch after these tips).

  • Use 'valid' padding to prevent border artifacts while you use patch-based approaches for segmentation.

  • You can use the network created using the unetLayers function for GPU code generation after training with trainNetwork (Deep Learning Toolbox). For details and examples, see Code Generation (Deep Learning Toolbox).
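A minimal sketch of the patch-based workflow mentioned above follows. It assumes existing image and pixel label datastores imds and pxds of large grayscale images; the datastores, patch size, and 'PatchesPerImage' value are assumptions made for illustration.

patchSize = [256 256];
patchds = randomPatchExtractionDatastore(imds,pxds,patchSize, ...
    'PatchesPerImage',16);                      % random patches drawn from both datastores
lgraph = unetLayers([patchSize 1],2);           % single-channel patches, 2 classes
% net = trainNetwork(patchds,lgraph,options);   % train as in the example above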

References

[1] Ronneberger, O., P. Fischer, and T. Brox. "U-Net: Convolutional Networks for Biomedical Image Segmentation." Medical Image Computing and Computer-Assisted Intervention (MICCAI). Vol. 9351, 2015, pp. 234–241.

[2] He, K., X. Zhang, S. Ren, and J. Sun. "Delving Deep Into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification." Proceedings of the IEEE International Conference on Computer Vision. 2015, 1026–1034.

Extended Capabilities

Version History

Introduced in R2018b
