
Non-quantized ROI pooling layer for Mask R-CNN

Since R2020b


An ROI align layer outputs fixed-size feature maps for every rectangular ROI within an input feature map. Use this layer to create a Mask R-CNN network.

Given an input feature map of size [H W C N], where H and W are the height and width, C is the number of channels, and N is the number of observations, the output feature map size is [h w C sum(M)], where h and w are the height and width given by the specified output size. M is a vector of length N, and M(i) is the number of ROIs associated with the i-th input feature map.
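
As a rough illustration of the size relationship above, the following sketch computes the expected output size with plain arithmetic. The feature map size and the ROI counts are hypothetical values chosen for the example, not defaults of the layer.

```matlab
% Hypothetical input feature map of size [H W C N].
inputSize = [50 64 256 2];   % H = 50, W = 64, C = 256 channels, N = 2 observations

% Hypothetical number of ROIs for each observation. M has length N.
M = [3 5];

% Pooled output size [h w], as it would be passed to roiAlignLayer.
outputSize = [7 7];

% Expected output size [h w C sum(M)].
expectedSize = [outputSize inputSize(3) sum(M)];
disp(expectedSize)
```

With these values, all eight ROIs are stacked along the fourth dimension, so the expected size is [7 7 256 8].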

There are two inputs to this layer:

  • 'in' — The input feature map

  • 'roi' — A list of ROIs to pool

Use the input names when connecting the ROI align layer to other layers by using connectLayers (Deep Learning Toolbox), or when disconnecting it from other layers by using disconnectLayers (Deep Learning Toolbox). Both functions require Deep Learning Toolbox™.
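
The following is a minimal sketch of how the two inputs might be wired up in a layer graph. The feature extraction layers, their sizes, and the layer names 'image', 'conv', and 'rois' are assumptions made for illustration. The sketch also assumes that roiInputLayer is used as the source of the ROI list and that inputs are addressed as 'layerName/inputName'.

```matlab
% Hypothetical feature extraction branch, for illustration only.
featureLayers = [
    imageInputLayer([224 224 3],'Name','image','Normalization','none')
    convolution2dLayer(3,16,'Padding','same','Name','conv')];

% Layers in an array are connected in sequence.
lgraph = layerGraph(featureLayers);

% Add the ROI source and the ROI align layer as unconnected layers.
lgraph = addLayers(lgraph,roiInputLayer('Name','rois'));
lgraph = addLayers(lgraph,roiAlignLayer([7 7],'Name','roialign'));

% Connect the feature map to the 'in' input and the ROI list to the 'roi' input.
lgraph = connectLayers(lgraph,'conv','roialign/in');
lgraph = connectLayers(lgraph,'rois','roialign/roi');
```

A complete Mask R-CNN network needs further layers after 'roialign'. This sketch only shows how the input names are used.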



layer = roiAlignLayer(outputSize) creates an ROI align layer with pooled output size outputSize. The outputSize input sets the OutputSize property.


layer = roiAlignLayer(outputSize,Name,Value) sets properties of the ROI align layer by using one or more name-value pair arguments. Enclose each property name in quotes.

For example, roiAlignLayer([7 7],'Name','roialignlayer') creates an ROI align layer with a pooled output size of 7-by-7 pixels and name 'roialignlayer'.



OutputSize: Pooled output size, specified as a vector of two positive integers [h w], where h is the height and w is the width.

Data Types: double

ROIScale: Ratio of the scale of the input feature map to the scale of the input image, specified as a positive number.

Data Types: double
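
As a sketch of how the scale might be chosen, the following code derives it from the image size and the feature map size. The sizes are hypothetical and assume a feature extractor that downsamples the image by a factor of 16 in both dimensions, with ROIs given in image coordinates.

```matlab
% Hypothetical sizes: the feature map is 16 times smaller than the image.
imageSize   = [800 1024];   % [height width] of the input image
featureSize = [50 64];      % [height width] of the input feature map

% Ratio of the feature map scale to the image scale.
roiScale = featureSize(1)/imageSize(1);

% Pass the ratio to the layer through the ROIScale property.
layer = roiAlignLayer([7 7],'ROIScale',roiScale,'Name','roialign');
```

Here roiScale is 0.0625, that is, 1/16.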

SamplingRatio: Number of samples in each pooled bin, specified as 'auto' or a row vector of two positive integers. The two elements are the number of vertical and horizontal samples, respectively.

If you do not specify the sampling ratio, the value is 'auto'. In that case, the number of vertical samples is ceil(roiHeight/outputHeight), and the number of horizontal samples is ceil(roiWidth/outputWidth).

Data Types: double | char
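
To make the default concrete, the following sketch evaluates the two ceil expressions for one hypothetical ROI and then shows how a fixed sampling ratio might be requested instead. The ROI dimensions are made-up values, expressed in whatever units the layer uses for roiHeight and roiWidth.

```matlab
% Pooled output size [outputHeight outputWidth].
outputSize = [7 7];

% Hypothetical ROI dimensions.
roiHeight = 50;
roiWidth  = 30;

% Samples per bin implied by the 'auto' setting:
% [vertical horizontal] = [ceil(50/7) ceil(30/7)].
autoSamples = [ceil(roiHeight/outputSize(1)) ceil(roiWidth/outputSize(2))];
disp(autoSamples)

% Alternatively, request a fixed number of samples per bin for every ROI.
layer = roiAlignLayer(outputSize,'SamplingRatio',[2 2],'Name','roialign');
```

For this ROI, the 'auto' setting corresponds to 8 vertical and 5 horizontal samples per bin. A fixed ratio such as [2 2] uses the same number of samples regardless of ROI size.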

Name: Layer name, specified as a character vector or a string scalar. For Layer array input, the trainnet (Deep Learning Toolbox) and dlnetwork (Deep Learning Toolbox) functions automatically assign names to layers with the name "".

The ROIAlignLayer object stores this property as a character vector.

Data Types: char | string

NumInputs: Number of inputs to the layer, returned as 2. This layer accepts two inputs.

Data Types: double

InputNames: Input names of the layer, returned as {'in','roi'}.

Data Types: cell

This property is read-only.

NumOutputs: Number of outputs from the layer, returned as 1. This layer has a single output only.

Data Types: double

This property is read-only.

OutputNames: Output names, returned as {'out'}. This layer has a single output only.

Data Types: cell



Specify the pooled output size.

outputSize = [7 7];

Create an ROI align layer named 'roialign'.

layer = roiAlignLayer(outputSize,'Name','roialign')
layer = 
  ROIAlignLayer with properties:

             Name: 'roialign'
        NumInputs: 2
       InputNames: {'in'  'roi'}
       OutputSize: [7 7]

         ROIScale: 1
    SamplingRatio: 'auto'


Version History

Introduced in R2020b