Create Faster R-CNN Object Detection Network
This example builds upon the Create Fast R-CNN Object Detection Network example above. It transforms a pretrained ResNet-50 network into a Faster R-CNN object detection network by adding an ROI pooling layer, a bounding box regression layer, and a region proposal network (RPN). The Faster R-CNN network can then be trained using trainFasterRCNNObjectDetector.
Create Fast R-CNN Network
Start by creating Fast R-CNN, which forms the basis of Faster R-CNN. The Create Fast R-CNN Object Detection Network example explains this section of code in detail.
% Load a pretrained ResNet-50.
net = resnet50;
lgraph = layerGraph(net);

% Remove the last 3 layers.
layersToRemove = {
    'fc1000'
    'fc1000_softmax'
    'ClassificationLayer_fc1000'
    };
lgraph = removeLayers(lgraph, layersToRemove);

% Specify the number of classes the network should classify.
numClasses = 2;
numClassesPlusBackground = numClasses + 1;

% Define new classification layers.
newLayers = [
    fullyConnectedLayer(numClassesPlusBackground, 'Name', 'rcnnFC')
    softmaxLayer('Name', 'rcnnSoftmax')
    classificationLayer('Name', 'rcnnClassification')
    ];

% Add new object classification layers.
lgraph = addLayers(lgraph, newLayers);

% Connect the new layers to the network.
lgraph = connectLayers(lgraph, 'avg_pool', 'rcnnFC');

% Define the number of outputs of the fully connected layer.
numOutputs = 4 * numClasses;

% Create the box regression layers.
boxRegressionLayers = [
    fullyConnectedLayer(numOutputs, 'Name', 'rcnnBoxFC')
    rcnnBoxRegressionLayer('Name', 'rcnnBoxDeltas')
    ];

% Add the layers to the network.
lgraph = addLayers(lgraph, boxRegressionLayers);

% Connect the regression layers to the layer named 'avg_pool'.
lgraph = connectLayers(lgraph, 'avg_pool', 'rcnnBoxFC');

% Select a feature extraction layer.
featureExtractionLayer = 'activation_40_relu';

% Disconnect the layers attached to the selected feature extraction layer.
lgraph = disconnectLayers(lgraph, featureExtractionLayer, 'res5a_branch2a');
lgraph = disconnectLayers(lgraph, featureExtractionLayer, 'res5a_branch1');

% Add ROI max pooling layer.
outputSize = [14 14];
roiPool = roiMaxPooling2dLayer(outputSize, 'Name', 'roiPool');
lgraph = addLayers(lgraph, roiPool);

% Connect feature extraction layer to ROI max pooling layer.
lgraph = connectLayers(lgraph, featureExtractionLayer, 'roiPool/in');

% Connect the output of ROI max pool to the disconnected layers from above.
lgraph = connectLayers(lgraph, 'roiPool', 'res5a_branch2a');
lgraph = connectLayers(lgraph, 'roiPool', 'res5a_branch1');
Add Region Proposal Network (RPN)
Faster R-CNN uses a region proposal network (RPN) to generate region proposals. An RPN produces region proposals by predicting the class ("object" or "background") and the box offsets for a set of predefined bounding box templates known as "anchor boxes". Each anchor box is specified by its size, given as [height width], which is typically chosen based on a priori knowledge of the scale and aspect ratio of objects in the training dataset.
Learn more about Anchor Boxes for Object Detection.
Define the anchor boxes and create a regionProposalLayer.
% Define anchor boxes, one [height width] pair per row.
anchorBoxes = [
    16 16
    32 16
    16 32
    ];

% Create the region proposal layer.
proposalLayer = regionProposalLayer(anchorBoxes, 'Name', 'regionProposal');
lgraph = addLayers(lgraph, proposalLayer);
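As a quick sanity check on the anchor box definition, the following sketch summarizes each anchor using only base MATLAB. It assumes each row of anchorBoxes is a [height width] pair, and the variable names other than anchorBoxes (areas, aspectRatios, summary) are illustrative, not part of the original example.

```matlab
% Anchor boxes from this example, one [height width] pair per row.
anchorBoxes = [
    16 16
    32 16
    16 32
    ];

% Number of anchor boxes is the number of rows.
numAnchors = size(anchorBoxes, 1);

% Area of each anchor in square pixels.
areas = prod(anchorBoxes, 2);

% Aspect ratio of each anchor, expressed here as width / height.
aspectRatios = anchorBoxes(:,2) ./ anchorBoxes(:,1);

% Collect the values in a table for easy inspection.
summary = table(anchorBoxes(:,1), anchorBoxes(:,2), areas, aspectRatios, ...
    'VariableNames', {'Height', 'Width', 'Area', 'AspectRatio'});
disp(summary)
```

If the objects in your training data are much larger or more elongated than these templates, a table like this makes the mismatch easy to spot before training.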
Add the convolution layers for the RPN and connect them to the feature extraction layer selected above.
% Number of anchor boxes.
numAnchors = size(anchorBoxes, 1);

% Number of feature maps coming out of the feature extraction layer.
numFilters = 1024;

rpnLayers = [
    convolution2dLayer(3, numFilters, 'Padding', [1 1], 'Name', 'rpnConv3x3')
    reluLayer('Name', 'rpnRelu')
    ];
lgraph = addLayers(lgraph, rpnLayers);

% Connect the RPN to the feature extraction layer.
lgraph = connectLayers(lgraph, featureExtractionLayer, 'rpnConv3x3');
Add the RPN classification output layers. The classification layer classifies each anchor as "object" or "background".
% Add RPN classification layers.
rpnClsLayers = [
    convolution2dLayer(1, numAnchors*2, 'Name', 'rpnConv1x1ClsScores')
    rpnSoftmaxLayer('Name', 'rpnSoftmax')
    rpnClassificationLayer('Name', 'rpnClassification')
    ];
lgraph = addLayers(lgraph, rpnClsLayers);

% Connect the classification layers to the RPN network.
lgraph = connectLayers(lgraph, 'rpnRelu', 'rpnConv1x1ClsScores');
Add the RPN regression output layers. The regression layer predicts 4 box offsets for each anchor box.
% Add RPN regression layers.
rpnRegLayers = [
    convolution2dLayer(1, numAnchors*4, 'Name', 'rpnConv1x1BoxDeltas')
    rcnnBoxRegressionLayer('Name', 'rpnBoxDeltas')
    ];
lgraph = addLayers(lgraph, rpnRegLayers);

% Connect the regression layers to the RPN network.
lgraph = connectLayers(lgraph, 'rpnRelu', 'rpnConv1x1BoxDeltas');
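To make the filter counts of the two 1-by-1 convolutions concrete, the sketch below works through the arithmetic in base MATLAB. The input size of 224-by-224 and the overall stride of 16 at the feature extraction layer are assumptions chosen for illustration; substitute the values that apply to your own images and feature layer.

```matlab
% Assumptions for illustration only: a 224-by-224 input image and a
% feature extraction layer that downsamples the input by a factor of 16.
inputSize  = [224 224];
stride     = 16;
numAnchors = 3;   % rows of anchorBoxes in this example

% Spatial size of the feature map that the RPN slides over.
featureMapSize = floor(inputSize / stride);

% Two scores (object, background) per anchor at each location.
numClsChannels = numAnchors * 2;

% Four box offsets per anchor at each location.
numRegChannels = numAnchors * 4;

% Total number of anchors scored for one image.
numCandidates = prod(featureMapSize) * numAnchors;

fprintf('Feature map: %d-by-%d\n', featureMapSize);
fprintf('Classification channels: %d\n', numClsChannels);
fprintf('Regression channels: %d\n', numRegChannels);
fprintf('Anchors scored per image: %d\n', numCandidates);
```

Under these assumptions, adding a fourth anchor box would add two classification channels and four regression channels, and would increase the number of anchors scored per image by one per feature map location.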
Finally, connect the classification and regression feature maps to the region proposal layer inputs, and the ROI pooling layer to the region proposal layer output.
% Connect region proposal network.
lgraph = connectLayers(lgraph, 'rpnConv1x1ClsScores', 'regionProposal/scores');
lgraph = connectLayers(lgraph, 'rpnConv1x1BoxDeltas', 'regionProposal/boxDeltas');

% Connect region proposal layer to roi pooling.
lgraph = connectLayers(lgraph, 'regionProposal', 'roiPool/roi');

% Show the network after adding the RPN layers.
figure
plot(lgraph)
ylim([30 42])
The network is ready to be trained using trainFasterRCNNObjectDetector.
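A training call might look like the following sketch. It is not part of the original example: trainingData is a hypothetical variable holding your own ground truth (a table or datastore with box labels for the two object classes assumed above), and the hyperparameter values are placeholders rather than recommendations. The guard lets the snippet run even when those variables are not yet defined.

```matlab
% Placeholder hyperparameters; tune them for your own data.
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 1, ...
    'InitialLearnRate', 1e-3, ...
    'MaxEpochs', 5);

% trainingData is hypothetical: supply your own labeled ground truth.
% lgraph is the layer graph assembled in this example.
if exist('trainingData', 'var') && exist('lgraph', 'var')
    detector = trainFasterRCNNObjectDetector(trainingData, lgraph, options);
else
    disp('Define trainingData and lgraph before training.')
end
```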