quantize

Create quantized deep neural network

Description

quantizedNetwork = quantize(quantObj) creates a quantized neural network object from a calibrated dlquantizer object, quantObj. The quantized network object, quantizedNetwork, exposes the quantized layers, weights, and biases of the network, and lets you examine quantized inference behavior.

quantizedNetwork = quantize(quantObj,Name,Value) creates a quantized neural network object from a calibrated dlquantizer object, quantObj, with additional options specified by one or more name-value arguments.

Examples

This example shows how to use the quantize method to quantize a neural network in MATLAB.

Prepare Data

Load the pretrained and modified squeezenetmerch network.

load squeezenetmerch
net
net = 
  DAGNetwork with properties:

         Layers: [68×1 nnet.cnn.layer.Layer]
    Connections: [75×2 table]
     InputNames: {'data'}
    OutputNames: {'new_classoutput'}

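Unzip the MerchData.zip file, create an image datastore from the extracted folder, and split the images of each label into calibration data (70%) and validation data (the remaining 30%).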

unzip('MerchData.zip')
imds = imageDatastore('MerchData', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');
[calData, valData] = splitEachLabel(imds, 0.7, 'randomized');

Resize the calibration and validation images to 227-by-227, the input size used for this network.

aug_calData = augmentedImageDatastore([227 227], calData);
aug_valData = augmentedImageDatastore([227 227], valData);

Quantization of the Network

Create a dlquantizer object for the network with the execution environment set to MATLAB. How the network is quantized depends on the execution environment.

quantObj = dlquantizer(net,'ExecutionEnvironment','MATLAB')
quantObj = 
  dlquantizer with properties:

           NetworkObject: [1×1 DAGNetwork]
    ExecutionEnvironment: 'MATLAB'

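Calibrate the dlquantizer object with the calibration data. The quantize method requires a calibrated object.

calResults = calibrate(quantObj,aug_calData);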

The quantize method creates a quantized network object that lets you view all the layers and network properties.

qNet = quantize(quantObj)
qNet = 
  Quantized DAGNetwork with properties:

         Layers: [68×1 nnet.cnn.layer.Layer]
    Connections: [75×2 table]
     InputNames: {'data'}
    OutputNames: {'new_classoutput'}

Use the quantizationDetails method to extract quantization details.
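As a minimal sketch of this step, assuming qNet from the previous step is still in the workspace, the call could look like the following. The variable names are illustrative, and the field names are assumed from the quantizationDetails reference page, so verify them against your release.

```matlab
% Assumes qNet was created by quantize(quantObj) as shown above
qDetails = quantizationDetails(qNet);

% Field names below are assumptions taken from the quantizationDetails
% reference page; check them in your release before relying on them
isQuantized = qDetails.IsQuantized;           % true if the network is quantized
layerNames  = qDetails.QuantizedLayerNames;   % names of the quantized layers
learnables  = qDetails.QuantizedLearnables;   % table of quantized learnable parameters
```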

Make Predictions Using Both Networks

predQuantized = classify(qNet,aug_valData);    % Predictions for the quantized network 
predOriginal = classify(net,aug_valData);  % Predictions for the non-quantized network 

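Compute the relative accuracy of the quantized network compared to the original network, that is, the percentage of validation images for which both networks predict the same label.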

ccrQuantized = mean(predOriginal==predQuantized)*100
ccrQuantized = 100

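For this validation data set, the quantized network predicts the same label as the original network for every image. This value measures agreement between the two networks, not accuracy against the ground-truth labels.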
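The value computed above compares the two networks with each other. To measure each network against the ground-truth labels instead, a sketch along the following lines could be used. It assumes valData, predOriginal, and predQuantized from the previous steps are still in the workspace and that the datastore order is unchanged; the variable names trueLabels, accOriginal, and accQuantized are illustrative.

```matlab
% Assumes valData, predOriginal, and predQuantized from the steps above
trueLabels = valData.Labels;                            % ground-truth labels from the folder names

accOriginal  = mean(predOriginal  == trueLabels)*100;   % accuracy of the original network, in percent
accQuantized = mean(predQuantized == trueLabels)*100;   % accuracy of the quantized network, in percent
```

Comparing accOriginal and accQuantized shows how much accuracy, if any, is lost through quantization on this validation set.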

Input Arguments

quantObj is the dlquantizer object containing the network to quantize. Calibrate it with the calibrate object function before you call quantize.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: quantizedNetwork = quantize(quantObj,'ExponentScheme','Histogram')

ExponentScheme specifies the exponent selection scheme for quantization, as 'MinMax' or 'Histogram'. The MinMax scheme evaluates the exponent based on the range information in the calibration statistics and avoids overflow by capturing the full range. The Histogram scheme is a distribution-based scaling that evaluates an exponent to best fit the calibration data.

Example: 'ExponentScheme', 'MinMax'
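To illustrate the idea behind range-based exponent selection, the following is a conceptual sketch only, not the algorithm the toolbox uses. It assumes power-of-two scaling with signed 8-bit integers, and all values and variable names (minVal, maxVal, expMinMax, and so on) are hypothetical.

```matlab
% Hypothetical calibration range for one layer (illustrative values only)
minVal = -3.2;
maxVal = 5.7;
wordLength = 8;                        % signed 8-bit integers (assumption)
qMax = 2^(wordLength-1) - 1;           % largest int8 value, 127

% Range-based choice: smallest power-of-two scale that covers the range
absMax = max(abs([minVal maxVal]));
expMinMax = ceil(log2(absMax/qMax));   % -4 for these values
scale = 2^expMinMax;                   % 0.0625

x = [-3.2 0.1 5.7];                    % sample values inside the range
q = int8(round(x/scale));              % quantize (int8 saturates on overflow)
xHat = double(q)*scale;                % dequantize: [-3.1875 0.125 5.6875]

% One exponent lower doubles the resolution but saturates the largest value
scaleFine = 2^(expMinMax-1);
xHatFine = double(int8(round(x/scaleFine)))*scaleFine;   % last element becomes 3.96875
```

The second half of the sketch shows the trade-off between resolution and overflow: a smaller exponent represents small values more precisely but clips values near the edge of the range.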

Output Arguments

quantizedNetwork is the quantized neural network, returned as a DAGNetwork, SeriesNetwork, yolov2ObjectDetector (Computer Vision Toolbox), or ssdObjectDetector (Computer Vision Toolbox) object.

Version History

Introduced in R2022a