Frame-Based Video Pipeline

This example shows a video processing pipeline implemented in MATLAB® using frame-based functions. This design supports HDL code generation.

The HDL Coder™ frame-to-sample optimization lets you combine frame-based video processing functions, element-wise operations, and custom neighborhood and kernel functions written with hdl.npufun and hdl.iterfun, so you can rapidly prototype and develop algorithms for HDL code generation in MATLAB.
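For example, a custom neighborhood operation can be expressed with hdl.npufun. This is a minimal sketch, not part of the example files; the function and variable names are illustrative:

    function frameOut = boxBlur3x3(frameIn)
        % Apply a custom 3-by-3 kernel across the frame. The frame-to-sample
        % optimization converts this frame-based call into a streaming,
        % sample-based hardware implementation.
        frameOut = hdl.npufun(@boxKernel, [3 3], frameIn);
    end

    function pixelOut = boxKernel(window)
        % window is the 3-by-3 neighborhood centered on the current pixel
        pixelOut = sum(window(:)) / 9;
    end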

This example refactors a MATLAB video processing algorithm to use frame-based video processing functions that support HDL code generation. A MATLAB testbench verifies the refactored algorithm against the original. Then, the example expands the design to add support for subsampled YCbCr I/O. Finally, the example shows how to generate HDL code and an HDL testbench for the YCbCr-compatible video processing pipeline.

MATLAB Reference

The example adapts an existing MATLAB Image Processing Toolbox™ video processing algorithm for hardware deployment. The algorithm removes noise and sharpens a video before detecting and overlaying edges:

FBVP_rgbPipeline.png

The Noise Removal and Sharpening stage uses a median filter to remove noise without further degrading sharpness. The median-filtered frame is sharpened using a 2D image filter with an inverted Laplacian kernel. The Edge Emphasis stage uses a bilateral filter to smooth the sharpened frame while retaining edges. The Edge Detection stage uses a Sobel edge detector and morphological closing to detect edges and improve their connectivity. The Overlay stage converts the binary edges to the desired color and overlays them onto the median-filtered RGB frames.

This code excerpt shows a uint8 Image Processing Toolbox implementation of the algorithm:

function [..] = FBVP_iptPipeline(..)

    % --- Setup
    % Noise removal & sharpening
    hSharp = [ ...                   % Sharpening filter coefficients
        0 0 0; ...
        0 1 0; ...
        0 0 0] - fspecial('laplacian', 0.2);
    % Edge detection
    closingSE = strel('disk', 1);    % Morphological closing structuring element
    edgeColor = cat(3, 255, 0, 255); % 1-by-1-by-3 RGB triplet
    
    % --- Noise Removal & Sharpening
    median = zeros(size(frameIn), 'like', frameIn);
    for ii = 1 : 3
        median(:,:,ii) = medfilt2(frameIn(:,:,ii));             % 3-by-3 median filter
    end
    gray  = rgb2gray(median);                                   % RGB to grayscale
    sharp = imfilter(gray, hSharp);                             % Sharpening filter
    
    % --- Emphasize Edges
    bilateral = imbilatfilt(sharp, 36864, 0.9, NeighborhoodSize=5); % Bilateral filter
    
    % --- Edge Detection
    edges  = edge(bilateral, 'Sobel', 12/255);                  % Sobel edge detector
    closed = imclose(edges, closingSE);                         % Morphological closing
    
    % --- Edge Overlay
    rgbEdges = uint8(closed .* edgeColor);                      % Color edges
    frameOut = median + rgbEdges;                               % Overlay

Frame-Based HDL Design

Convert the coefficients of the floating-point sharpening filter, hSharp, and the RGB edge color, edgeColor, to fixed-point data types. You can represent hSharp using a fixdt(1,16,12) data type and edgeColor using uint8. Refer to Manual Fixed-Point Conversion Best Practices (Fixed-Point Designer™) for general guidance.
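For example, you can perform this conversion with a Fixed-Point Designer fi object. This is a minimal sketch; the variable names match the code excerpts in this example:

    hSharpFixed = fi(hSharp, 1, 16, 12);  % signed, 16-bit word, 12 fractional bits
    edgeColor   = uint8([255, 0, 255]);   % 8-bit unsigned RGB triplet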

In this algorithm, the Image Filter (hdlimfilter) and Bilateral Filter (hdlimbilatfilt) functions behave differently from their Image Processing Toolbox counterparts:

  • The Image Filter, hdlimfilter, requires that the filter coefficients be in the range [-1, 1] and that the sum of the absolute coefficients is less than or equal to 1. Rescale the coefficients, hSharp, by dividing them by the maximum absolute value of the coefficients. Apply a gain to the output of the Image Filter to counteract the gain introduced by rescaling the coefficients.

  • The Bilateral Filter, hdlimbilatfilt, uses a look-up table (LUT) based implementation for hardware efficiency, so it does not numerically match imbilatfilt. hdlimbilatfilt also requires an intensity standard deviation rather than the degree of smoothing used with imbilatfilt. Convert the imbilatfilt degreeOfSmoothing input to the hdlimbilatfilt intensityStdDev input by dividing the square root of degreeOfSmoothing by 2 raised to the power of the input word length.
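The two conversions described above can be sketched as follows, using the values from this example (a hypothetical snippet, not part of the example files):

    % Rescale the sharpening coefficients into [-1, 1] for hdlimfilter;
    % compensate at the filter output with a gain equal to 'scale'.
    scale     = max(abs(hSharp), [], "all");
    hSharpHDL = hSharp ./ scale;

    % Convert degreeOfSmoothing = 36864 for a uint8 input (word length 8):
    intensityStdDev = sqrt(36864) / 2^8;   % = 192/256 = 0.75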

The frame-based functions in this example do not support 3D I/O. Configure the median filter and Overlay stage to handle individual color channels with separate variables. The top-level design function FBVP_rgbDUT.m also requires 2D inputs and outputs.

This code excerpt shows the refactored algorithm:

function [..] = FBVP_rgbDUT(..)

    % --- Setup
    % Noise removal & sharpening
    hSharp = ([ ...  % Sharpening filter coefficients
        0 0 0; ...
        0 1 0; ...
        0 0 0] - fspecial('laplacian', 0.2)) ./ max(abs(fspecial('laplacian', 0.2)) + 1, [], "all");
    
    % Edge detection
    closingSE = strel("disk", 1);     % Morphological closing structuring element
    edgeColor = uint8([255, 0, 255]); % 1-by-3 RGB triplet

    % --- Noise Removal & Sharpening
    medianR  = hdlmedfilt2(RIn);                          % 3-by-3 median filter
    medianG  = hdlmedfilt2(GIn);
    medianB  = hdlmedfilt2(BIn);
    gray     = hdlrgb2gray(medianR, medianG, medianB);    % RGB to grayscale
    sharp    = hdlimfilter(gray, hSharp, 'Replicate', 0); % Sharpening filter
    sharp    = sharp * 4;                                 % Account for hSharp scaling
    
    % --- Emphasize Edges
    bilateral = hdlimbilatfilt(sharp, sqrt(36864)/2^8, 0.9, 5, 'Replicate', 0); % Bilateral filter
    
    % --- Edge Detection
    edges  = hdledge(bilateral, 'sobel', 12);             % Sobel edge detector
    closed = hdlimclose(edges, closingSE.Neighborhood);   % Morphological closing
    
    % --- Edge Overlay    
    closedUint = uint8(closed);                           % Color edges
    edgesR = closedUint * edgeColor(1);
    edgesG = closedUint * edgeColor(2);
    edgesB = closedUint * edgeColor(3);
    
    ROut = medianR + edgesR;                              % Overlay
    GOut = medianG + edgesG;
    BOut = medianB + edgesB;

Simulation

To verify the refactored code and fixed-point conversion, create a testbench that compares the reference algorithm and the hardware design. The functions in this example can differ numerically from their Image Processing Toolbox counterparts because they are optimized for hardware. When developing or adapting your own algorithms for hardware, a best practice is to create a testbench and iteratively refactor your algorithm.

In the testbench, FBVP_matlabTestBench.m, the videoIn object reads frames from a video source, and the imfilter and imnoise functions apply a blur effect and introduce salt-and-pepper noise, respectively. The testbench calls the Design Under Test (DUT), FBVP_rgbDUT.m, to process each frame. The reference design, FBVP_iptPipeline.m, performs the same operations as the DUT using Image Processing Toolbox functions. The testbench compares the DUT and reference frames by calculating the Structural SIMilarity (SSIM) index and verifying that it exceeds the minimum threshold. The viewer object displays the corrupted input frame, the reference output, and the DUT output. Upon completion of the testbench, the processed frame rate is printed to the Command Window.

FBVP_matlabTestBench.png

function [..] = FBVP_matlabTestBench(..)

    tic;
    for ii = 1 : numFrames
        % --- Stimulus
        frameIn(:) = videoIn.readFrame;                               % Source frame
        frameIn(:) = imfilter(frameIn, fspecial("gaussian", 3, 0.5)); % Gaussian blur
        frameIn(:) = imnoise(frameIn, "salt & pepper");               % Salt & pepper noise
        
        % --- Design Under Test (DUT)
        % Exercise DUT
        [R, G, B] = FBVP_rgbDUT(frameIn(:,:,1), frameIn(:,:,2), frameIn(:,:,3));
        frameOut = cat(3, R, G, B);
    
        % --- Reference
        frameReference = FBVP_iptPipeline(frameIn);
    
        % --- Compare
        frameSSIM(:) = ssim(frameReference, frameOut);
        if frameSSIM < ssimThreshold
            fprintf("frame #%u: reference and design output Structural SIMilarity (SSIM) index %.3g%% is less than the %.3g%% threshold.\n", ...
                ii, 100 * frameSSIM, 100 * ssimThreshold);
            return;
        end
    
        % --- Visualize
        frameIn        = insertText(frameIn,        [10 10], {'Input'},            FontSize=14, FontColor=[1 1 1], TextBoxColor=[0 0 0]);
        frameReference = insertText(frameReference, [10 10], {'Reference Output'}, FontSize=14, FontColor=[1 1 1], TextBoxColor=[0 0 0]);
        frameOut       = insertText(frameOut,       [10 10], {'DUT Output'},       FontSize=14, FontColor=[1 1 1], TextBoxColor=[0 0 0]);
    
        viewer(cat(2, frameIn, frameReference, frameOut));
    end
    t = toc;
    
    fprintf('\n%u frames have been processed in %.2f seconds.\n', numFrames, t);
    fprintf('Average frame rate is %.2f frames/second.\n', numFrames/t);

To enable MEX generation of the testbench, functions that do not support C code generation, such as fprintf, are declared as extrinsic. Extrinsic functions are excluded from MEX generation and are executed in interpreted mode.
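For example, declaring fprintf as extrinsic at the top of the testbench function keeps it out of the generated MEX:

    coder.extrinsic('fprintf');  % execute fprintf in interpreted mode during MEX simulation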

To simulate the design, generate MEX of the MATLAB testbench, FBVP_matlabTestBench.m, and execute it:

codegen("FBVP_matlabTestBench.m");
FBVP_matlabTestBench_mex;
100 frames have been processed in 15.55 seconds.
Average frame rate is 6.43 frames/second.

Visually inspect the DUT and reference outputs and verify that the testbench completes successfully. The testbench passes if every frame exceeds the SSIM threshold and the processed frame rate is printed to the Command Window.

FBVP_rgbDUTViewer.png

Debugging simulations with large frame sizes is impractical in interpreted mode due to long simulation times, while debugging a MEX simulation is challenging due to the lack of debug access into the generated code. To avoid both scenarios, a best practice is to develop and verify the algorithm and testbench using a thumbnail frame size. In most cases, the HDL-targeted design can be implemented with no dependence on frame size. In addition, when refactoring from an existing reference design, small iterative changes allow for faster identification of errors and targeted debugging. Once you are confident that the design and testbench work correctly, increase the frame size in the testbench and use MATLAB Coder to accelerate the simulation. Increasing the frame size requires only minor changes to the testbench.

Chrominance Preprocessing & Postprocessing

To deploy to hardware that uses a subsampled chrominance-based color space, such as YCbCr 4:2:2, you must perform color space conversion and chroma resampling at the input and output of the RGB-based algorithm.

FBVP_ycbcrPipeline.png

Enhance the MATLAB design function to introduce support for YCbCr 4:2:2 I/O. The design function FBVP_ycbcrDUT.m includes the preprocessing and postprocessing stages. The Preprocessing stage uses hdlresmplchroma to resample the YCbCr 4:2:2 input to YCbCr 4:4:4, allowing for color space conversion to RGB using hdlycbcr2rgb. The output of the algorithm is converted back to YCbCr using hdlrgb2ycbcr before the chroma components are resampled from 4:4:4 to 4:2:2 to produce the final YCbCr 4:2:2 output.

This code excerpt shows the Preprocessing and Postprocessing stages:

function [..] = FBVP_ycbcrDUT(..)

    % --- Setup
    % Noise removal & sharpening...
    % Edge detection...
    
    % --- Preprocessing (YCbCr 4:2:2 I/O Only)
    [Y, Cb, Cr] = hdlresmplchroma(Y, Cb, Cr);                                       % 4:2:2 to 4:4:4
    [R, G,  B ] = hdlycbcr2rgb(Y, Cb, Cr);                                          % YCbCr to RGB
    
    % --- Noise Removal & Sharpening...
    % --- Emphasize Edges...
    % --- Edge Detection...
    % --- Edge Overlay...
    
    % --- Postprocessing (YCbCr 4:2:2 I/O Only)
    [YOut, CbOut, CrOut] = hdlrgb2ycbcr(R, G, B);                                   % RGB to YCbCr
    [YOut, CbOut, CrOut] = hdlResampleChroma444To422FixedPoint(YOut, CbOut, CrOut); % 4:4:4 to 4:2:2

The reference design, FBVP_iptPipeline.m, can be enhanced to accept YCbCr 4:2:2 inputs for validation of the Preprocessing and Postprocessing stages. You can achieve this by using the rgb2ycbcr and imadjust Image Processing Toolbox functions with a custom chroma resampling function, or by using the vision.ChromaResampler Computer Vision Toolbox™ System object. This is not modeled in this example.
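As a sketch of one possible approach, the chroma subsampling for the reference could use the vision.ChromaResampler System object (illustrative only; this is not implemented in the example files):

    ycbcr = rgb2ycbcr(frameIn);                 % RGB to YCbCr 4:4:4
    resampler = vision.ChromaResampler( ...
        Resampling="4:4:4 to 4:2:2");           % subsample the chroma components
    [Cb422, Cr422] = resampler(ycbcr(:,:,2), ycbcr(:,:,3));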

To verify the preprocessing, postprocessing, and generated HDL, an HDL testbench is required. The HDL testbench, FBVP_hdlTestBench.m, is derived from the MATLAB testbench. It converts the corrupted input video frame from the RGB to the YCbCr color space and subsamples the chroma components to model the expected AXI-Stream input. The testbench calls the HDL DUT, FBVP_ycbcrDUT.m, to process each YCbCr 4:2:2 frame. The viewer object displays the YCbCr input, the DUT output, and an RGB version of the DUT output.

FBVP_hdlTestBench.png

To simulate the HDL design, generate MEX of the HDL testbench, FBVP_hdlTestBench.m, and execute it:

codegen("FBVP_hdlTestBench.m");
FBVP_hdlTestBench_mex;

FBVP_ycbcrDUTViewer.png

HDL Code Generation & Verification

Once your design is working in simulation, use HDL Coder to generate HDL code for the FBVP_ycbcrDUT function.

Generating HDL code using the HDL Coder frame-to-sample optimization produces an entity with I/O suited to AXI-Stream interfaces and provides input and output FIFOs to handle memory contention and backpressure. You can configure the FIFO sizes in the HDL Coder settings.

FBVP_f2sOverview.png

Use the hdlsetuptoolpath (HDL Coder) function to configure synthesis tools, and the vsim or nclaunch (HDL Verifier™) functions to configure HDL simulators.
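For example, to configure AMD Vivado® as the synthesis tool (the installation path shown is a placeholder for your system):

    hdlsetuptoolpath('ToolName','Xilinx Vivado', ...
        'ToolPath','C:\Xilinx\Vivado\2023.1\bin\vivado.bat');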

Programmatic Code Generation & Simulation

Enter the following commands to create a MATLAB Coder configuration object, hdlcfg, which generates HDL code and an HDL testbench, and simulates the generated code using ModelSim®. To enable the frame-to-sample optimization, set the AggressiveDataflowConversion and FrameToSampleConversion properties of the configuration object to true. For details of the properties of the HDL configuration object, see coder.HdlConfig (HDL Coder).

% Default MATLAB Coder HDL config
hdlcfg = coder.config('hdl');

% Sources
hdlcfg.DesignFunctionName = "FBVP_ycbcrDUT.m";
hdlcfg.TestBenchName      = "FBVP_hdlTestBench.m";

% Enable testbench generation and simulation
hdlcfg.GenerateHDLTestBench  = true;
hdlcfg.SimulateGeneratedCode = true;

% Enable frame-to-sample conversion
hdlcfg.AggressiveDataflowConversion = true;
hdlcfg.FrameToSampleConversion      = true;

% DUT argument prototype
frameIn = VideoReader("rhinos.avi").readFrame;

% Generate HDL & HDL testbench, then simulate
codegen -config hdlcfg -args {frameIn(:,:,1), frameIn(:,:,2), frameIn(:,:,3)}

Graphical Interface Code Generation & Simulation

Enter the following command to create a new HDL Coder project:

coder -hdlcoder -new FrameBasedVideoPipelineProject

Add the file FBVP_ycbcrDUT.m to the project as the MATLAB Function and FBVP_hdlTestBench.m as the MATLAB testbench.

Refer to Get Started with MATLAB to HDL Workflow (HDL Coder) for a tutorial on creating and populating MATLAB HDL Coder projects.

Launch the Workflow Advisor. In the Workflow Advisor, enable the Frame to Sample optimization:

HDL Code Generation > Optimizations > General > Aggressive Dataflow Conversion

HDL Code Generation > Frame to Sample Conversion > Enable Frame to Sample Conversion (Requires Aggressive Dataflow Conversion)

Right-click the Code Generation step. Choose the option Run to selected task to run all the steps from the beginning through HDL code generation.

Examine the generated HDL code by clicking the links in the log window.

Enable HDL testbench generation and simulation by selecting the Generate test bench and Simulate generated test bench options in the Verification - Verify with HDL Test Bench step. Right-click the Verification - Verify with HDL Test Bench step. Choose the option Run to selected task to run all the required steps from the beginning through simulation of the generated HDL code.

Resources

The design was synthesized and placed and routed on an AMD Zynq® UltraScale+™ MPSoC ZCU102.

Resource   Usage       Usage (%)
LUT        19,170      6.99
LUTRAM     1,829       1.27
FF         13,876      2.53
BRAM       8.5         0.93
DSP        75          2.98
Fmax       150.0 MHz   -

When configured for a 1080p input, the design meets the timing constraints required to operate at 60 frames per second. The generated HDL code contains a discrete entity for each frame-based function that uses hdl.npufun or hdl.iterfun. Because the frame-based Vision HDL Toolbox functions hdlrgb2gray, hdlrgb2ycbcr, and hdlycbcr2rgb do not use these constructs, their logic appears directly in the DUT entity.

See Also

The design functions implemented in this example can be used in Simulink through the MATLAB Function block. See the Frame-Based Video Pipeline in Simulink example for more details.