Deployment and Classification of Webcam Images on NVIDIA Jetson TX2 Platform

This example shows how to generate CUDA® code from a DAGNetwork object and deploy the generated code onto the NVIDIA® Jetson TX2 board using the GPU Coder™ Support Package for NVIDIA GPUs. This example uses the resnet50 deep learning network to classify images from a USB webcam video stream.

Prerequisites

Target Board Requirements

  • NVIDIA Jetson Tegra TX2 embedded platform.

  • Ethernet crossover cable to connect the target board and host PC (if the target board cannot be connected to a local network).

  • USB camera to connect to the TX2.

  • NVIDIA CUDA toolkit installed on the board.

  • NVIDIA cuDNN library (v5 or higher) on the target.

  • OpenCV 3.0 or higher library on the target for reading and displaying images/video.

  • Environment variables on the target for the compilers and libraries. For information on the supported versions of the compilers and libraries and their setup, see Install and Setup Prerequisites for NVIDIA boards.

Development Host Requirements

  • GPU Coder for code generation. For an overview and tutorials, visit the GPU Coder product page.

  • Deep Learning Toolbox™ to use a DAGNetwork object.

  • GPU Coder Interface for Deep Learning Libraries support package. To install this support package, use the Add-On Explorer.

  • NVIDIA CUDA toolkit on the host.

  • Environment variables on the host for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-party Products. For setting up the environment variables, see Environment Variables.
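As an illustration, the environment variables can also be set from within MATLAB by using the setenv function. The paths below are hypothetical and must be replaced with the actual install locations on your host:

```matlab
% Hypothetical install locations -- replace with the paths on your host.
setenv('CUDA_PATH','/usr/local/cuda');
setenv('NVIDIA_CUDNN','/usr/local/cudnn');

% Make the CUDA toolchain visible on the system path (Linux-style separator).
setenv('PATH',[getenv('PATH') ':/usr/local/cuda/bin']);
```

Variables set this way apply only to the current MATLAB session; for a persistent setup, define them in your shell configuration as described in the Environment Variables documentation.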

Create a Folder and Copy Relevant Files

The following line of code creates a folder in your current working directory on the host and copies all the relevant files into it. If you cannot generate files in this folder, change your current working directory and run the command again.

gpucoderdemo_setup('gpucoderdemo_resnet50');

Verify NVIDIA Support Package Installation on Host

Use the checkHardwareSupportPackageInstall function to verify that the host system is compatible with running this example.

checkHardwareSupportPackageInstall();

Connect to the NVIDIA Hardware

The GPU Coder Support Package for NVIDIA GPUs uses an SSH connection over TCP/IP to execute commands while building and running the generated CUDA code on the Jetson platform. You must therefore connect the target platform to the same network as the host computer or use an Ethernet crossover cable to connect the board directly to the host computer. Refer to the NVIDIA documentation on how to set up and configure your board.

To communicate with the NVIDIA hardware, you must create a live hardware connection object by using the jetson function. You must know the host name or IP address, username, and password of the target board to create a live hardware connection object.

hwobj = jetson('host-name','username','password');

NOTE:

In case of a connection failure, a diagnostics error message is reported on the MATLAB command line. The most likely cause of a failed connection is an incorrect IP address or host name.

When there are multiple live connection objects for different targets, the code generator performs the remote build on the target for which the live object was most recently created. To choose a hardware board for the remote build, use the setupCodegenContext() method of the respective live hardware object. If only one live connection object exists, calling this method is not necessary.

hwobj.setupCodegenContext;

Verify the GPU Environment on the Target

Use the coder.checkGpuInstall function to verify that the compilers and libraries needed for running this example are set up correctly.

envCfg = coder.gpuEnvConfig('jetson');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
envCfg.HardwareObject = hwobj;
coder.checkGpuInstall(envCfg);

About the ResNet-50 Network

The resnet50_wrapper.m function uses a pretrained ResNet-50 network to classify images. ResNet-50 is a DAG network trained on more than a million images from the ImageNet database. The output contains the categorical scores for each class that the image can belong to.

type resnet50_wrapper
function out = resnet50_wrapper(im) %#codegen
% Wrapper function to call the ResNet-50 predict function.

%   Copyright 2019 The MathWorks, Inc.

% This example uses OpenCV for reading frames from a web camera
% and displaying the output image. Update the build information to
% link against the OpenCV library available on the target.
opencv_link_flags = '`pkg-config --cflags --libs opencv`';
coder.updateBuildInfo('addLinkFlags',opencv_link_flags);

% Use a persistent variable so that the network is loaded only once
% instead of on every call.
persistent rnet;
if isempty(rnet)
    rnet = resnet50();
end
out = rnet.predict(im);
end

Generate & Deploy CUDA Code on the Target

This example uses resnet50_wrapper.m as the entry-point function for code generation. To generate a CUDA executable that can be deployed onto an NVIDIA target, create a GPU Coder configuration object for generating an executable.

cfg = coder.gpuConfig('exe');

Use the coder.hardware function to create a configuration object for the Jetson platform and assign it to the Hardware property of the code configuration object cfg.

cfg.Hardware = coder.hardware('NVIDIA Jetson');

Set the deep learning configuration to 'cudnn' or 'tensorrt'.

cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');

In this example, code generation is done using a sample image as input. However, after deployment, the webcam stream is fed as input to the executable.

Sample image input for code generation

im = single(imread('peppers.png'));
im = imresize(im,[224,224]);

The custom main file is coded to take video as input and classify each frame in the video sequence. The custom main file is a wrapper that calls the predict function in the generated code. Post-processing steps, such as displaying the classification output on the input frame, are added in the main file by using OpenCV interfaces.

cfg.CustomInclude = fullfile('codegen','exe','resnet50_wrapper','examples');
cfg.CustomSource = fullfile('main.cu');

To generate CUDA code and deploy it onto the target, use the codegen function and pass the GPU code configuration object. After code generation completes on the host, the generated files are copied over and built on the target in its workspace directory.

codegen -config cfg -args {im} resnet50_wrapper -report

Run the Application on the Target

Copy the synsetWords.txt label file from the host computer to the target device by using the putFile command.

hwobj.putFile('synsetWords.txt',hwobj.workspaceDir);

Use the runApplication method of the hardware object to launch the application on the target hardware. The application is located in the workspace directory of the target.

hwobj.runApplication('resnet50_wrapper');

ResNet-50 classification output on the Jetson TX2

Kill the Application

Use the killApplication method of the hardware object to kill the running application on the target.

hwobj.killApplication('resnet50_wrapper');

Run the cleanup function to remove the generated files and return to the original folder.

cleanup