Cross Compile Deep Learning Code for ARM Neon Targets
This example shows how to cross-compile the generated deep learning code to create a library or an executable, and then deploy the library or executable on an ARM® target such as Hikey 960 or Rock 960. This example uses the codegen command.
Cross compiling the deep learning code for ARM® targets involves these steps:
Configure the installed cross-compiler toolchain to perform compilation on the host MATLAB®. The compilation happens when you run the codegen command in MATLAB on the host computer.
Use the codegen command to build the generated code and create a library or an executable on the host computer.
Copy the generated library or executable and other supporting files to the target hardware. If you generate a library on the host computer, compile the copied makefile on the target to create an executable.
Run the generated executable on the target ARM hardware.
You can use this workflow for any ARM Neon target that supports the Neon|SIMD instruction set. This example is supported only for host Linux® platforms.
Prerequisites
ARM processor that supports the Neon|SIMD extension
ARM Compute Library (on the host computer)
MATLAB® Coder™
The support package MATLAB Coder Interface for Deep Learning
Deep Learning Toolbox™
The support package Deep Learning Toolbox Model for Inception-v3 Network
Image Processing Toolbox™
For deployment on armv7 (32 bit Arm Architecture) target, GNU/GCC g++-arm-linux-gnueabihf toolchain
For deployment on armv8 (64 bit Arm Architecture) target, GNU/GCC g++-aarch64-linux-gnu toolchain
Environment variables for the cross compilers and libraries
For information about how to install the cross-compiler toolchain and set up the associated environment variable, see Cross-Compile Deep Learning Code That Uses ARM Compute Library.
The ARM Compute library version that this example uses might not be the latest version that code generation supports. For information about supported versions of libraries and about environment variables, see Prerequisites for Deep Learning with MATLAB Coder.
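Because later steps read the ARM_COMPUTELIB environment variable, it can help to confirm that the variable is visible to the MATLAB session before you generate code. The following is a minimal sketch and not part of the shipped example. It assumes only that ARM_COMPUTELIB points to a folder that contains a lib subfolder, as the copy commands later in this example imply.

```matlab
% Sketch: check that ARM_COMPUTELIB is set and contains a lib folder.
% The layout <ARM_COMPUTELIB>/lib is an assumption taken from the copy
% commands used later in this example.
armComputeRoot = getenv('ARM_COMPUTELIB');
if isempty(armComputeRoot)
    warning('ARM_COMPUTELIB is not set in this MATLAB session.');
elseif ~isfolder(fullfile(armComputeRoot, 'lib'))
    warning('No lib folder found under %s.', armComputeRoot);
else
    fprintf('ARM Compute library found at %s\n', armComputeRoot);
end
```

If the variable is empty, set it in the shell that launches MATLAB so that the session inherits it.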
The code lines in this example are commented out. Uncomment them before you run the example.
This example is not supported in MATLAB Online.
The inception_predict_arm Entry-Point Function
This example uses the Inception-V3 image classification network. A pretrained Inception-V3 network for MATLAB is available in the support package Deep Learning Toolbox Model for Inception-V3 Network. The inception_predict_arm entry-point function loads the Inception-V3 network into a persistent network object. On subsequent calls to the function, the persistent object is reused.
type inception_predict_arm
function out = inception_predict_arm(in)
persistent net;
if isempty(net)
    net = coder.loadDeepLearningNetwork('inceptionv3','inceptionv3');
end
out = net.predict(in);
end
Set up a Deep Learning Configuration Object
Create a coder.ARMNEONConfig object. Specify the version of the ARM Compute library and the ARM architecture.
dlcfg = coder.DeepLearningConfig('arm-compute');
dlcfg.ArmComputeVersion = '20.02.1';
dlcfg.ArmArchitecture = 'armv8'; % or 'armv7'
To classify the input image peppers.png, convert the image to a text file.
generateImagetoTxt('peppers.png');
First Approach: Create Static Library for Entry-Point Function on Host
In this approach, you first cross-compile the generated code to create a static library on the host computer. You then transfer the generated static library, the ARM Compute library files, the makefile, and other supporting files to the target hardware. You run the makefile on the target hardware to generate the executable. Finally, you run the executable on the target hardware.
Set Up a Code Generation Configuration Object
Create a code generation configuration object for a static library. Specify the target language as C++.
cfg = coder.config('lib');
cfg.TargetLang = 'C++';
Attach the deep learning configuration object to the code generation configuration object.
cfg.DeepLearningConfig = dlcfg;
Configure the Cross-Compiler Toolchain
Configure the cross-compiler toolchain based on the ARM Architecture of the target device.
cfg.Toolchain = 'Linaro AArch64 Linux v6.3.1'; % When the ARM architecture is armv8
% cfg.Toolchain = 'Linaro AArch32 Linux v6.3.1'; % When the ARM architecture is armv7
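If you switch between 32-bit and 64-bit targets, one way to keep the toolchain consistent with the architecture is to derive the toolchain name from a single variable. This is an illustrative sketch. The variable names armArch and toolchainName are hypothetical, and the toolchain names are the two used in this example.

```matlab
% Sketch: choose the toolchain name from the target architecture.
% armArch and toolchainName are hypothetical names for illustration.
armArch = 'armv8';   % use 'armv7' for a 32-bit target

switch armArch
    case 'armv8'
        toolchainName = 'Linaro AArch64 Linux v6.3.1';
    case 'armv7'
        toolchainName = 'Linaro AArch32 Linux v6.3.1';
    otherwise
        error('Unsupported ARM architecture: %s', armArch);
end
disp(toolchainName)

% With the configuration objects from this example in the workspace,
% the values could then be applied as follows:
% dlcfg.ArmArchitecture = armArch;
% cfg.Toolchain = toolchainName;
```

Keeping both settings tied to one variable reduces the chance of pairing an armv7 deep learning configuration with the AArch64 toolchain, or the reverse.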
Generate Static Library on Host Computer by Using codegen
Use the codegen command to generate code for the entry-point function, build the generated code, and create a static library for the target ARM architecture.
codegen -config cfg inception_predict_arm -args {ones(299,299,3,'single')} -d arm_compute_cc_lib -report
Copy the Generated Cross-Compiled Static Library to Target Hardware
Copy the static library, the bin files, and the header files from the generated folder arm_compute_cc_lib to the target ARM hardware. In this code line and other code lines that follow, replace:
password with your password
username with your username
hostname with the name of your device
targetDir with the destination folder for the files
system('sshpass -p password scp -r arm_compute_cc_lib/*.bin arm_compute_cc_lib/*.lib arm_compute_cc_lib/*.h arm_compute_cc_lib/*.hpp username@hostname:targetDir/');
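Rather than editing each command string by hand, you can assemble the copy command from variables. The sketch below is one possible approach. The variable names are hypothetical placeholders, and the system call is left commented out so that nothing is copied until you supply real values.

```matlab
% Placeholder values (hypothetical); replace them with your own.
password  = 'password';
username  = 'username';
hostname  = 'hostname';
targetDir = 'targetDir';

% File patterns to copy from the code generation folder of this example.
srcDir   = 'arm_compute_cc_lib';
patterns = {'*.bin', '*.lib', '*.h', '*.hpp'};
sources  = strjoin(strcat(srcDir, '/', patterns), ' ');

% Assemble the same scp command that the example shows as a literal string.
cmd = sprintf('sshpass -p %s scp -r %s %s@%s:%s/', ...
    password, sources, username, hostname, targetDir);
disp(cmd)

% Uncomment to run the copy once the placeholders hold real values.
% status = system(cmd);
```

Defining the connection details once also makes it easier to reuse them in the later ssh and scp steps.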
Copy the ARM Compute Library Files to Target Hardware
The executable uses the ARM Compute library files at run time. The target board does not need the header files to build or run the executable. Copy the library to the desired path.
system(['sshpass -p password scp -r ' fullfile(getenv('ARM_COMPUTELIB'),'lib') ' username@hostname:targetDir/']);
Copy Supporting Files to Target Hardware
Copy these files to the target ARM hardware:
Makefile Makefile_Inceptionv3 to generate the executable from the static library.
Input image inputimage.txt that you want to classify.
The text file synsetWords.txt that contains the ClassNames returned by net.Layers(end).Classes
The main wrapper file main_inception_arm.cpp that calls the code generated for the inception_predict_arm function.
system('sshpass -p password scp synsetWords.txt ./Makefile_Inceptionv3 ./inputimage.txt ./main_inception_arm.cpp username@hostname:targetDir/');
Create the Executable on the Target
Compile the makefile on the target to generate the executable from the static library. This makefile links the static library with the main wrapper file main_inception_arm.cpp and generates the executable.
system('sshpass -p password ssh username@hostname "make -C targetDir -f Makefile_Inceptionv3 arm_inceptionv3 "');
Run the Executable on the Target
Run the generated executable on the target. Before running the executable, export LD_LIBRARY_PATH so that it points to the ARM Compute library files.
system('sshpass -p password ssh username@hostname "export LD_LIBRARY_PATH=targetDir/lib; cd targetDir;./inception_predict_arm.elf inputimage.txt out.txt"');
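The system function can return the exit status and the command output, which makes failures on the target easier to spot. The following sketch demonstrates the pattern with a harmless local echo command, an assumption made so that the snippet runs without target hardware. In practice, substitute one of the sshpass commands from this example.

```matlab
% Stand-in command so the sketch runs without target hardware.
% Replace it with one of the sshpass commands from this example.
cmd = 'echo deployment-check';

% Capture the exit status and the text that the command printed.
[status, cmdout] = system(cmd);
if status ~= 0
    error('Command failed with status %d:\n%s', status, cmdout);
end
fprintf('Command succeeded: %s', cmdout);
```

A nonzero status from ssh or scp usually indicates a connection, authentication, or path problem, so checking it before moving to the next step avoids misleading errors later in the workflow.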
Second Approach: Create Executable for Entry-Point Function on Host
In this approach, you first cross-compile the generated code to create an executable on the host computer. You then transfer the generated executable, the ARM Compute library files, and other supporting files to the target hardware. Finally, you run the executable on the target hardware.
Set Up a Code Generation Configuration Object
Create a code generation configuration object for generating an executable. Set the target language as C++.
cfg = coder.config('exe');
cfg.TargetLang = 'C++';
Attach the deep learning configuration object to the code generation configuration object.
cfg.DeepLearningConfig = dlcfg;
Declare the main wrapper file main_inception_arm.cpp as the custom source file.
cfg.CustomSource = 'main_inception_arm.cpp';
Configure the Cross-Compiler Toolchain
Configure the cross-compiler toolchain based on the ARM Architecture of the target device.
cfg.Toolchain = 'Linaro AArch64 Linux v6.3.1'; % When the ARM architecture is armv8
% cfg.Toolchain = 'Linaro AArch32 Linux v6.3.1'; % When the ARM architecture is armv7
Generate Executable on the Host Computer by Using codegen
Use the codegen command to generate code for the entry-point function, build the generated code, and create an executable for the target ARM architecture.
codegen -config cfg inception_predict_arm -args {ones(299,299,3,'single')} -d arm_compute_cc_exe -report
Copy the Generated Executable to the Target Hardware
Copy the generated executable and the bin files to the target ARM hardware. In this code line and other code lines that follow, replace:
password with your password
username with your username
hostname with the name of your device
targetDir with the destination folder for the files
system('sshpass -p password scp -r arm_compute_cc_exe/*.bin username@hostname:targetDir/');
system('sshpass -p password scp inception_predict_arm.elf username@hostname:targetDir/');
Copy the ARM Compute Library Files to the Target Hardware
The executable uses the ARM Compute library files during runtime. It does not use header files at runtime. Copy the library files to the desired path.
system(['sshpass -p password scp -r ' fullfile(getenv('ARM_COMPUTELIB'),'lib') ' username@hostname:targetDir/']);
Copy Supporting Files to the Target Hardware
Copy these files to the target ARM hardware:
Input image inputimage.txt that you want to classify.
The text file synsetWords.txt that contains the ClassNames returned by net.Layers(end).Classes
The main wrapper file main_inception_arm.cpp that calls the code generated for the inception_predict_arm function.
system('sshpass -p password scp synsetWords.txt ./inputimage.txt ./main_inception_arm.cpp username@hostname:targetDir/');
Run the Executable on the Target Hardware
Run the generated executable on the target. Before running the executable, export LD_LIBRARY_PATH so that it points to the ARM Compute library files.
system('sshpass -p password ssh username@hostname "export LD_LIBRARY_PATH=targetDir/lib; cd targetDir;./inception_predict_arm.elf inputimage.txt out.txt"');
Transfer the Output Data from Target to MATLAB
Copy the generated output back to the current MATLAB session on the host computer.
system('sshpass -p password scp username@hostname:targetDir/out.txt ./');
Map Prediction Scores to Labels
Map the top five prediction scores to corresponding labels in the trained network.
outputImage = mapPredictionScores;
imshow(outputImage);
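The mapPredictionScores helper ships with the example, and its implementation is not shown here. As a rough illustration of what selecting the top five scores involves, the following sketch sorts a small made-up score vector and prints the five best labels. The scores and label names are invented for illustration and are not output from the network.

```matlab
% Made-up scores and labels for illustration only. A real run would use
% the values in out.txt and the class names in synsetWords.txt.
scores = [0.01 0.59 0.05 0.20 0.02 0.10 0.03];
labels = {'classA', 'classB', 'classC', 'classD', ...
          'classE', 'classF', 'classG'};

% Sort the scores in descending order and keep the five largest.
[sortedScores, order] = sort(scores, 'descend');
topIdx = order(1:5);

% Print each of the top five labels with its score.
for k = 1:5
    fprintf('%s: %.2f\n', labels{topIdx(k)}, sortedScores(k));
end
```

The same indexing pattern applies to a full score vector as long as the labels are stored in the same order as the scores.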
See Also
coder.ARMNEONConfig | coder.DeepLearningConfig | coder.hardware