Build and Run an Executable on NVIDIA Hardware Using the GPU Coder App

The GPU Coder™ Support Package for NVIDIA® GPUs uses the GPU Coder product to generate CUDA® code (kernels) from a MATLAB® algorithm. These kernels run on any CUDA-enabled GPU platform. The support package automates the deployment of the generated CUDA code on GPU hardware platforms such as Jetson or DRIVE.

Learning Objectives

In this tutorial, you learn how to:

  • Prepare your MATLAB code for CUDA code generation by using the kernelfun pragma.

  • Create and set up a GPU Coder project.

  • Change settings to connect to the NVIDIA target board.

  • Generate and deploy a CUDA executable on the target board.

  • Run the executable on the board and verify the results.

Before starting this tutorial, familiarize yourself with the GPU Coder app. For more information, see Code Generation by Using the GPU Coder App (GPU Coder).

Tutorial Prerequisites

Target Board Requirements

  • NVIDIA DRIVE or Jetson embedded platform.

  • Ethernet crossover cable to connect the target board and host PC (if the target board cannot be connected to a local network).

  • NVIDIA CUDA toolkit installed on the board.

  • Environment variables on the target for the compilers and libraries. For information on the supported versions of the compilers and libraries and their setup, see Install and Setup Prerequisites for NVIDIA Boards.

Development Host Requirements

  • GPU Coder for code generation. For an overview and tutorials, see the Getting Started with GPU Coder (GPU Coder) page.

  • NVIDIA CUDA toolkit on the host.

  • Environment variables on the host for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-party Products (GPU Coder). For setting up the environment variables, see Environment Variables (GPU Coder).

Example: Vector Addition

This tutorial uses a simple vector addition example to demonstrate the build and deployment workflow on NVIDIA GPUs. Create a MATLAB function myAdd.m that acts as the entry point for code generation. Alternatively, use the files in the Getting Started with the GPU Coder Support Package for NVIDIA GPUs example for this tutorial. The easiest way to create CUDA code for this function is to place the coder.gpu.kernelfun pragma in the function. When GPU Coder encounters the kernelfun pragma, it attempts to parallelize all the computation within the function and then map it to the GPU.

function out = myAdd(inp1,inp2) %#codegen
coder.gpu.kernelfun();
out = inp1 + inp2;
end
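
To make the intent of the entry-point function concrete, the following is a minimal host-side C++ sketch of the computation that myAdd describes. It is not the code that GPU Coder generates. The names refAdd and makeRamp are hypothetical and are used only for illustration. Each loop iteration is independent of the others, which is what makes this kind of computation a candidate for parallelization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for myAdd: out[i] = inp1[i] + inp2[i].
// No iteration reads a value written by another iteration.
std::vector<double> refAdd(const std::vector<double>& inp1,
                           const std::vector<double>& inp2)
{
    assert(inp1.size() == inp2.size());  // inputs must have the same length
    std::vector<double> out(inp1.size());
    for (std::size_t i = 0; i < inp1.size(); ++i) {
        out[i] = inp1[i] + inp2[i];
    }
    return out;
}

// Hypothetical helper: builds the vector 0, 1, ..., n-1 as test input.
std::vector<double> makeRamp(std::size_t n)
{
    std::vector<double> v(n);
    for (std::size_t i = 0; i < n; ++i) {
        v[i] = static_cast<double>(i);
    }
    return v;
}
```

Calling refAdd(makeRamp(100), makeRamp(100)) yields a 100-element vector whose element i equals 2*i, which can serve as a reference when checking results from the target.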

Custom Main File

To generate a CUDA executable that can be deployed to an NVIDIA target, create a custom main file, main.cu, and its header file, main.h, that call the entry-point function in the generated code. The main file passes a vector containing the integers 0 through 99 as both inputs to the entry-point function and writes the results to the binary file myAdd.bin.

//main.cu
// Include Files
#include "myAdd.h"
#include "main.h"
#include "myAdd_terminate.h"
#include "myAdd_initialize.h"
#include <stdio.h>

// Function Declarations
static void argInit_1x100_real_T(real_T result[100]);
static void main_myAdd();

// Function Definitions
static void argInit_1x100_real_T(real_T result[100])
{
  int32_T idx1;

  // Initialize each element.
  for (idx1 = 0; idx1 < 100; idx1++) {
    result[idx1] = (real_T) idx1;
  }
}

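void writeToFile(real_T result[100])
{
    // Write the 100 output values to myAdd.bin as raw real_T values.
    FILE *fid = fopen("myAdd.bin", "wb");
    if (fid == NULL) {
        printf("Unable to open myAdd.bin for writing.\n");
        return;
    }
    fwrite(result, sizeof(real_T), 100, fid);
    fclose(fid);
}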

static void main_myAdd()
{
  real_T out[100];
  real_T b[100];
  real_T c[100];

  argInit_1x100_real_T(b);
  argInit_1x100_real_T(c);
  
  myAdd(b, c, out);
  writeToFile(out);  // Write the output to a binary file
}

// Main routine
int32_T main(int32_T, const char * const [])
{
  // Initialize the application.
  myAdd_initialize();

  // Invoke the entry-point functions.
  main_myAdd();

  // Terminate the application.
  myAdd_terminate();
  return 0;
}

//main.h
#ifndef MAIN_H
#define MAIN_H

// Include Files
#include <stddef.h>
#include <stdlib.h>
#include "rtwtypes.h"
#include "myAdd_types.h"

// Function Declarations
extern int32_T main(int32_T argc, const char * const argv[]);

#endif
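
The main file writes its results as raw binary values, so any program that reads myAdd.bin must assume the same element type and count. The following is a minimal sketch of that round trip in standard C++, assuming real_T is double (consistent with the fread(fId,'double') call used later in this tutorial). The helpers writeVector and readVector are hypothetical names and are not part of the generated code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: writes the values as raw doubles, the same layout
// that fwrite(result, sizeof(real_T), 100, fid) produces when real_T is double.
// Returns true only if the file opened, every element was written,
// and the file closed without error.
bool writeVector(const char* path, const std::vector<double>& values)
{
    std::FILE* fid = std::fopen(path, "wb");
    if (fid == nullptr) {
        return false;
    }
    const std::size_t written =
        std::fwrite(values.data(), sizeof(double), values.size(), fid);
    const bool closed = (std::fclose(fid) == 0);
    return closed && written == values.size();
}

// Hypothetical helper: reads back up to `count` raw doubles from the file.
// The returned vector is shorter than `count` if the file is truncated,
// and empty if the file cannot be opened.
std::vector<double> readVector(const char* path, std::size_t count)
{
    std::FILE* fid = std::fopen(path, "rb");
    if (fid == nullptr) {
        return {};
    }
    std::vector<double> values(count);
    const std::size_t got =
        std::fread(values.data(), sizeof(double), count, fid);
    std::fclose(fid);
    values.resize(got);
    return values;
}
```

Checking the element count returned by fwrite and fread is a cheap way to detect a partially written or truncated file before comparing values.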

GPU Coder App

To open the GPU Coder app, on the MATLAB toolstrip Apps tab, under Code Generation, click the GPU Coder app icon. You can also open the app by typing gpucoder in the MATLAB Command Window.

  1. The app opens the Select source files page. Select myAdd.m as the entry-point function. Click Next.

  2. In the Define Input Types window, enter myAdd(1:100,1:100) and click Autodefine Input Types, then click Next.

  3. You can initiate the Check for Run-Time Issues process or click Next to go to the Generate Code step.

  4. Set the Build type to Executable and the Hardware Board to NVIDIA Jetson.

  5. Click More Settings. On the Custom Code panel, enter the custom main file main.cu in the Additional source files field. The custom main file and the header file must be in the same location as the entry-point file.

  6. On the Hardware panel, enter the device address, user name, password, and build folder for the board.

  7. Close the Settings window and click Generate. The software generates CUDA code and deploys the executable to the folder specified. Click Next and close the app.

Run the Executable and Verify the Results

In the MATLAB Command Window, use the runApplication() method of the hardware object to start the executable on the target hardware.

hwobj = jetson;
pid = runApplication(hwobj,'myAdd');
### Launching the executable on the target...
Executable launched successfully with process ID 26432.
Displaying the simple runtime log for the executable...

Copy the output bin file myAdd.bin to the MATLAB environment on the host and compare the computed results with the results from MATLAB.

outputFile = [hwobj.workspaceDir '/myAdd.bin'];
getFile(hwobj,outputFile);

% Simulation result from MATLAB.
simOut = myAdd(0:99,0:99);

% Read the result binary file copied from the target.
fId  = fopen('myAdd.bin','r');
tOut = fread(fId,'double');
fclose(fId);

% Compare using the absolute difference so that negative deviations count.
err = abs(simOut - tOut');
fprintf('Maximum deviation between MATLAB Simulation output and GPU coder output on Target is: %f\n', max(err(:)));
Maximum deviation between MATLAB Simulation output and GPU coder output on Target is: 0.000000
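
A maximum taken over signed differences can hide negative deviations, so a comparison of this kind usually takes the absolute value of each difference first. The following is a minimal C++ sketch of that metric, for cases where the check runs outside MATLAB. The name maxAbsDeviation is hypothetical and is used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: largest absolute element-wise difference between
// the expected values and the values read back from the target.
double maxAbsDeviation(const std::vector<double>& expected,
                       const std::vector<double>& actual)
{
    assert(expected.size() == actual.size());  // vectors must match in length
    double worst = 0.0;
    for (std::size_t i = 0; i < expected.size(); ++i) {
        worst = std::max(worst, std::fabs(expected[i] - actual[i]));
    }
    return worst;
}
```

For this tutorial, the inputs are small integers stored as doubles, so the sums are exactly representable and a deviation of zero is the expected outcome. For other algorithms, comparing the result against a small tolerance is generally more appropriate than testing for exact equality.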
