Main Content

Vehicle Detection and Tracking in Lidar Point Clouds Using Voxel R-CNN

Since R2026a

This example shows how to detect and track vehicles using point cloud data captured by a lidar sensor mounted on an ego vehicle. To detect vehicles, you use a voxel region-based convolutional neural network (Voxel R-CNN), and to track the detected vehicles, you use a joint probabilistic data association (JPDA) tracker.

This example requires a CUDA® enabled NVIDIA® GPU.

Load Lidar Data

Download a ZIP file containing a subset of sensor data from the PandaSet data set and then unzip the file. This data set contains five sequences of lidar and camera data along with their sensor information. The size of the downloaded data set is 1.29 GB.

tmpFolder = tempdir;
dataFileName = "PandasetData090to094.zip";
url = "https://ssd.mathworks.com/supportfiles/driving/data/" + dataFileName;
filePath = fullfile(tmpFolder,dataFileName);
if ~isfile(filePath)
    websave(filePath,url);
end

dataFolder = fullfile(tmpFolder,"PandasetData090to094");
if(~exist(dataFolder,"dir"))
    mkdir(dataFolder)
end

unzip(filePath,dataFolder);

Get the lidar point cloud data, lidar positions, and camera images by using the helperLoadPandaSetData supporting function. This function returns a table containing data of all the sequences in the dataFolder directory.

sensorData = helperLoadPandaSetData(dataFolder);

sensorData is a table with these columns:

  • ptCldPath — Path of the lidar point cloud data in PCD format.

  • lidarPosition — Position of the lidar sensor in world coordinates. Units are in meters.

  • lidarHeading — Heading of the lidar sensor, specified as a quaternion in world coordinates.

  • frontCameraImagesPath — Path of the front camera images.

Create a CombinedDatastore object containing the lidar point clouds, lidar poses, and camera images by using the helperCreateCombinedDatastore helper function. The helper function also uses the lidarPosition and lidarHeading information in sensorData to create the lidar poses, required to transform the point clouds between the lidar sensor coordinate system and the world coordinate system, and stores them in a rigidtform3d object.

combinedDs = helperCreateCombinedDatastore(sensorData);

Read and display the first point cloud in the data set.

cdsData = read(combinedDs);
ptCld = cdsData{1};

To visualize the point cloud and image data, create a helperLidarObjectDetectionAndTrackingDisplay object and configure its display parameters.

displayObj = helperLidarObjectDetectionAndTrackingDisplay(XLimits=[-50 50],YLimits=[-50 50],ZLimits=[-3 10]);
initializeDisplay(displayObj,BackgroundColor=[0.3 0.3 0.3],PCFigureCamPos=[-20 7 40]);

Display the point cloud ptCld and its corresponding image.

updatePointCloudFigure(displayObj,ptCld,Title="Point Cloud Data");
img = cdsData{3};
updateImageFigure(displayObj,img,Title="Front View Image Data");

Figure Lidar Object Detection and Tracking contains 2 axes objects and other objects of type uipanel. Axes object 1 contains an object of type scatter. Hidden axes object 2 contains an object of type image.

Detect Vehicles from Lidar Point Cloud Data

Create a pretrained Voxel R-CNN object detector trained on the PandaSet point cloud data set.

lidarVehicleDetector = voxelRCNNObjectDetector("pandaset");

The pretrained model has been trained on point cloud data using a right-handed sensor coordinate system, where the positive x-axis points forward from the ego vehicle and the positive y-axis points to the left of the ego vehicle. However, the lidar point clouds in the PandaSet data set are in the world coordinate system. Therefore, you must use the lidar pose information to transform the point cloud data from the PandaSet sensor frame to the world frame.

reset(combinedDs);
ptClouds = pointCloud.empty(0,height(sensorData));
tforms = rigidtform3d.empty(0,height(sensorData));
for i = 1:height(sensorData)
    cdsData = read(combinedDs);
    ptCld = cdsData{1};
    tformSensorToWorld = cdsData{2};
    
    ptClouds(i) = pctransform(ptCld,invert(tformSensorToWorld));
    tforms(i) = tformSensorToWorld;
end

Specify these parameters for the lidar vehicle detection model.

  • threshold — Detection threshold, specified as a scalar in the range [0, 1]. The model ignores detections with a confidence score less than the threshold value. If you observe false detections, try increasing this value.

  • xLimits — Range of coordinates along the x-axis.

  • yLimits — Range of coordinates along the y-axis.

params.threshold = 0.5;
params.xLimits = [-12 12];
params.yLimits = [0 60];

Detect vehicles for each point cloud in ptClouds by using the helperDetectVehicles supporting function. This function uses the pretrained deep neural network model lidarVehicleDetector to detect vehicles from lidar data. Note that, depending on your hardware configuration, this function can take a significant amount of time to run.

vehicleDetections = helperDetectVehicles(lidarVehicleDetector,ptClouds,params);

vehicleDetections is an M-by-3 table containing the detected vehicle bounding boxes for each point cloud, their detection scores, and their labels. M is the total number of point clouds.

Select a point cloud, and extract the detected vehicle bounding boxes for that point cloud.

idx = 400;
ptCloud = ptClouds(idx);
bboxes = vehicleDetections.Bboxes{idx};

Overlay the vehicle detections on the point cloud, and display the overlaid point cloud.

indices = findPointsInROI(ptCloud,[params.xLimits params.yLimits -Inf Inf]);
ptCloud = select(ptCloud,indices,OutputSize="full");

displayObj = helperLidarObjectDetectionAndTrackingDisplay(XLimits=params.xLimits,YLimits=params.yLimits,ZLimits=[-3 10]);
initializeDisplay(displayObj,BackgroundColor=[0.3 0.3 0.3],PCFigureCamPos=[-10 -90  25]);
updatePointCloudFigure(displayObj,ptCloud,Title="Point Cloud Data with Detected Vehicles",Bboxes=bboxes,BboxColor="yellow");

Read the corresponding front-view camera image, and project the detected 3D vehicle bounding boxes onto the image frame using the helperProjectBboxesToImage function. Display the image with the projected bounding boxes.

img = imread(sensorData{idx,"frontCameraImagesPath"});
tform = tforms(idx);

projectedBoxes = helperProjectBboxesToImage(sensorData,idx,tform,bboxes);
updateImageFigure(displayObj,img,Title="Front View Image Data with Detected Vehicles",Bboxes=projectedBoxes,BboxColor="yellow");

Figure Lidar Object Detection and Tracking contains 2 axes objects and other objects of type uipanel. Axes object 1 contains an object of type scatter. Hidden axes object 2 contains an object of type image.

Generate Tracks from Detections

To track vehicle detections in this example, initialize the JPDA tracker using an interacting multiple model (IMM) filter with constant velocity and constant turn motion models.

Define a JPDA tracker using a trackerJPDA (Sensor Fusion and Tracking Toolbox) System object™ with these properties:

  • ConfirmationThreshold — Specified as [4 7], to confirm the track if the car is detected for at least 4 out of the past 7 detections.

  • DeletionThreshlod — Specified as [5 5], to remove a track if the track is missing from the 5 most recent consecutive detections.

  • AssignmentThreshold — Specified as [50 200], this sets the normalized maximum distance at which to associate a detection to a track.

vehicleTracker = trackerJPDA(FilterInitializationFcn=@helperLidarCuboidTrackingIMMFilter, ...
    TrackLogic="History", ...
    AssignmentThreshold=[50 200], ...
    ConfirmationThreshold=[4 7], ...
    DeletionThreshold=[5 5]);

Specify the measurement noise measNoise. The measurement noise describes the uncertainty in the measurements.

measNoise = blkdiag(0.25*eye(3),25,eye(3));

Process the vehicle detections for each frame, and assign tracks.

% Get the list of all bounding boxes for each frame from the detector output. 
detectedBboxes = vehicleDetections.Bboxes;

curTimeStamp = 0;
dt = 0.1;
initializeDisplay(displayObj,BackgroundColor=[0.3 0.3 0.3]);
for i = 1:height(sensorData)
    % Get the current frame from ptClouds.
    ptCloud = ptClouds(i);

    % Get the bounding boxes for the current frame.
    bboxes = detectedBboxes{i};

    % Read the front-view camera image.
    frontViewImage = imread(sensorData{i,"frontCameraImagesPath"});

    % Get the point cloud lidar sensor to world transform.
    tform = tforms(i);

    curTimeStamp = curTimeStamp + dt;

    % Convert the bounding boxes to object detections.
    measurements = helperAssembleDetections(bboxes,measNoise,curTimeStamp);

    % Return the updated tracks for the current timestamp.
    confirmedTracks = vehicleTracker(measurements,curTimeStamp);

    if ~isempty(confirmedTracks)
        % Update the display, if the tracks are not empty.

        % Extract bounding boxes from confirmedTracks
        [trackedBboxes,trackIds] = helperExractBboxesFromTracks(confirmedTracks);
        % Update the display with the current point cloud and tracked
        % bounding boxes with track ids.
        updatePointCloudFigure(displayObj,ptCloud,Bboxes=trackedBboxes,TrackIDs=trackIds);
        
        % Get the bounding boxes for the image and overlay them over the image.
        [projectedBoxesImage,validIndices] = helperProjectBboxesToImage(sensorData,i,tform,trackedBboxes);
        % Update the display with the current front-view image and
        % projected bounding boxes with track ids.
        updateImageFigure(displayObj,frontViewImage,Bboxes=projectedBoxesImage,TrackIDs=trackIds(validIndices));
    else
        % Update the display with the current point cloud and front-view image.
        updatePointCloudFigure(displayObj,ptCloud);
        updateImageFigure(displayObj,frontViewImage);
    end
    drawnow("limitrate");
end

Figure Lidar Object Detection and Tracking contains 2 axes objects and other objects of type uipanel. Axes object 1 contains an object of type scatter. Hidden axes object 2 contains an object of type image.

Supporting Functions

helperCreateCombinedDatastore — Create a CombinedDatastore object, from sensor data, containing the lidar point clouds, lidar poses, and camera images.

function combinedDs = helperCreateCombinedDatastore(sensorData)
% This function creates a combined datastore object containing lidar point
% clouds, lidar poses, and camera images.

% Create a datastore to store the lidar data.
pcFds = fileDatastore(sensorData.ptCldPath,ReadFcn=@pcread);

% Create an arrayDatastore to store the 3-D rigid transformations.
lidarPoses = rigidtform3d.empty;
for i = 1:size(sensorData,1)
    rot = quat2rotm(quaternion([sensorData.lidarHeading(i).w,sensorData.lidarHeading(i).x,sensorData.lidarHeading(i).y,sensorData.lidarHeading(i).z]));
    tran = [sensorData.lidarPosition(i).x sensorData.lidarPosition(i).y sensorData.lidarPosition(i).z];
    pose = rigidtform3d(rot,tran);
    lidarPoses(i,1) = pose;
end
posesAds = arrayDatastore(lidarPoses);

% Create an imageDatastore to store the front camera images.
imds = imageDatastore(sensorData.frontCameraImagesPath);

% Combine all the datastores into a combinedDatastore.
combinedDs = combine(pcFds,posesAds,imds);
end

helperDetectVehicles — Detect vehicles using a voxelRCNNObjectDetector object.

function vehicleDetections = helperDetectVehicles(lidarVehicleDetector,lidarData,params)
% This function takes the voxelRCNNObjectDetector object, lidar point cloud
% array, and detector parameters structure as input and returns the vehicle
% detections as output.

% Create a placeholder for detections.
vehicleDetections = struct;

% Set the point cloud region of interest.
roi = [params.xLimits params.yLimits -Inf Inf];

for i = 1:numel(lidarData)
    ptCloud = lidarData(i);

    % Crop the point cloud to the region of interest.
    indices = findPointsInROI(ptCloud,roi);
    ptCloud = select(ptCloud,indices,OutputSize="full");

    % Infer using the voxelRCNNObjectDetector detector to detect vehicles.
    [bboxes,scores,labels] = detect(lidarVehicleDetector,ptCloud,Threshold=params.threshold);

    vehicleDetections(i,1).Bboxes = bboxes;
    vehicleDetections(i,1).Scores = scores;
    vehicleDetections(i,1).Labels = labels;
end

% Create a detection results table.
vehicleDetections = struct2table(vehicleDetections);
end

helperAssembleDetections — Convert the detected vehicle bounding boxes to objectDetection objects, for use in the tracker.

function detections = helperAssembleDetections(bboxes,measNoise,timestamp)
% This function assembles bounding boxes as a cell array of objectDetection
% objects.
detections = {};
for i = 1:size(bboxes,1)
    % This example assumes that all vehicles are cars.
    classid = 1;
    
    % Remove the ego car from the detections.
    if(abs(bboxes(i,1))<=1 && abs(bboxes(i,2))<=1)
        % If x and y of bbox center <= 1 meters.
        continue
    end
    lidarModel = [bboxes(i,1:3) bboxes(i,end) bboxes(i,4:6)];
    
    detections{end+1} = objectDetection(timestamp, ...
        lidarModel',MeasurementNoise=measNoise, ...
        ObjectAttributes=struct("ClassID",classid)); %#ok<AGROW>
end
end

References

[1] Hesai and Scale. PandaSet. Accessed September 18, 2025. https://pandaset.org/. The PandaSet data set is provided under the CC-BY-4.0 license.

[2] Xiao, Pengchuan, Zhenlei Shao, Steven Hao, et al. “PandaSet: Advanced Sensor Suite Dataset for Autonomous Driving.” 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), IEEE, September 19, 2021, 3095–101. https://doi.org/10.1109/ITSC48978.2021.9565009.

See Also

| |

Topics