Object Detection in Point Clouds Using Deep Learning

3-D object detection is important in autonomous navigation, robotics, medicine, remote sensing, and augmented reality applications. Though point clouds provide rich 3-D information, object detection in point clouds is a challenging task due to the sparse and unstructured nature of the data.

Using deep neural networks to detect objects in a point cloud provides fast and accurate results. A 3-D object detector network takes point cloud data as an input and produces 3-D bounding boxes around each detected object.
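To make the notion of a 3-D bounding box concrete, the following is a minimal sketch in plain Python, not toolbox code. It assumes a box is described by its center, its dimensions, and a yaw angle about the z-axis (the tuple layout and the function name points_in_box are illustrative choices), and it selects the points that fall inside such a box.

```python
import math

def points_in_box(points, box):
    """Return the (x, y, z) points that lie inside a yaw-rotated 3-D box.

    box = (cx, cy, cz, length, width, height, yaw_deg) is an assumed layout:
    center, size along the box's own x/y/z axes, and rotation about z.
    """
    cx, cy, cz, length, width, height, yaw_deg = box
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    inside = []
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        # Rotate the offset by -yaw to express it in the box's own frame.
        local_x = c * dx + s * dy
        local_y = -s * dx + c * dy
        if (abs(local_x) <= length / 2
                and abs(local_y) <= width / 2
                and abs(dz) <= height / 2):
            inside.append((x, y, z))
    return inside

# A 4 x 2 x 2 box at the origin, rotated 90 degrees so its long side lies along y.
box = (0.0, 0.0, 0.0, 4.0, 2.0, 2.0, 90.0)
cloud = [(0.0, 1.5, 0.0), (1.5, 0.0, 0.0), (0.0, 0.0, 2.0)]
selected = points_in_box(cloud, box)
```

Only the first point is selected here: the second lies outside the rotated footprint, and the third is above the box.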

These are a few popular object detection methods, grouped by network input.

- Input different point cloud views, such as bird's-eye view (BEV), front view, or image view, to a network and regress 3-D bounding boxes. You can also fuse features from different views for more accurate detections.
- Convert point cloud data into a more structured representation, such as pillars or voxels, and then apply a 3-D convolutional neural network to obtain bounding boxes. PointPillars and VoxelNet are widely used networks that follow this method.
  - VoxelNet converts the point cloud data into equally spaced voxels and encodes the features within each voxel into a 4-D tensor. It then obtains the detection results by using a region proposal network.
  - The PointPillars network uses PointNets to learn the features of the point cloud organized into vertical pillars. The network then encodes these features as pseudo images to predict bounding boxes by using a 2-D object detection pipeline. For more information, see Get Started with PointPillars.
- Preprocess point cloud data to derive a 2-D representation, and then use a 2-D CNN to obtain 2-D bounding boxes. Then, project these 2-D boxes onto the point cloud data to obtain 3-D detection results.
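The structured representations mentioned above can be illustrated with a small, toolbox-independent sketch in plain Python. It only shows the grouping step, under the assumption that cells are equally spaced and indexed from the origin; the function name group_points and the cell sizes are hypothetical, and the feature encoding done by VoxelNet or PointPillars is not shown.

```python
import math
from collections import defaultdict

def group_points(points, cell_size, use_height=True):
    """Group (x, y, z) points into equally spaced grid cells.

    use_height=True  -> voxels, keyed by (ix, iy, iz)
    use_height=False -> vertical pillars, keyed by (ix, iy) only
    """
    size_x, size_y, size_z = cell_size
    cells = defaultdict(list)
    for x, y, z in points:
        ix = math.floor(x / size_x)
        iy = math.floor(y / size_y)
        if use_height:
            key = (ix, iy, math.floor(z / size_z))
        else:
            key = (ix, iy)  # a pillar spans the full height
        cells[key].append((x, y, z))
    return dict(cells)

cloud = [(0.1, 0.2, 0.0), (0.3, 0.1, 1.5), (1.2, 0.4, 0.2), (-0.5, 0.9, 0.1)]
voxels = group_points(cloud, (1.0, 1.0, 1.0), use_height=True)
pillars = group_points(cloud, (1.0, 1.0, 1.0), use_height=False)
```

In this example, the first two points share a pillar but fall into different voxels because they differ in height, so the same cloud yields four voxels and three pillars.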

Deep learning-based object detection in lidar point clouds.

Create Training Data for Object Detection

Training the network on a large labeled dataset helps the detector generalize and produce more accurate detection results.

Use the Lidar Labeler app to interactively label point clouds and export label data for training. You can label cuboids, lines, and voxel regions inside a point cloud using the app. You can also add scene labels for point classification. For more information, see Get Started with the Lidar Labeler.

Augment and Preprocess Data

Data augmentation adds variety to limited datasets. You can transform point clouds by translating and rotating them, and by adding new bounding boxes to the point cloud. Each transformation produces a distinct point cloud for training. For more details, see Data Augmentations for Lidar Object Detection Using Deep Learning.
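As an illustration of the rotation and translation augmentations, here is a minimal sketch in plain Python rather than toolbox code. It assumes boxes are stored as center, dimensions, and a yaw angle in degrees (an illustrative layout), and it applies the same rigid transform to the points and to the box labels so that they stay aligned. The function name rotate_translate is hypothetical.

```python
import math

def rotate_translate(points, boxes, yaw_deg, shift):
    """Rotate points and boxes about the z-axis, then translate them.

    points: list of (x, y, z)
    boxes:  list of (cx, cy, cz, length, width, height, box_yaw_deg)
    """
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    dx, dy, dz = shift

    def move(x, y, z):
        return (c * x - s * y + dx, s * x + c * y + dy, z + dz)

    new_points = [move(x, y, z) for x, y, z in points]
    new_boxes = []
    for cx, cy, cz, length, width, height, box_yaw in boxes:
        nx, ny, nz = move(cx, cy, cz)
        # Box size is unchanged; only its center and heading move.
        new_boxes.append((nx, ny, nz, length, width, height, box_yaw + yaw_deg))
    return new_points, new_boxes

cloud = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.5)]
labels = [(1.5, 0.0, 0.25, 2.0, 1.0, 1.0, 0.0)]
aug_cloud, aug_labels = rotate_translate(cloud, labels, 90.0, (0.0, 0.0, 1.0))
```

In practice, the rotation angle and the shift are usually drawn at random for each training sample.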

To convert unorganized point clouds into an organized format, use the pcorganize function. For more information, see the Unorganized to Organized Conversion of Point Clouds Using Spherical Projection example.
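The idea behind spherical projection can be sketched in plain Python as follows. This is a simplified illustration and not the algorithm used by pcorganize: the grid size, the elevation limits, and the function name spherical_project are assumptions, and when two points land in the same cell the later one simply overwrites the earlier one.

```python
import math

def spherical_project(points, n_rows, n_cols, elev_min_deg, elev_max_deg):
    """Place (x, y, z) points into an n_rows-by-n_cols grid.

    Rows correspond to elevation angle (top row = highest elevation) and
    columns to azimuth angle. Empty cells hold None.
    """
    grid = [[None] * n_cols for _ in range(n_rows)]
    elev_span = elev_max_deg - elev_min_deg
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # the sensor origin has no direction
        azimuth = math.degrees(math.atan2(y, x))      # -180 to 180
        elevation = math.degrees(math.asin(z / r))    # -90 to 90
        if not (elev_min_deg <= elevation <= elev_max_deg):
            continue  # outside the assumed vertical field of view
        col = int((azimuth + 180.0) / 360.0 * n_cols) % n_cols
        row = min(int((elev_max_deg - elevation) / elev_span * n_rows), n_rows - 1)
        grid[row][col] = (x, y, z)
    return grid

cloud = [(1.0, 0.0, 0.0), (-1.0, 0.5, 0.0), (1.0, 0.0, -1.0)]
grid = spherical_project(cloud, n_rows=4, n_cols=8,
                         elev_min_deg=-60.0, elev_max_deg=60.0)
```

The resulting grid has a fixed row and column layout, which is what lets 2-D image-style networks consume the data.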

When your network input is 2-D, you can use the ImageDatastore, PixelLabelDatastore, and boxLabelDatastore objects to divide and store the training and the test data. To store point clouds, use the fileDatastore object.

For aerial lidar data, use the blockedPointCloud and blockedPointCloudDatastore objects to store and process point cloud data as blocks.

For more information, see:

Preprocess Data for Domain-Specific Deep Learning Applications (Deep Learning Toolbox)
Datastores for Deep Learning (Deep Learning Toolbox)

Create Object Detection Network

Define your network based on the network input and the layers. For a list of supported layers and how to create them, see the List of Deep Learning Layers (Deep Learning Toolbox). To visualize the network architecture, use the analyzeNetwork (Deep Learning Toolbox) function.

You can also design a network layer-by-layer interactively using the Deep Network Designer (Deep Learning Toolbox).

Use the pointPillarsObjectDetector object to create a PointPillars object detector network.

Train Object Detector Network

To specify the training options, use the trainingOptions (Deep Learning Toolbox) function. To train the network, use the trainNetwork (Deep Learning Toolbox) function.

Use the trainPointPillarsObjectDetector function to train a PointPillars network.

Detect Objects in Point Clouds Using Deep Learning Detectors and Pretrained Models

Use the detect function to detect objects using a PointPillars network.

To evaluate the detection results, use the evaluateObjectDetection and bboxOverlapRatio functions.
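Evaluation of this kind typically relies on an overlap measure between predicted and ground truth boxes. As a minimal sketch in plain Python (not the toolbox implementation), the following computes the intersection over union of two axis-aligned 2-D boxes given as (x, y, width, height). The function name overlap_ratio is illustrative, and rotated boxes are not handled.

```python
def overlap_ratio(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

predicted = (0.0, 0.0, 2.0, 2.0)
ground_truth = (1.0, 1.0, 2.0, 2.0)
iou = overlap_ratio(predicted, ground_truth)
```

A detection is commonly counted as correct when this ratio exceeds a chosen threshold.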

Lidar Toolbox™ provides pretrained object detection models for PointPillars and Complex YOLOv4 networks.

Code Generation

To learn how to generate CUDA® code for an object detection workflow, see these examples:

Code Generation for Lidar Object Detection Using SqueezeSegV2 Network

Code Generation for Lidar Object Detection Using PointPillars Deep Learning