objectDetectorTrainingData
Create training data for an object detector
Syntax
Description
[imds,blds] = objectDetectorTrainingData(gTruth) creates an image datastore and a box label datastore of training data from the specified ground truth. The function selects only images from the ground truth input that contain one or more annotated objects.
You can combine the image and box label datastores using combine(imds,blds) to create the datastore needed for training. Use the combined datastore with the training functions, such as trainACFObjectDetector, trainYOLOv2ObjectDetector, trainYOLOv3ObjectDetector, and trainYOLOv4ObjectDetector.
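As a rough sketch of what the combined datastore yields, the snippet below reads a single sample. It assumes the stop sign ground truth data that ships with Computer Vision Toolbox (the same data used in the examples on this page). The variable names are illustrative, and the 1-by-3 sample layout is what the combined datastore is expected to return, not something this page states.

```matlab
% Make the shipped stop sign images reachable (same setup as the ACF example).
imageDir = fullfile(matlabroot,'toolbox','vision','visiondata','stopSignImages');
addpath(imageDir);

% Load the ground truth and keep only the stop sign labels.
data = load('stopSignsAndCarsGroundTruth.mat','stopSignsAndCarsGroundTruth');
gTruth = selectLabelsByName(data.stopSignsAndCarsGroundTruth,'stopSign');

% Create the two datastores and combine them.
[imds,blds] = objectDetectorTrainingData(gTruth);
cds = combine(imds,blds);

% Read one training sample: expected to be a cell array holding
% the image, its bounding boxes, and the box labels.
sample = read(cds);
I = sample{1};        % image
bboxes = sample{2};   % M-by-4 boxes in [x,y,width,height] format
labels = sample{3};   % label name for each box

% Clean up the path.
rmpath(imageDir);
```

Inspecting one sample this way is a quick check that the boxes and labels line up with the image before starting a long training run.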
This function supports parallel computing using multiple MATLAB® workers. Enable parallel computing using the Computer Vision Toolbox Preferences dialog.
trainingDataTable = objectDetectorTrainingData(gTruth) returns a table of training data from the specified ground truth. gTruth is an array of groundTruth objects. You can use the table to train an object detector using the Computer Vision Toolbox™ training functions.
[___] = objectDetectorTrainingData(gTruth,Name=Value) specifies options using one or more name-value arguments in addition to any combination of arguments from previous syntaxes. For example, Verbose=true displays progress at the MATLAB command line.
If you create the groundTruth objects in gTruth using a video file, a custom data source, or an imageDatastore object with different custom read functions, then you can specify any combination of name-value arguments. If you create the groundTruth objects from an image collection or image sequence data source, then you can specify only the SamplingFactor and the LabelType name-value arguments.
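A minimal sketch of a name-value call on an image-based data source, for which only SamplingFactor and LabelType apply. It assumes the shipped stop sign ground truth is backed by an image collection and that your release supports the Name=Value syntax (R2021a or later). The variable names are illustrative.

```matlab
% Same data setup as the ACF example on this page.
imageDir = fullfile(matlabroot,'toolbox','vision','visiondata','stopSignImages');
addpath(imageDir);
data = load('stopSignsAndCarsGroundTruth.mat','stopSignsAndCarsGroundTruth');
gTruth = selectLabelsByName(data.stopSignsAndCarsGroundTruth,'stopSign');

% Default sampling ("auto" uses a factor of 1 for a collection of images).
fullTable = objectDetectorTrainingData(gTruth);

% Keep every second image instead.
sampledTable = objectDetectorTrainingData(gTruth,SamplingFactor=2);

fprintf("Rows with default sampling: %d\n",height(fullTable));
fprintf("Rows with SamplingFactor=2: %d\n",height(sampledTable));

rmpath(imageDir);
```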
Examples
Train YOLO v2 Vehicle Detector
Train a vehicle detector based on a YOLO v2 network.
Add the folder containing training images to the MATLAB path.
imageDir = fullfile(matlabroot,"toolbox","vision","visiondata","vehicles");
addpath(imageDir);
Load the vehicle ground truth data.
data = load("vehicleTrainingGroundTruth.mat");
gTruth = data.vehicleTrainingGroundTruth;
Create an image datastore and box label datastore using the ground truth object.
[imds,bxds] = objectDetectorTrainingData(gTruth);
Combine the datastores.
cds = combine(imds,bxds);
Load the dlnetwork object, then create the YOLO v2 detector to train from the network, the class name, and the anchor boxes.
load("yolov2VehicleDetectorNet.mat","net");
classes = "vehicle";
aboxes = [8 8; 32 48; 40 24; 72 48];
detector = yolov2ObjectDetector(net,classes,aboxes);
Configure training options.
options = trainingOptions("sgdm", ...
    InitialLearnRate=0.001, ...
    Verbose=true, ...
    MiniBatchSize=16, ...
    MaxEpochs=30, ...
    Shuffle="every-epoch", ...
    VerboseFrequency=10);
Train the detector.
[detector,info] = trainYOLOv2ObjectDetector(cds,detector,options);
*************************************************************************
Training a YOLO v2 Object Detector for the following object classes:

* vehicle

Training on single CPU.
|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |     RMSE     |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:00 |         7.17 |         51.4 |          0.0010 |
|       1 |          10 |       00:00:03 |         1.61 |          2.6 |          0.0010 |
|       2 |          20 |       00:00:06 |         1.36 |          1.9 |          0.0010 |
|       2 |          30 |       00:00:10 |         1.23 |          1.5 |          0.0010 |
|       3 |          40 |       00:00:13 |         1.10 |          1.2 |          0.0010 |
|       3 |          50 |       00:00:16 |         1.09 |          1.2 |          0.0010 |
|       4 |          60 |       00:00:19 |         0.84 |          0.7 |          0.0010 |
|       4 |          70 |       00:00:23 |         0.87 |          0.8 |          0.0010 |
|       5 |          80 |       00:00:26 |         0.76 |          0.6 |          0.0010 |
|       5 |          90 |       00:00:29 |         0.80 |          0.6 |          0.0010 |
|       6 |         100 |       00:00:32 |         0.79 |          0.6 |          0.0010 |
|       7 |         110 |       00:00:35 |         0.58 |          0.3 |          0.0010 |
|       7 |         120 |       00:00:38 |         0.62 |          0.4 |          0.0010 |
|       8 |         130 |       00:00:41 |         0.64 |          0.4 |          0.0010 |
|       8 |         140 |       00:00:45 |         0.65 |          0.4 |          0.0010 |
|       9 |         150 |       00:00:48 |         0.57 |          0.3 |          0.0010 |
|       9 |         160 |       00:00:51 |         0.58 |          0.3 |          0.0010 |
|      10 |         170 |       00:00:54 |         0.62 |          0.4 |          0.0010 |
|      10 |         180 |       00:00:57 |         0.55 |          0.3 |          0.0010 |
|      11 |         190 |       00:01:00 |         0.60 |          0.4 |          0.0010 |
|      12 |         200 |       00:01:03 |         0.49 |          0.2 |          0.0010 |
|      12 |         210 |       00:01:06 |         0.56 |          0.3 |          0.0010 |
|      13 |         220 |       00:01:09 |         0.55 |          0.3 |          0.0010 |
|      13 |         230 |       00:01:12 |         0.48 |          0.2 |          0.0010 |
|      14 |         240 |       00:01:15 |         0.51 |          0.3 |          0.0010 |
|      14 |         250 |       00:01:19 |         0.48 |          0.2 |          0.0010 |
|      15 |         260 |       00:01:22 |         0.50 |          0.2 |          0.0010 |
|      15 |         270 |       00:01:25 |         0.59 |          0.4 |          0.0010 |
|      16 |         280 |       00:01:28 |         0.54 |          0.3 |          0.0010 |
|      17 |         290 |       00:01:31 |         0.48 |          0.2 |          0.0010 |
|      17 |         300 |       00:01:34 |         0.36 |          0.1 |          0.0010 |
|      18 |         310 |       00:01:37 |         0.50 |          0.2 |          0.0010 |
|      18 |         320 |       00:01:40 |         0.49 |          0.2 |          0.0010 |
|      19 |         330 |       00:01:44 |         0.44 |          0.2 |          0.0010 |
|      19 |         340 |       00:01:47 |         0.44 |          0.2 |          0.0010 |
|      20 |         350 |       00:01:50 |         0.44 |          0.2 |          0.0010 |
|      20 |         360 |       00:01:53 |         0.51 |          0.3 |          0.0010 |
|      21 |         370 |       00:01:56 |         0.48 |          0.2 |          0.0010 |
|      22 |         380 |       00:01:59 |         0.50 |          0.3 |          0.0010 |
|      22 |         390 |       00:02:02 |         0.52 |          0.3 |          0.0010 |
|      23 |         400 |       00:02:05 |         0.46 |          0.2 |          0.0010 |
|      23 |         410 |       00:02:09 |         0.37 |          0.1 |          0.0010 |
|      24 |         420 |       00:02:12 |         0.45 |          0.2 |          0.0010 |
|      24 |         430 |       00:02:15 |         0.39 |          0.2 |          0.0010 |
|      25 |         440 |       00:02:18 |         0.41 |          0.2 |          0.0010 |
|      25 |         450 |       00:02:21 |         0.36 |          0.1 |          0.0010 |
|      26 |         460 |       00:02:24 |         0.41 |          0.2 |          0.0010 |
|      27 |         470 |       00:02:28 |         0.48 |          0.2 |          0.0010 |
|      27 |         480 |       00:02:31 |         0.40 |          0.2 |          0.0010 |
|      28 |         490 |       00:02:34 |         0.44 |          0.2 |          0.0010 |
|      28 |         500 |       00:02:37 |         0.39 |          0.2 |          0.0010 |
|      29 |         510 |       00:02:40 |         0.25 |      6.1e-02 |          0.0010 |
|      29 |         520 |       00:02:43 |         0.33 |          0.1 |          0.0010 |
|      30 |         530 |       00:02:47 |         0.36 |          0.1 |          0.0010 |
|      30 |         540 |       00:02:50 |         0.33 |          0.1 |          0.0010 |
|========================================================================================|
Training finished: Max epochs completed.
Detector training complete.
*************************************************************************
Read a test image.
I = imread("detectcars.png");
Run the detector.
[bboxes,scores] = detect(detector,I);
Display the results.
if ~isempty(bboxes)
    I = insertObjectAnnotation(I,"rectangle",bboxes,scores);
end
figure
imshow(I)
Train ACF-Based Stop Sign Detector
Use training data to train an ACF-based object detector for stop signs.
Add the folder containing images to the MATLAB path.
imageDir = fullfile(matlabroot, 'toolbox', 'vision', 'visiondata', 'stopSignImages');
addpath(imageDir);
Load ground truth data, which contains data for stop signs and cars.
load('stopSignsAndCarsGroundTruth.mat','stopSignsAndCarsGroundTruth')
View the label definitions to see the label types in the ground truth.
stopSignsAndCarsGroundTruth.LabelDefinitions
ans=3×3 table
Name Type Group
____________ _________ ________
{'stopSign'} Rectangle {'None'}
{'carRear' } Rectangle {'None'}
{'carFront'} Rectangle {'None'}
Select the stop sign data for training.
stopSignGroundTruth = selectLabelsByName(stopSignsAndCarsGroundTruth,'stopSign');
Create the training data for a stop sign object detector.
trainingData = objectDetectorTrainingData(stopSignGroundTruth);
summary(trainingData)
trainingData: 41x2 table

Variables:

    imageFilename: cell array of character vectors
    stopSign: cell

Statistics for applicable variables:

                     NumMissing

    imageFilename        0
    stopSign             0
Train an ACF-based object detector.
acfDetector = trainACFObjectDetector(trainingData,'NegativeSamplesFactor',2);
ACF Object Detector Training
The training will take 4 stages. The model size is 34x31.
Sample positive examples(~100% Completed)
Compute approximation coefficients...Completed.
Compute aggregated channel features...Completed.
--------------------------------------------
Stage 1:
Sample negative examples(~100% Completed)
Compute aggregated channel features...Completed.
Train classifier with 42 positive examples and 84 negative examples...Completed.
The trained classifier has 19 weak learners.
--------------------------------------------
Stage 2:
Sample negative examples(~100% Completed)
Found 84 new negative examples for training.
Compute aggregated channel features...Completed.
Train classifier with 42 positive examples and 84 negative examples...Completed.
The trained classifier has 20 weak learners.
--------------------------------------------
Stage 3:
Sample negative examples(~100% Completed)
Found 84 new negative examples for training.
Compute aggregated channel features...Completed.
Train classifier with 42 positive examples and 84 negative examples...Completed.
The trained classifier has 54 weak learners.
--------------------------------------------
Stage 4:
Sample negative examples(~100% Completed)
Found 84 new negative examples for training.
Compute aggregated channel features...Completed.
Train classifier with 42 positive examples and 84 negative examples...Completed.
The trained classifier has 61 weak learners.
--------------------------------------------
ACF object detector training is completed. Elapsed time is 18.4629 seconds.
Test the ACF-based detector on a sample image.
I = imread('stopSignTest.jpg');
bboxes = detect(acfDetector,I);
Display the detected object.
annotation = acfDetector.ModelName;
I = insertObjectAnnotation(I,'rectangle',bboxes,annotation);
figure
imshow(I)
Remove the image folder from the path.
rmpath(imageDir);
Read all label attributes from groundTruth
Load image locations, label definitions and label data.
data = load('labelsWithAttributes.mat');
images = fullfile(matlabroot,'toolbox','vision','visiondata','stopSignImages', data.imageFilenames);
Create a ground truth object.
dataSource = groundTruthDataSource(images);
gTruth = groundTruth(dataSource, data.labeldefs, data.labelData);
Create an image datastore, box label datastore, and array datastore using the ground truth object.
[imds, blds, arrds] = objectDetectorTrainingData(gTruth);
Read all attributes.
readall(arrds)
ans=2×1 cell array
{1x1 struct}
{1x1 struct}
Input Arguments
gTruth — Ground truth data
scalar | array of groundTruth objects
Ground truth data, specified as a scalar or an array of groundTruth objects. You can create ground truth objects from existing ground truth data by using the groundTruth object.
If you use custom data sources in groundTruth with parallel computing enabled, then the reader function is expected to work with a pool of MATLAB workers to read images from the data source in parallel.
Note
If you specify gTruth as an array of groundTruth objects, all label definitions must have the same label names.
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: SamplingFactor=5 sets the subsampling factor to 5.
SamplingFactor — Factor for subsampling images
"auto" (default) | integer | vector of integers
Factor for subsampling images in the ground truth data source, specified as "auto", an integer, or a vector of integers. For a sampling factor of N, the returned training data includes every Nth image in the ground truth data source. The function ignores ground truth images with empty label data. To set the SamplingFactor with projected cuboid data, you must set the LabelType name-value argument to labelType.ProjectedCuboid.
Use sampled data to reduce repeated data, such as a sequence of images with the same scene and labels. It can also help in reducing training time.
Value | Sampling Factor |
---|---|
"auto" | The function samples data sources with timestamps, such as a video, with a factor of 5, and uses a factor of 1 for a collection of images. |
Integer | Manually set the sampling factor to apply to all data. |
Vector of integers | When you input an array of ground truth objects, the function uses the sampling factor specified by the corresponding vector element. |
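To make "every Nth image" concrete, here is a small index-only sketch in plain MATLAB. It does not call objectDetectorTrainingData. The image count, the label data, and the assumptions that sampling starts at the first image and happens before unlabeled images are dropped are all illustrative, not documented behavior.

```matlab
numImages = 23;   % hypothetical number of images in one data source
N = 5;            % sampling factor

% Every Nth image, assuming sampling starts at the first image.
sampledIdx = 1:N:numImages;

% Hypothetical label data: empty cells mark images without annotations.
labelData = repmat({[10 20 30 40]},numImages,1);
labelData([6 16]) = {[]};

% Drop sampled images whose label data is empty.
hasLabels = ~cellfun(@isempty,labelData(sampledIdx));
keptIdx = sampledIdx(hasLabels);
disp(keptIdx)
```

With these made-up values, images 1, 6, 11, 16, and 21 are sampled, and images 6 and 16 are then dropped because they carry no labels.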
LabelType — Type of label to extract from ground truth data
"labelType.Rectangle" (default) | "labelType.RotatedRectangle" | "labelType.ProjectedCuboid" | character vector
Type of label to extract from ground truth data, specified as "labelType.Rectangle", "labelType.RotatedRectangle", or "labelType.ProjectedCuboid". Use the type of label consistent with the type of object detector you want to train.
Note
The trainYOLOv2ObjectDetector function does not support "labelType.RotatedRectangle".
WriteLocation — Name of folder
pwd (current working folder) (default) | string scalar | character vector
Folder name to write extracted images to, specified as a string scalar or character vector. The specified folder must exist and have write permissions.
This argument applies only for:
- groundTruth objects created using a video file or a custom data source.
- An array of groundTruth objects created using imageDatastore, with different custom read functions.
The function ignores this argument when:
- The input groundTruth object was created from an image sequence data source.
- The array of input groundTruth objects all contain image datastores using the same custom read function.
- Any of the input groundTruth objects that contain datastores use the default read functions.
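Because the WriteLocation folder must already exist, it can help to create it before the call. The sketch below uses a hypothetical folder name under the system temporary directory. The final call is left as a comment because it needs a video-based groundTruth object, named gTruthVideo here purely as a placeholder.

```matlab
% Hypothetical output folder under the system temporary directory.
outDir = fullfile(tempdir,"extractedFrames");

% WriteLocation must already exist, so create the folder when needed.
if ~isfolder(outDir)
    mkdir(outDir);
end

% With a video-based groundTruth object (gTruthVideo is a placeholder name),
% the call would then look like this:
% [imds,blds] = objectDetectorTrainingData(gTruthVideo,WriteLocation=outDir);
```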
ImageFormat — Image file format
"PNG" (default) | string scalar | character vector
Image file format, specified as a string scalar or character vector. File formats must be supported by imwrite.
This argument applies only for:
- groundTruth objects created using a video file or a custom data source.
- An array of groundTruth objects created using imageDatastore with different custom read functions.
The function ignores this argument when:
- The input groundTruth object was created from an image sequence data source.
- The array of input groundTruth objects all contain image datastores using the same custom read function.
- Any of the input groundTruth objects that contain datastores use the default read functions.
NamePrefix — Prefix for output image file names
string scalar | character vector
Prefix for output image file names, specified as a string scalar or character vector. The image files are named as:
<name_prefix><source_number>_<image_number>.<image_format>
The default value uses the name of the data source that the images were extracted from, strcat(sourceName,"_"), for video and a custom data source, or "datastore" for an image datastore.
This argument applies only for:
- groundTruth objects created using a video file or a custom data source.
- An array of groundTruth objects created using imageDatastore with different custom read functions.
The function ignores this argument when:
- The input groundTruth object was created from an image sequence data source.
- The array of input groundTruth objects all contain image datastores using the same custom read function.
- Any of the input groundTruth objects that contain datastores use the default read functions.
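The NamePrefix naming pattern can be illustrated with sprintf. This is only a sketch of the pattern as written on this page. The prefix, numbers, and extension are hypothetical, and whether the function zero-pads the numbers or changes the case of the extension is not stated here.

```matlab
namePrefix = "traffic_";   % hypothetical prefix
sourceNumber = 1;          % hypothetical index of the data source
imageNumber = 25;          % hypothetical index of the extracted image
imageFormat = "png";       % hypothetical file extension

% <name_prefix><source_number>_<image_number>.<image_format>
fileName = sprintf("%s%d_%d.%s",namePrefix,sourceNumber,imageNumber,imageFormat);
disp(fileName)
```

For these values the sketch prints traffic_1_25.png.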
Verbose — Flag to display training progress
true (1) (default) | false (0)
Flag to display training progress at the MATLAB command line, specified as either true (1) or false (0). This argument applies only for groundTruth objects created using a video file or a custom data source.
Output Arguments
imds — Image datastore
imageDatastore object
Image datastore, returned as an imageDatastore object containing images extracted from the gTruth objects. The images in imds contain at least one class of annotated labels. The function ignores images that are not annotated.
blds — Box label datastore
boxLabelDatastore object
Box label datastore, returned as a boxLabelDatastore object. The datastore contains categorical vectors for ROI label names and M-by-4 matrices of M bounding boxes. The locations and sizes of the bounding boxes are stored as double values, one box per row, in the format [x,y,width,height].
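As a small illustration of the [x,y,width,height] layout, the sketch below splits a hypothetical M-by-4 box matrix into its columns and derives the box areas and approximate centers. The box values are made up for the example.

```matlab
% Two hypothetical boxes, one per row, in [x,y,width,height] format.
bboxes = [10 20 30 40;
          50 60 20 20];

x = bboxes(:,1);
y = bboxes(:,2);
w = bboxes(:,3);
h = bboxes(:,4);

areas = w .* h;                 % box areas in pixels
centers = [x + w/2, y + h/2];   % approximate box centers
disp(areas')
```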
arrds — Array datastore
struct array
Array datastore, returned as a struct array. The fields of the struct contain the attributes and sublabel names for the corresponding labels in the box label datastore blds. The sublabel data is packaged into the struct with a Position field along with the fields that correspond to the sublabel attributes.
trainingDataTable — Training data table
table
Training data table, returned as a table with two or more columns. The first column of the table contains image file names with paths. The images can be grayscale or truecolor (RGB) and in any format supported by imread. Each of the remaining columns corresponds to an ROI label and contains the locations of bounding boxes in the image (specified in the first column) for that label. The bounding boxes are specified as M-by-4 matrices of M bounding boxes in the format [x,y,width,height]. [x,y] specifies the upper-left corner location. To create a ground truth table, you can use the Image Labeler app or Video Labeler app.
The output table ignores any sublabel or attribute data present in the input gTruth object.
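To show the shape of such a table without running the function, the sketch below builds a two-row table by hand. The file names and box values are hypothetical. Only the layout (file names in the first column, one cell of M-by-4 boxes per image in each label column) follows the description above.

```matlab
% Hypothetical file names and stop sign boxes, one cell per image.
imageFilename = {'image1.png'; 'image2.png'};
stopSign = {[10 20 30 40]; [5 5 10 10; 50 60 20 20]};

% First column: image file names. Second column: boxes for the stopSign label.
trainingData = table(imageFilename,stopSign);

% Number of boxes for each image.
numBoxes = cellfun(@(b) size(b,1),trainingData.stopSign);
disp(numBoxes')
```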
Version History
Introduced in R2017a
R2022b: Project cuboids from 3-D world coordinates to 2-D image coordinates
Updated to support 3-D projected cuboid labels.
Returns extracted attributes and sublabels as a third output. The attributes and sublabels are packaged as an array datastore.
See Also
Apps
Functions
trainACFObjectDetector | trainYOLOv2ObjectDetector | trainYOLOv3ObjectDetector | trainYOLOv4ObjectDetector | estimateAnchorBoxes
Objects
Topics
- Datastores for Deep Learning (Deep Learning Toolbox)
- Training Data for Object Detection and Semantic Segmentation