balanceBoxLabels
Syntax
locationSet = balanceBoxLabels(boxLabels,blockedImages,blockSize,numObservations)
locationSet = balanceBoxLabels(boxLabels,blockedImages,blockSize,numObservations,Name=Value)
Description
locationSet = balanceBoxLabels(boxLabels,blockedImages,blockSize,numObservations) balances bounding box labels, boxLabels, by oversampling blocks of images containing less frequent classes, contained in the collection of blocked image objects blockedImages. numObservations is the required number of block locations, and blockSize specifies the block size.
locationSet = balanceBoxLabels(boxLabels,blockedImages,blockSize,numObservations,Name=Value) specifies options using one or more name-value arguments in addition to any combination of arguments from previous syntaxes. For example, OverlapThreshold=0.5 specifies the overlap threshold between a bounding box and a cropping window that determines whether boxes are clipped or discarded.
Examples
Sample Block Sets to Use in Blocked Image Object Detection
Load box label data that contains boxes and labels for one image. Each box is 20-by-20 pixels.
d = load("balanceBoxLabelsData.mat");
boxLabels = d.BoxLabels;
Create a blocked image of size 500-by-500 pixels.
blockedImages = blockedImage(zeros([500 500]));
Choose the image size of each observation.
blockSize = [50 50];
Visualize using a histogram to identify any class imbalance in the box labels.
blds = boxLabelDatastore(boxLabels);
datasetCount = countEachLabel(blds);
figure
unbalancedLabels = datasetCount.Label;
unbalancedCount = datasetCount.Count;
h1 = histogram(Categories=unbalancedLabels,BinCounts=unbalancedCount);
title("Unbalanced Class Labels")
Measure the distribution of box labels. If the coefficient of variation is more than 1, then there is class imbalance.
cvBefore = std(datasetCount.Count)/mean(datasetCount.Count)
cvBefore = 1.5746
Choose a heuristic value for the number of observations by finding the mean of the counts of each class, multiplied by the number of classes.
numClasses = height(datasetCount);
numObservations = mean(datasetCount.Count) * numClasses;
Control the amount a box can be cut using OverlapThreshold. Using a lower threshold value cuts objects more at the border of a block. Increase this value to reduce the amount an object can be clipped at the border, at the expense of less-balanced box labels.
ThresholdValue = 0.5;
Balance boxLabels using the balanceBoxLabels function.
locationSet = balanceBoxLabels(boxLabels,blockedImages,blockSize, ...
numObservations,OverlapThreshold=ThresholdValue);
[==================================================] 100%
Elapsed time: 00:00:00
Estimated time remaining: 00:00:00
Balancing box labels complete.
Count the labels that are contained within the image blocks.
bldsBalanced = boxLabelDatastore(boxLabels,locationSet);
balancedDatasetCount = countEachLabel(bldsBalanced);
Overlay another histogram against the original label count to see if the box labels are balanced. If the histograms show that the labels are still not balanced, increase the value of numObservations.
hold on
balancedLabels = balancedDatasetCount.Label;
balancedCount = balancedDatasetCount.Count;
h2 = histogram(Categories=balancedLabels,BinCounts=balancedCount);
title(h2.Parent,"Balanced Class Labels (OverlapThreshold: " + ThresholdValue + ")")
legend(h2.Parent,["Before" "After"])
Measure the distribution of the new balanced box labels.
cvAfter = std(balancedCount)/mean(balancedCount)
cvAfter = 0.4588
Input Arguments
boxLabels
— Labeled bounding box data
table with two columns
Labeled bounding box data, specified as a table with two columns.
The first column contains either all rectangle or all rotated rectangle bounding boxes.
The second column must be a cell vector that contains the label names corresponding to each bounding box. Each element in the cell vector must be an M-by-1 categorical or string vector.
The table describes the format of the bounding boxes:
Bounding Box | Description |
---|---|
rectangle | Defined in spatial coordinates as an M-by-4 numeric matrix with rows of the form [x y w h]. |
rotated-rectangle | Defined in spatial coordinates as an M-by-5 numeric matrix with rows of the form [xctr yctr w h yaw]. |
To create a box label table from ground truth data:
1. Use the Image Labeler or Video Labeler app to label your ground truth. Export the labeled ground truth data to your workspace.
2. Create a bounding box label datastore using the objectDetectorTrainingData function. You can obtain boxLabels from the LabelData property of the box label datastore returned by objectDetectorTrainingData (blds.LabelData).
blockedImages
— Labeled blocked images
array of blockedImage objects
Labeled blocked images, specified as an array of blockedImage objects containing pixel label images.
blockSize
— Block size
two-element row vector of positive integers
Block size of read data, specified as a two-element row vector of positive integers, [numrows,numcols]. The first element specifies the number of rows in the block. The second element specifies the number of columns.
numObservations
— Number of block locations
positive integer
Number of block locations to return, specified as a positive integer.
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: OverlapThreshold=0.5 specifies the overlap threshold between a bounding box and a cropping window that determines whether boxes are clipped or discarded.
Levels
— Resolution level of each image
1 (default) | positive integer scalar | B-by-1 vector of positive integers
Resolution level of each image in the array of blockedImage objects, specified as a positive integer scalar or a B-by-1 vector of positive integers, where B is the length of the array of blockedImage objects.
OverlapThreshold
— Overlap threshold
1 (default) | scalar in the range [0,1]
Overlap threshold, specified as a scalar in the range [0,1]. When the overlap between a bounding box and a cropping window is greater than the threshold, boxes in the boxLabels input are clipped to the image block window border. When the overlap is less than the threshold, the boxes are discarded. When you lower the threshold, a larger part of an object can be cut off at the block border. To reduce the amount an object can be clipped at the border, increase the threshold. Increasing the threshold can also cause less-balanced box labels.
The amount of overlap between the bounding box and a cropping window is defined as area(bbox ∩ window) / area(bbox).
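The overlap test can be sketched in base MATLAB. This is only an illustration, assuming the overlap is the intersection area divided by the bounding box area and treating each [x y w h] rectangle as spanning continuous coordinates; the values of bbox, window, and threshold are hypothetical, and the toolbox may handle pixel edges differently.

```matlab
% Axis-aligned bounding box and cropping window in [x y w h] form.
% Illustrative values only; they are not taken from the example data.
bbox = [40 40 20 20];
window = [1 1 50 50];

% Width and height of the intersection, clamped at zero when the
% rectangles do not overlap. Each rectangle spans [x, x+w] by [y, y+h].
xOverlap = max(0, min(bbox(1)+bbox(3), window(1)+window(3)) - max(bbox(1), window(1)));
yOverlap = max(0, min(bbox(2)+bbox(4), window(2)+window(4)) - max(bbox(2), window(2)));

% Assumed definition: intersection area divided by bounding box area.
overlapRatio = (xOverlap * yOverlap) / (bbox(3) * bbox(4));

% A box is kept (and clipped) only when the overlap exceeds the threshold.
threshold = 0.5;
keepBox = overlapRatio > threshold;
```

With these values, only a corner of the box falls inside the window, so a threshold of 0.5 would discard it, while a lower threshold would keep a heavily clipped box.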
Verbose
— Display progress information
true or 1 (default) | false or 0
Display progress information, specified as a numeric or logical 1 (true) or 0 (false). Set this argument to true to display progress information.
Output Arguments
locationSet
— Balanced box labels
blockLocationSet object
Balanced box labels, returned as a blockLocationSet object. The object contains numObservations locations of balanced blocks, each of size blockSize.
Algorithms
Balancing Box Labels
To balance box labels, the function oversamples classes that are less represented in the blocked image or big image. The box labels are counted across the dataset and sorted based on each class count. Each image is split into several quadrants, based on the blockSize input value. The algorithm randomly picks several blocks within each quadrant with less-represented classes. The blocks without any objects are discarded. The balancing stops once the specified number of blocks are selected.
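The counting and sorting step can be illustrated with a small base MATLAB sketch. It is not the function's actual implementation: the class names and counts are hypothetical, and the inverse-frequency weighting is an assumed stand-in for however the function favors less-represented classes.

```matlab
% Hypothetical per-class box counts (illustrative values only).
classNames = ["car" "truck" "bike"];
classCounts = [80 15 5];

% Sort classes from least to most frequent.
[sortedCounts, order] = sort(classCounts, "ascend");
sortedNames = classNames(order);

% Assumed weighting: favor classes in inverse proportion to their count,
% then normalize so the weights sum to 1.
weights = 1 ./ sortedCounts;
weights = weights / sum(weights);
```

Under this assumed weighting, blocks containing the rarest class would be drawn most often, which is the general idea behind oversampling.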
Checking for Balance
You can check the success of balancing by comparing the histograms of label count before and after balancing. You can also check the coefficient of variation value. For best results, the value after balancing should be less than the value before balancing. For more information, see Coefficient of Variation on the National Institute of Standards and Technology (NIST) website.
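As a minimal sketch of this check, the coefficient of variation can be computed as the standard deviation of the class counts divided by their mean. The counts below are hypothetical and only illustrate the comparison.

```matlab
% Hypothetical class counts before and after balancing.
countsBefore = [120 10 6 4];
countsAfter = [40 32 36 30];

% Coefficient of variation: standard deviation divided by mean.
cvBefore = std(countsBefore) / mean(countsBefore);
cvAfter = std(countsAfter) / mean(countsAfter);

% Balancing helped if the spread relative to the mean decreased.
improved = cvAfter < cvBefore;
```

If improved is false, one option described in the example is to increase numObservations and balance again.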
Version History
Introduced in R2020a
R2021a: bigLabeledImages argument is not recommended
The bigLabeledImages argument, which supports bigimage objects, is not recommended. Use the blockedImages argument instead, which supports blockedImage objects. The blockedImage object offers several advantages, including extension to N-D processing, a simpler interface, and custom support for reading and writing nonstandard image formats.
Although there are no plans to remove the bigLabeledImages argument at this time, switch to the blockedImages argument to take advantage of the additional capabilities and flexibility.
To update your code, follow these steps:
1. Replace the bigimage object input with a blockedImage object input for the second argument of this function.
2. If you want to select blocks of any of the blocked images at a resolution level other than 1, then specify the Levels name-value argument. You can omit this argument when you want to select blocks from all blocked images at resolution level 1.
The table gives an example of how to update your code.
Discouraged Usage | Recommended Replacement |
---|---|
This example selects blocks at resolution level 1 from a bigimage object. | Here is equivalent code, replacing the input bigimage object with a blockedImage object. |
boxLabels = load('balanceBoxLabelsData.mat').BoxLabels; bim = bigimage(zeros([500,500])); blockSize = [50 50]; numObservations = 20; locationSet = balanceBoxLabels(boxLabels,bim,1,blockSize,numObservations); | boxLabels = load('balanceBoxLabelsData.mat').BoxLabels; bim = blockedImage(zeros([500,500])); blockSize = [50 50]; numObservations = 20; locationSet = balanceBoxLabels(boxLabels,bim,blockSize,numObservations); |