
Using ExhaustiveSearcher Objects

Exhaustive nearest neighbors searcher

ExhaustiveSearcher model objects store statistics and options for an exhaustive nearest neighbors search. Statistics and options that you can store include the training data, the distance metric, and the parameter values of the distance metric. The exhaustive search algorithm finds the distance from each query observation to all n observations in the training data, which is an n-by-K numeric matrix.

Once you create an ExhaustiveSearcher model object, find neighboring points in the training data to the query data by performing a nearest neighbors search using knnsearch or a radius search using rangesearch. The exhaustive search algorithm is more efficient than the Kd-tree algorithm when K is large (i.e., K ≥ 10), and it is more flexible than the Kd-tree algorithm with respect to distance metric choices. The algorithm also supports sparse data.
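For reference, here is a minimal sketch of the workflow, where X and Y stand in for numeric training and query matrices with the same number of columns and the radius 0.5 is arbitrary:

Mdl = ExhaustiveSearcher(X);         % store the training data and the default Euclidean metric
Idx = knnsearch(Mdl,Y,'K',1);        % index of the nearest training observation to each query row
[idxR,D] = rangesearch(Mdl,Y,0.5);   % all training observations within radius 0.5 of each query row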

Examples


Train Default Exhaustive Nearest Neighbors Searcher

Load Fisher's iris data set.

load fisheriris
X = meas;
[n,k] = size(X)
n =

   150


k =

     4

X has 150 observations and 4 predictors.

Prepare an exhaustive nearest neighbors searcher using the entire data set as training data.

Mdl = ExhaustiveSearcher(X)
Mdl = 

  ExhaustiveSearcher with properties:

         Distance: 'euclidean'
    DistParameter: []
                X: [150x4 double]

Mdl is an ExhaustiveSearcher model object, and its properties appear in the Command Window. It contains information about the trained algorithm, such as the distance metric. You can alter property values using dot notation.

To search X for the nearest neighbors to a batch of query data, pass Mdl and the query data to knnsearch or rangesearch.
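For illustration, a brief sketch that reuses a few training rows as query points (the queries and the choice of K = 5 are arbitrary):

Y = X(1:3,:);                   % three query points taken from the training data
Idx = knnsearch(Mdl,Y,'K',5);   % Idx(j,:) lists the 5 training rows nearest to Y(j,:)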

Alter Properties of ExhaustiveSearcher Model

Load Fisher's iris data set.

load fisheriris
X = meas;

Train a default exhaustive searcher algorithm using the entire data set as training data.

Mdl = ExhaustiveSearcher(X)
Mdl = 

  ExhaustiveSearcher with properties:

         Distance: 'euclidean'
    DistParameter: []
                X: [150x4 double]

Specify that the neighbor searcher use the Mahalanobis metric to compute the distances between the training and query data.

Mdl.Distance = 'mahalanobis'
Mdl = 

  ExhaustiveSearcher with properties:

         Distance: 'mahalanobis'
    DistParameter: [4x4 double]
                X: [150x4 double]

Pass Mdl and the query data to either knnsearch or rangesearch to find the nearest neighbors to the points in the query data using the Mahalanobis distance.
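For instance, a short sketch (the query points here simply reuse five training rows, and K = 3 is arbitrary):

Y = X(1:5,:);                        % five query points taken from the training data
[Idx,D] = knnsearch(Mdl,Y,'K',3);    % D contains Mahalanobis distances because Mdl.Distance is 'mahalanobis'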

Search for Nearest Neighbors of Query Data Using the Mahalanobis Distance

Load Fisher's iris data set.

load fisheriris

Remove five irises randomly from the predictor data to use as a query set.

rng(1);                     % For reproducibility
n = size(meas,1);           % Sample size
qIdx = randsample(n,5);     % Indices of query data
X = meas(~ismember(1:n,qIdx),:);
Y = meas(qIdx,:);

Prepare an exhaustive nearest neighbors searcher using the training data. Specify to use the Mahalanobis distance for finding nearest neighbors later.

Mdl = createns(X,'NSMethod','exhaustive','Distance','mahalanobis')
Mdl = 

  ExhaustiveSearcher with properties:

         Distance: 'mahalanobis'
    DistParameter: [4x4 double]
                X: [145x4 double]

Mdl is an ExhaustiveSearcher model object. By default, the Mahalanobis metric parameter value is the estimated covariance matrix of the predictors (columns) in the training data. To display this value, use Mdl.DistParameter.

Mdl.DistParameter
ans =

    0.6819   -0.0332    1.2526    0.5103
   -0.0332    0.1859   -0.3152   -0.1183
    1.2526   -0.3152    3.0638    1.2816
    0.5103   -0.1183    1.2816    0.5786

Find the indices of the training data (Mdl.X) that are the two nearest neighbors of each point in the query data (Y).

IdxNN = knnsearch(Mdl,Y,'K',2)
IdxNN =

    26    38
     6    21
     1    34
    84    76
    69   129

Each row of IdxNN corresponds to a query data observation. The column order corresponds to the order of the nearest neighbors with respect to ascending distance. For example, using the Mahalanobis metric, the second nearest neighbor of Y(3,:) is X(34,:).
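As a follow-up sketch using the same Mdl and Y, you can also find every training observation within a fixed Mahalanobis radius of each query point (the radius 1.5 is arbitrary):

r = 1.5;                          % hypothetical search radius
[idx,D] = rangesearch(Mdl,Y,r);   % idx{j} lists rows of Mdl.X within radius r of Y(j,:), sorted by ascending distance
                                  % D{j} contains the corresponding Mahalanobis distances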

Properties


Distance - Distance metric
'cityblock' | 'euclidean' | 'mahalanobis' | 'minkowski' | 'seuclidean' | custom distance function | ...

Distance metric used to find nearest neighbors of query points, specified as a string or function handle.

This table describes the supported distance metrics specified by strings.

Value           Description
'chebychev'     Chebychev distance (maximum coordinate difference)
'cityblock'     City block distance
'correlation'   One minus the sample linear correlation between observations (treated as sequences of values)
'cosine'        One minus the cosine of the included angle between observations (row vectors)
'euclidean'     Euclidean distance
'hamming'       Hamming distance, which is the percentage of coordinates that differ
'jaccard'       One minus the Jaccard coefficient, which is the percentage of nonzero coordinates that differ
'mahalanobis'   Mahalanobis distance
'minkowski'     Minkowski distance
'seuclidean'    Standardized Euclidean distance
'spearman'      One minus the sample Spearman's rank correlation between observations (treated as sequences of values)

For more details, see Distance Metrics.

You can specify a function handle for a custom distance metric using @ (for example, @distfun). A custom distance function must:

  • Have the form function D2 = distfun(ZI, ZJ)

  • Take as arguments:

    • A 1-by-n vector ZI containing a single row from X or from the query points Y

    • An m-by-n matrix ZJ containing multiple rows of X or Y

  • Return an m-by-1 vector of distances D2, whose jth element is the distance between the observations ZI and ZJ(j,:)

The software does not use the distance metric for training the exhaustive searcher algorithm. Therefore, you can alter it after training by specifying a supported string or function handle for a custom function using dot notation. For example, to specify the Mahalanobis distance, enter Mdl.Distance = 'mahalanobis'.
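A minimal sketch of a custom distance function that satisfies these requirements (the name distfun, the weight vector, and the assumption of four predictors are all illustrative):

function D2 = distfun(ZI,ZJ)
% Weighted Euclidean distance between the single observation ZI (1-by-n)
% and each row of ZJ (m-by-n). Returns an m-by-1 vector of distances.
% The weights are hypothetical and assume n = 4 predictors.
w = [1 1 2 2];
D2 = sqrt(bsxfun(@minus,ZJ,ZI).^2 * w(:));
end

Save the function in a file named distfun.m on the MATLAB path, and then either assign it with dot notation (Mdl.Distance = @distfun) or pass 'Distance',@distfun when you create the searcher.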

Data Types: char | function_handle

DistParameter - Distance metric parameter values
[] | positive scalar | positive numeric vector | positive definite matrix

Distance metric parameter values, specified as empty ([]), a positive scalar, a positive numeric vector, or a positive definite matrix, depending on the value of Distance.

This table describes the distance parameters of the supported distance metrics.

'mahalanobis'
    A positive definite matrix representing the covariance matrix used to compute the Mahalanobis distances. By default, the software sets the covariance using nancov(Mdl.X). You can alter this parameter using dot notation, e.g., Mdl.DistParameter = CovNew, where CovNew is a K-by-K positive definite numeric matrix.

'minkowski'
    A positive scalar indicating the exponent of the Minkowski distance. By default, the exponent is 2.

'seuclidean'
    A positive numeric vector indicating the values that the software uses to scale the predictors when computing the standardized Euclidean distances. By default, the software:

      1. Estimates the standard deviation of each predictor (column) of X using scale = nanstd(Mdl.X).

      2. Scales each coordinate difference between the rows in X and the query matrix by dividing by the corresponding element of scale.

    You can alter the scale parameter using dot notation, e.g., Mdl.DistParameter = sNew, where sNew is a K-dimensional positive numeric vector.

If Mdl.Distance is not one of the distance metrics listed in this table, then Mdl.DistParameter is [], which means that the specified distance metric formula has no parameters.
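For example, a short sketch of inspecting and changing the parameter value after training (the exponent 4 is arbitrary):

load fisheriris
Mdl = ExhaustiveSearcher(meas,'Distance','minkowski');
Mdl.DistParameter       % returns 2, the default Minkowski exponent
Mdl.DistParameter = 4;  % use the degree-4 Minkowski distance instead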

Data Types: single | double

X - Training data
numeric matrix

This property is read-only.

Training data that prepares the exhaustive searcher algorithm, specified as a numeric matrix. X has n rows, each corresponding to an observation (i.e., instance or example), and K columns, each corresponding to a predictor or feature.

Data Types: single | double

Object Functions

knnsearch k-nearest neighbors search using Kd-tree or exhaustive search
rangesearch Find all neighbors within specified distance using exhaustive search or Kd-tree

Create Object

Create an ExhaustiveSearcher model object using ExhaustiveSearcher or createns.
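For instance, both calls in this sketch create equivalent exhaustive searchers from the iris measurements (the 'cityblock' metric is an arbitrary choice):

load fisheriris
Mdl1 = ExhaustiveSearcher(meas,'Distance','cityblock');
Mdl2 = createns(meas,'NSMethod','exhaustive','Distance','cityblock');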
