Using ExhaustiveSearcher Objects

Exhaustive nearest neighbors searcher

`ExhaustiveSearcher` model objects store statistics and options for an exhaustive nearest neighbors search. The stored statistics and options include the training data, the distance metric, and the parameter values of the distance metric. The exhaustive search algorithm computes the distance from each query observation to all n observations in the training data, which is an n-by-K numeric matrix.

After you create an `ExhaustiveSearcher` model object, find the points in the training data that are nearest to the query data by performing a nearest neighbors search using `knnsearch` or a radius search using `rangesearch`. The exhaustive search algorithm is more efficient than the Kd-tree algorithm when K is large (that is, K ≥ 10), and it is more flexible than the Kd-tree algorithm in its choice of distance metrics. The algorithm also supports sparse data.

Examples


Train Default Exhaustive Nearest Neighbors Searcher

Load Fisher's iris data set.

```
load fisheriris
X = meas;
[n,k] = size(X)
```
```
n = 150

k = 4
```

`X` has 150 observations and 4 predictors.

Prepare an exhaustive nearest neighbors searcher using the entire data set as training data.

```
Mdl = ExhaustiveSearcher(X)
```
```
Mdl =

  ExhaustiveSearcher with properties:

         Distance: 'euclidean'
    DistParameter: []
                X: [150x4 double]
```

`Mdl` is an `ExhaustiveSearcher` model object, and its properties appear in the Command Window. It contains information about the trained algorithm, such as the distance metric. You can alter property values using dot notation.

To search `X` for the nearest neighbors to a batch of query data, pass `Mdl` and the query data to `knnsearch` or `rangesearch`.
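For instance, a minimal sketch of a nearest neighbors query against `Mdl` (the query point `q` and the choice of `'K'` below are illustrative, not part of the original example):

```
% Find the 3 training observations nearest to one illustrative query point.
q = [5.0 3.4 1.5 0.2];            % one observation with 4 predictors
[idx,d] = knnsearch(Mdl,q,'K',3); % idx: row indices into Mdl.X
                                  % d: corresponding Euclidean distances
```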

Alter Properties of `ExhaustiveSearcher` Model

Load Fisher's iris data set.

```
load fisheriris
X = meas;
```

Train a default exhaustive searcher algorithm using the entire data set as training data.

```
Mdl = ExhaustiveSearcher(X)
```
```
Mdl =

  ExhaustiveSearcher with properties:

         Distance: 'euclidean'
    DistParameter: []
                X: [150x4 double]
```

Specify that the neighbor searcher use the Mahalanobis metric to compute the distances between the training and query data.

```
Mdl.Distance = 'mahalanobis'
```
```
Mdl =

  ExhaustiveSearcher with properties:

         Distance: 'mahalanobis'
    DistParameter: [4x4 double]
                X: [150x4 double]
```

Pass `Mdl` and the query data to either `knnsearch` or `rangesearch` to find the nearest neighbors to the points in the query data using the Mahalanobis distance.
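As a sketch of the radius-search case, assuming a query matrix `Y` with four columns (the radius value here is an illustrative choice):

```
% Find all training observations within Mahalanobis distance 2 of each
% query point. Y and the radius r are illustrative.
r = 2;
[idx,d] = rangesearch(Mdl,Y,r); % idx{i}: indices of neighbors of Y(i,:)
                                % d{i}: their distances, in ascending order
```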

Search for Nearest Neighbors of Query Data Using the Mahalanobis Distance

Load Fisher's iris data set.

```
load fisheriris
```

Remove five irises randomly from the predictor data to use as a query set.

```
rng(1);                          % For reproducibility
n = size(meas,1);                % Sample size
qIdx = randsample(n,5);          % Indices of query data
X = meas(~ismember(1:n,qIdx),:);
Y = meas(qIdx,:);
```

Prepare an exhaustive nearest neighbors searcher using the training data. Specify to use the Mahalanobis distance for finding nearest neighbors later.

```
Mdl = createns(X,'NSMethod','exhaustive','Distance','mahalanobis')
```
```
Mdl =

  ExhaustiveSearcher with properties:

         Distance: 'mahalanobis'
    DistParameter: [4x4 double]
                X: [145x4 double]
```

`Mdl` is an `ExhaustiveSearcher` model object. By default, the Mahalanobis metric parameter value is the estimated covariance matrix of the predictors (columns) in the training data. To display this value, use `Mdl.DistParameter`.

```
Mdl.DistParameter
```
```
ans =

    0.6819   -0.0332    1.2526    0.5103
   -0.0332    0.1859   -0.3152   -0.1183
    1.2526   -0.3152    3.0638    1.2816
    0.5103   -0.1183    1.2816    0.5786
```

Find the indices of the training data (`Mdl.X`) that are the two nearest neighbors of each point in the query data (`Y`).

```
IdxNN = knnsearch(Mdl,Y,'K',2)
```
```
IdxNN =

    26    38
     6    21
     1    34
    84    76
    69   129
```

Each row of `IdxNN` corresponds to a query data observation. The column order corresponds to the order of the nearest neighbors with respect to ascending distance. For example, using the Mahalanobis metric, the second nearest neighbor of `Y(3,:)` is `X(34,:)`.
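To also obtain the distances to those neighbors, request a second output from `knnsearch`, for example:

```
% D(i,j) is the Mahalanobis distance from Y(i,:) to its jth nearest
% neighbor in the training data Mdl.X.
[IdxNN,D] = knnsearch(Mdl,Y,'K',2);
```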

Properties


`Distance` — Distance metric
`'cityblock'` | `'euclidean'` | `'mahalanobis'` | `'minkowski'` | `'seuclidean'` | custom distance function | ...

Distance metric used to find nearest neighbors of query points, specified as a string or function handle.

This table describes the supported distance metrics specified by strings.

| Value | Description |
| --- | --- |
| `'chebychev'` | Chebychev distance (maximum coordinate difference) |
| `'cityblock'` | City block distance |
| `'correlation'` | One minus the sample linear correlation between observations (treated as sequences of values) |
| `'cosine'` | One minus the cosine of the included angle between observations (row vectors) |
| `'euclidean'` | Euclidean distance |
| `'hamming'` | Hamming distance, which is the percentage of coordinates that differ |
| `'jaccard'` | One minus the Jaccard coefficient, which is the percentage of nonzero coordinates that differ |
| `'minkowski'` | Minkowski distance |
| `'mahalanobis'` | Mahalanobis distance |
| `'seuclidean'` | Standardized Euclidean distance |
| `'spearman'` | One minus the sample Spearman's rank correlation between observations (treated as sequences of values) |

For more details, see Distance Metrics.

You can specify a function handle for a custom distance metric using `@` (for example, `@distfun`). A custom distance function must:

• Have the form `function D2 = distfun(ZI, ZJ)`

• Take as arguments:

• A 1-by-n vector `ZI` containing a single row from `X` or from the query points `Y`

• An m-by-n matrix `ZJ` containing multiple rows of `X` or `Y`

• Return an m-by-1 vector of distances `D2`, whose jth element is the distance between the observations `ZI` and `ZJ(j,:)`
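For example, a hypothetical custom function implementing a weighted Euclidean distance might look like this (the weight vector `w` is an illustrative choice, and implicit expansion requires R2016b or later):

```
function D2 = weuclidean(ZI,ZJ)
% WEUCLIDEAN Weighted Euclidean distance (illustrative custom metric).
%   ZI is a 1-by-n query row; ZJ is an m-by-n matrix of observations.
%   D2 is the required m-by-1 vector of distances.
w = [1 1 0.5 0.5];                    % illustrative predictor weights
D2 = sqrt(sum(w .* (ZJ - ZI).^2,2)); % row-wise weighted distances
end
```

You could then create the searcher with `Mdl = ExhaustiveSearcher(X,'Distance',@weuclidean)`.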

The software does not use the distance metric for training the exhaustive searcher algorithm. Therefore, you can alter it after training by specifying a supported string or function handle for a custom function using dot notation. For example, to specify the Mahalanobis distance, enter `Mdl.Distance = 'mahalanobis'`.

Data Types: `char` | `function_handle`

`DistParameter` — Distance metric parameter values
`[]` | positive scalar | positive numeric vector | positive definite matrix

Distance metric parameter values, specified as empty (`[]`), a positive scalar, a positive numeric vector, or a positive definite matrix, depending on the value of `Distance`.

This list describes the distance parameters of the supported distance metrics.

• `'mahalanobis'` — A positive definite matrix representing the covariance matrix used for computing the Mahalanobis distances. By default, the software sets the covariance using `nancov(Mdl.X)`. You can alter this parameter using dot notation, e.g., `Mdl.DistParameter = CovNew`, where `CovNew` is a K-by-K positive definite numeric matrix.

• `'minkowski'` — A positive scalar indicating the exponent of the Minkowski distance. By default, the exponent is `2`.

• `'seuclidean'` — A positive numeric vector indicating the values that the software uses to scale the predictors when computing the standardized Euclidean distances. By default, the software:

1. Estimates the standard deviation of each predictor (column) of `X` using `scale = nanstd(Mdl.X)`.

2. Scales each coordinate difference between the rows in `X` and the query matrix by dividing by the corresponding element of `scale`.

You can alter the scale parameter using dot notation, e.g., `Mdl.DistParameter = sNew`, where `sNew` is a K-dimensional positive numeric vector.

If `Mdl.Distance` is not one of the metrics listed above, then `Mdl.DistParameter` is `[]`, which means that the specified distance metric formula has no parameters.
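As a sketch of altering these values with dot notation (the scale values here are illustrative, and assume the training data has four predictors):

```
% Switch the searcher to standardized Euclidean distance, then supply a
% custom scaling vector with one element per predictor (illustrative values).
Mdl.Distance = 'seuclidean';
Mdl.DistParameter = [0.8 0.4 1.7 0.7];
```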

Data Types: `single` | `double`

`X` — Training data
numeric matrix

Training data that prepares the exhaustive searcher algorithm, specified as a numeric matrix. `X` has n rows, each corresponding to an observation (i.e., instance or example), and K columns, each corresponding to a predictor or feature.

Data Types: `single` | `double`

Object Functions

knnsearch — k-nearest neighbors search using Kd-tree or exhaustive search
rangesearch — Find all neighbors within specified distance using exhaustive search or Kd-tree

Create Object

Create an `ExhaustiveSearcher` model object using `ExhaustiveSearcher` or `createns`.