rangesearch

Find nearest neighbors by edit distance range

Description

idx = rangesearch(eds,words,maxDist) returns the indices of the words in eds whose edit distance to each element of words is less than or equal to maxDist.

[idx,d] = rangesearch(eds,words,maxDist) also returns the edit distances of the corresponding words.

Examples

Create an edit distance searcher and specify a maximum edit distance of 3.

vocabulary = ["MathWorks" "MATLAB" "Simulink" "text" "analytics" "analysis"];
maxDist = 3;
eds = editDistanceSearcher(vocabulary,maxDist);

Find the nearest words to "MALTAB", "MatWorks", and "analytcs" with edit distance less than or equal to 1.

words = ["MALTAB" "MatWorks" "analytcs"];
maxDist = 1;
idx = rangesearch(eds,words,maxDist)
idx=3×1 cell
    {1x0 double}
    {[       1]}
    {[       5]}

For "MALTAB", there are no words in the searcher within the specified range. For "MatWorks" and "analytcs", there is one result each. View the corresponding word for "MatWorks" using the returned index.

nearestWords = eds.Vocabulary(idx{2})
nearestWords = 
"MathWorks"
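
Because a query with no word in range yields an empty vector, it can help to check for empty cells before indexing into the vocabulary. The following is a minimal sketch built on the example above; the variable names hasMatch and unmatched are illustrative and not part of the function's interface.

```matlab
% Recreate the searcher and queries from the example above.
vocabulary = ["MathWorks" "MATLAB" "Simulink" "text" "analytics" "analysis"];
eds = editDistanceSearcher(vocabulary,3);
words = ["MALTAB" "MatWorks" "analytcs"];
idx = rangesearch(eds,words,1);

% Flag the queries that returned at least one index.
hasMatch = ~cellfun(@isempty,idx);

% List the query words that have no vocabulary word in range.
unmatched = words(~hasMatch)
```

For the queries above, only "MALTAB" has no match at this distance, which agrees with the empty first cell of idx.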

Find the nearest words to "MALTAB", "MatWorks", and "analytcs" with edit distance less than or equal to 3 and their corresponding edit distances.

words = ["MALTAB" "MatWorks" "analytcs"];
maxDist = 3;
[idx,d] = rangesearch(eds,words,maxDist)
idx=3×1 cell
    {[       2]}
    {[       1]}
    {1x2 double}

d=3×1 cell
    {[       2]}
    {[       1]}
    {1x2 double}

For both "MALTAB" and "MatWorks", there is one word in the searcher within the specified range. For "analytcs", there are two results. View the corresponding words for "analytcs" using the returned indices and their edit distances.

nearestWords = eds.Vocabulary(idx{3})
nearestWords = 1x2 string array
    "analytics"    "analysis"

d{3}
ans = 1×2

     1     2
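
To read all results at once, you can loop over the cell arrays and pair each query with its matches and distances. This is a sketch that reuses the data from the example above; the loop variables and the output format are illustrative choices.

```matlab
% Recreate the searcher and queries from the example above.
vocabulary = ["MathWorks" "MATLAB" "Simulink" "text" "analytics" "analysis"];
eds = editDistanceSearcher(vocabulary,3);
words = ["MALTAB" "MatWorks" "analytcs"];
[idx,d] = rangesearch(eds,words,3);

% Print one line per (query, match) pair.
for i = 1:numel(words)
    matches = eds.Vocabulary(idx{i});   % vocabulary words in range of words(i)
    for j = 1:numel(matches)
        fprintf("%s -> %s (distance %d)\n",words(i),matches(j),d{i}(j));
    end
end
```

A query with no match has an empty idx{i}, so the inner loop does not run for it.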

Input Arguments

eds: Edit distance searcher, specified as an editDistanceSearcher object.

words: Input words, specified as a string vector, character vector, or cell array of character vectors. If you specify words as a character vector, then the function treats the argument as a single word.
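
As a brief illustration of the accepted input types, the sketch below passes the same single query in three forms. It assumes the searcher from the example section; the variable names idxString, idxChar, and idxCell are hypothetical.

```matlab
% Recreate the searcher from the example section.
vocabulary = ["MathWorks" "MATLAB" "Simulink" "text" "analytics" "analysis"];
eds = editDistanceSearcher(vocabulary,3);

% The same one-word query, written three ways.
idxString = rangesearch(eds,"MatWorks",1);    % string scalar
idxChar   = rangesearch(eds,'MatWorks',1);    % character vector, treated as one word
idxCell   = rangesearch(eds,{'MatWorks'},1);  % cell array of character vectors
```

Each call describes one query word, so each result should contain a single cell holding the index of "MathWorks".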

Data Types: string | char | cell

maxDist: Maximum search distance, specified as a nonnegative number.

The function finds the indices of the words in eds whose edit distance to the elements of words is less than or equal to maxDist, sorted in ascending order of edit distance.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Output Arguments

idx: Indices of nearest neighbors in the searcher, returned as a cell array of vectors.

idx{i} is a vector of indices of the words in eds whose edit distance to words(i) is less than or equal to maxDist, sorted in ascending order of edit distance.

Data Types: cell

d: Edit distances to neighbors, returned as a cell array of vectors.

d{i} is a vector of edit distances between words(i) and the corresponding words in eds given by the vocabulary indices idx{i}.
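
The relationship between idx and d can be checked by recomputing each distance directly. The sketch below uses the editDistance function from the same toolbox and assumes that the searcher and editDistance both use their default edit costs; with custom costs the two values need not agree.

```matlab
% Recreate the searcher and queries from the example section.
vocabulary = ["MathWorks" "MATLAB" "Simulink" "text" "analytics" "analysis"];
eds = editDistanceSearcher(vocabulary,3);
words = ["MALTAB" "MatWorks" "analytcs"];
maxDist = 3;
[idx,d] = rangesearch(eds,words,maxDist);

for i = 1:numel(words)
    for j = 1:numel(idx{i})
        % Recompute the distance between the query and the matched word.
        expected = editDistance(words(i),eds.Vocabulary(idx{i}(j)));
        assert(d{i}(j) == expected)   % d{i}(j) belongs to idx{i}(j)
        assert(d{i}(j) <= maxDist)    % every match lies within the range
    end
    assert(issorted(d{i}))            % distances are in ascending order
end
```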

Data Types: cell

Introduced in R2019a