# edge

Classification edge

## Description

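`E = edge(tree,TBL,ResponseVarName)` returns the classification edge for `tree` with data `TBL` and classification `TBL.ResponseVarName`.

`E = edge(tree,TBL,Y)` returns the classification edge for `tree` with data `TBL` and classification `Y`.

`E = edge(tree,X,Y)` returns the classification edge for `tree` with data `X` and classification `Y`.

`E = edge(___,Name,Value)` computes the edge with additional options specified by one or more `Name,Value` pair arguments, using any of the previous syntaxes. For example, you can specify observation weights.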

## Input Arguments

`tree` — Trained classification tree

`ClassificationTree` model object | `CompactClassificationTree` model object

Trained classification tree, specified as a `ClassificationTree` or `CompactClassificationTree` model object. That is, `tree` is a trained classification model returned by `fitctree` or `compact`.

`TBL` — Sample data

table

Sample data, specified as a table. Each row of `TBL` corresponds to one observation, and each column corresponds to one predictor variable. Optionally, `TBL` can contain additional columns for the response variable and observation weights. `TBL` must contain all the predictors used to train `tree`. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

If `TBL` contains the response variable used to train `tree`, then you do not need to specify `ResponseVarName` or `Y`.

If you train `tree` using sample data contained in a `table`, then the input data for this method must also be in a table.

**Data Types:** `table`

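`X` — Data to classify

numeric matrix

Data to classify, specified as a numeric matrix. Each row of `X` corresponds to one observation, and each column corresponds to one predictor.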

`ResponseVarName` — Response variable name

name of a variable in `TBL`

Response variable name, specified as the name of a variable in `TBL`. If `TBL` contains the response variable used to train `tree`, then you do not need to specify `ResponseVarName`.

If you specify `ResponseVarName`, then you must do so as a character vector or string scalar. For example, if the response variable is stored as `TBL.Response`, then specify it as `'Response'`. Otherwise, the software treats all columns of `TBL`, including `TBL.ResponseVarName`, as predictors.

The response variable must be a categorical, character, or string array, logical or numeric vector, or cell array of character vectors. If the response variable is a character array, then each element must correspond to one row of the array.

**Data Types:** `char` | `string`
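As an illustration of the table syntax, the following is a minimal sketch that assumes the Fisher iris data set shipped with the toolbox. The table variable names (`SepalLength`, `SepalWidth`, `PetalLength`, `PetalWidth`, `Species`) are chosen here for illustration only and do not come from this page.

```matlab
load fisheriris

% Build a table of predictors; the variable names are illustrative.
TBL = array2table(meas, ...
    'VariableNames',{'SepalLength','SepalWidth','PetalLength','PetalWidth'});
TBL.Species = species;            % response stored as TBL.Species

% Train on the table, naming the response variable.
tree = fitctree(TBL,'Species');

% Pass the response variable name as a character vector.
E = edge(tree,TBL,'Species')
```

Because the response is named explicitly, the `Species` column is not treated as a predictor.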

`Y` — Class labels

categorical array | character array | string array | logical vector | numeric vector | cell array of character vectors

Class labels, specified as a categorical, character, or string array, a logical or numeric vector, or a cell array of character vectors. `Y` must be of the same type as the classification used to train `tree`, and its number of elements must equal the number of rows of `X`.

**Data Types:** `categorical` | `char` | `string` | `logical` | `single` | `double` | `cell`

### Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

*Before R2021a, use commas to separate each name and value, and enclose* `Name` *in quotes.*

`Weights` — Observation weights

`ones(size(X,1),1)` (default) | name of a variable in `TBL` | numeric vector

Observation weights, specified as the comma-separated pair consisting of `'Weights'` and a numeric vector or the name of a variable in `TBL`.

If you specify `Weights` as a numeric vector, then the size of `Weights` must be equal to the number of rows in `X` or `TBL`.

If you specify `Weights` as the name of a variable in `TBL`, you must do so as a character vector or string scalar. For example, if the weights are stored as `TBL.W`, then specify it as `'W'`. Otherwise, the software treats all columns of `TBL`, including `TBL.W`, as predictors.

If you supply weights, `edge` computes the weighted classification edge. The software weights the observations in each row of `X` or `TBL` with the corresponding weight in `Weights`.

**Data Types:** `single` | `double` | `char` | `string`
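A minimal sketch of supplying a numeric weight vector, assuming the Fisher iris data. The choice to upweight the last ten observations is arbitrary and only serves to make the weighted and unweighted edges differ.

```matlab
load fisheriris
tree = fitctree(meas,species);

% One weight per row of the predictor matrix.
w = ones(size(meas,1),1);
w(end-9:end) = 5;                 % arbitrary emphasis on the last 10 rows

% Comma-separated name-value syntax, which also works before R2021a.
Ew = edge(tree,meas,species,'Weights',w)

% Unweighted edge for comparison.
E = edge(tree,meas,species)
```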

## Output Arguments

`E` — Classification edge

scalar value

Classification edge, returned as a scalar representing the weighted average value of the margin.

## Examples

Compute the classification margin and edge for the Fisher iris data, trained on its first two columns of data, and view the last 11 entries of the margin:

```
load fisheriris
X = meas(:,1:2);
tree = fitctree(X,species);
E = edge(tree,X,species)
```

```
E =
    0.6299
```

```
M = margin(tree,X,species);
M(end-10:end)
```

```
ans =
    0.1111
    0.1111
    0.1111
   -0.2857
    0.6364
    0.6364
    0.1111
    0.7500
    1.0000
    0.6364
    0.2000
```

The classification tree trained on all the data is better.

```
tree = fitctree(meas,species);
E = edge(tree,meas,species)
```

```
E =
    0.9384
```

```
M = margin(tree,meas,species);
M(end-10:end)
```

```
ans =
    0.9565
    0.9565
    0.9565
    0.9565
    0.9565
    0.9565
    0.9565
    0.9565
    0.9565
    0.9565
    0.9565
```

## More About

### Margin

The classification *margin* is the difference between the classification *score* for the true class and the maximal classification score for the false classes. Margin is a column vector with the same number of rows as the matrix `X`.
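The definition above can be checked by hand. The following is a sketch, assuming the Fisher iris data and that the score columns returned by `predict` follow the order of `tree.ClassNames`; the variable names (`trueCol`, `Mmanual`, and so on) are illustrative.

```matlab
load fisheriris
X = meas(:,1:2);
tree = fitctree(X,species);

% Scores for every class; columns are ordered as in tree.ClassNames.
[~,score] = predict(tree,X);

% Column index of the true class for each observation.
[~,trueCol] = ismember(species,tree.ClassNames);
n = size(score,1);
trueIdx = sub2ind(size(score),(1:n)',trueCol);

% Score of the true class, and the largest score among the false classes.
trueScore = score(trueIdx);
otherScore = score;
otherScore(trueIdx) = -Inf;       % mask out the true class
Mmanual = trueScore - max(otherScore,[],2);

% Compare with the built-in margin.
M = margin(tree,X,species);
maxDiff = max(abs(Mmanual - M))
```

Under these assumptions, `maxDiff` should be zero up to floating-point round-off.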

### Score (tree)

For trees, the *score* of a classification of a leaf node is the posterior probability of the classification at that node. The posterior probability of the classification at a node is the number of training observations that lead to that node with the classification, divided by the number of training observations that lead to that node.

For example, consider classifying a predictor `X` as `true` when `X` < `0.15` or `X` > `0.95`, and as `false` otherwise.

Generate 100 random points and classify them:

```
rng(0,'twister') % for reproducibility
X = rand(100,1);
Y = (abs(X - .55) > .4);
tree = fitctree(X,Y);
view(tree,'Mode','Graph')
```

Prune the tree:

```
tree1 = prune(tree,'Level',1);
view(tree1,'Mode','Graph')
```

The pruned tree correctly classifies observations that are less than 0.15 as `true`. It also correctly classifies observations from .15 to .94 as `false`. However, it incorrectly classifies observations that are greater than .94 as `false`. Therefore, the score for observations that are greater than .15 should be about .05/.85=.06 for `true`, and about .8/.85=.94 for `false`.

Compute the prediction scores for the first 10 rows of `X`:

```
[~,score] = predict(tree1,X(1:10));
[score X(1:10,:)]
```

```
ans = 10×3

    0.9059    0.0941    0.8147
    0.9059    0.0941    0.9058
         0    1.0000    0.1270
    0.9059    0.0941    0.9134
    0.9059    0.0941    0.6324
         0    1.0000    0.0975
    0.9059    0.0941    0.2785
    0.9059    0.0941    0.5469
    0.9059    0.0941    0.9575
    0.9059    0.0941    0.9649
```

Indeed, every value of `X` (the right-most column) that is less than 0.15 has associated scores (the left and center columns) of `0` and `1`, while the other values of `X` have associated scores of `0.91` and `0.09`. The difference (score `0.09` instead of the expected `.06`) is due to a statistical fluctuation: there are `8` observations in `X` in the range `(.95,1)` instead of the expected `5` observations.
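The counting argument can be reproduced directly from the data. This sketch regenerates `X` and `Y` with the same seed and assumes that the pruned tree's split falls close to 0.15, so that the leaf for "greater than .15" contains exactly the observations with `X > 0.15`; the names `nHigh`, `nLeaf`, and `pTrue` are illustrative.

```matlab
rng(0,'twister')                  % same seed as the example
X = rand(100,1);
Y = (abs(X - .55) > .4);

% Observations in (.95,1), compared with the expected count of 5.
nHigh = nnz(X > 0.95)
nExpected = 0.05*numel(X)

% Fraction of true labels among observations above 0.15, which is
% the score for "true" in that leaf under the assumption above.
nLeaf = nnz(X > 0.15)
pTrue = mean(Y(X > 0.15))
```

If the assumption holds, `pTrue` corresponds to the `0.09` score discussed above rather than the idealized `.06`.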

### Edge

The *edge* is the weighted mean value of the classification margin. The weights are the class probabilities in `tree.Prior`. If you supply weights in the `Weights` name-value pair, those weights are normalized to sum to the prior probabilities in the respective classes, and are then used to compute the weighted average.
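The normalization described above can be sketched by hand. This is an illustration under stated assumptions, not the toolbox's internal code: it assumes the Fisher iris data, default unit weights, and that `tree.Prior` is ordered like `tree.ClassNames`. The variable names are illustrative.

```matlab
load fisheriris
tree = fitctree(meas,species);
M = margin(tree,meas,species);

% Start from unit weights, one per observation.
w = ones(numel(M),1);

% Rescale the weights within each class to sum to that class's prior.
[~,cls] = ismember(species,tree.ClassNames);
for k = 1:numel(tree.ClassNames)
    inK = (cls == k);
    w(inK) = tree.Prior(k) * w(inK) / sum(w(inK));
end

% Weighted mean of the margins.
Emanual = sum(w .* M) / sum(w);

% Compare with the built-in edge.
E = edge(tree,meas,species);
absDiff = abs(E - Emanual)
```

With the default empirical priors and unit weights, every observation ends up with the same weight, so `Emanual` reduces to `mean(M)`.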

## Extended Capabilities

### Tall Arrays

Calculate with arrays that have more rows than fit in memory.

This function fully supports tall arrays. For more information, see Tall Arrays.

### GPU Arrays

Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

Usage notes and limitations:

- `edge` executes on a GPU in these cases only:
  - The input argument `X` is a `gpuArray`.
  - The input argument `TBL` contains `gpuArray` predictor variables.
  - The input argument `tree` was fitted with GPU array input arguments.
- If the classification tree model was trained with surrogate splits, these limitations apply:
  - You cannot specify the input argument `X` as a `gpuArray`.
  - You cannot specify the input argument `TBL` as a table containing `gpuArray` elements.

For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
