Find cluster centers using subtractive clustering
Load data set.
load clusterdemo.dat
Find cluster centers using the same range of influence for all dimensions.
C = subclust(clusterdemo,0.6);
Each row of C contains one cluster center.
C = 3×3

    0.5779    0.2355    0.5133
    0.7797    0.8191    0.1801
    0.1959    0.6228    0.8363
Load data set.
load clusterdemo.dat
Define minimum and maximum normalization bounds for each data dimension. Use the same bounds for each dimension.
dataScale = [-0.2 -0.2 -0.2; 1.2 1.2 1.2];
Find cluster centers.
C = subclust(clusterdemo,0.5,'DataScale',dataScale);
Load data set.
load clusterdemo.dat
Specify the following clustering options:
Squash factor of 2.0 - Only find clusters that are far from each other.
Accept ratio of 0.8 - Only accept data points with a strong potential for being cluster centers.
Reject ratio of 0.7 - Reject data points if they do not have a strong potential for being cluster centers.
Verbosity flag of 0 - Do not print progress information to the command window.
options = [2.0 0.8 0.7 0];
Find cluster centers, using a different range of influence for each dimension and the specified options.
C = subclust(clusterdemo,[0.5 0.25 0.3],'Options',options);
Load data set.
load clusterdemo.dat
Cluster data, returning the cluster sigma values, S.
[C,S] = subclust(clusterdemo,0.5);
Cluster sigma values indicate the range of influence of the computed cluster centers in each data dimension.
data— Data set to be clustered
Data to be clustered, specified as an M-by-N array, where M is the number of data points and N is the number of data dimensions.
clusterInfluenceRange— Range of influence of the cluster center
scalar value in the range [0, 1] | vector
Range of influence of the cluster center for each input and output, assuming the data falls within a unit hyperbox, specified as one of the following:
Scalar value in the range [0, 1] - Use the same influence range for all inputs and outputs.
Vector - Use different influence ranges for each input and output.
Specifying a smaller range of influence usually creates more and smaller data clusters, producing more fuzzy rules.
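The effect of the range of influence can be checked empirically. The following is a minimal sketch, assuming the clusterdemo.dat sample file used in the examples above is on the MATLAB path. The radius values are illustrative choices, not recommendations.

```matlab
% Count the cluster centers found for several ranges of influence.
load clusterdemo.dat                  % sample data set from the examples above
radii = [0.3 0.5 0.7];                % illustrative values only
numClusters = zeros(size(radii));
for k = 1:numel(radii)
    C = subclust(clusterdemo,radii(k));
    numClusters(k) = size(C,1);       % each row of C is one cluster center
end
disp([radii; numClusters])            % first row: radius, second row: cluster count
```

Smaller radii typically give larger counts, but the exact numbers depend on the data.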
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'DataScale','auto' sets the normalizing factors for the input and output signals using the minimum and maximum values in the data set to be clustered.
Options— Clustering options
Clustering options, specified as the comma-separated pair consisting of 'Options' and a vector with the following elements:
Options(1)— Squash factor
1.25 (default) | positive scalar
Squash factor for scaling the range of influence of cluster centers, specified as a positive scalar. A smaller squash factor reduces the potential for outlying points to be considered as part of a cluster, which usually creates more and smaller data clusters.
Options(2)— Acceptance ratio
0.5 (default) | scalar value in the range [0, 1]
Acceptance ratio, defined as a fraction of the potential of the first cluster center, above which another data point is accepted as a cluster center, specified as a scalar value in the range [0, 1]. The acceptance ratio must be greater than the rejection ratio.
Options(3)— Rejection ratio
0.15 (default) | scalar value in the range [0, 1]
Rejection ratio, defined as a fraction of the potential of the first cluster center, below which another data point is rejected as a cluster center, specified as a scalar value in the range [0, 1]. The rejection ratio must be less than the acceptance ratio.
Options(4)— Information display flag
Information display flag indicating whether to display progress information during clustering, specified as one of the following:
false — Do not display progress
true — Display progress
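The acceptance and rejection ratios can be read as thresholds on the ratio between a candidate's potential and the potential of the first cluster center. The sketch below follows the decision rule described in the Chiu paper listed in the references. It is an assumption that subclust applies exactly this rule. All variable names and numeric values are hypothetical and chosen only for illustration.

```matlab
% Hypothetical inputs for one candidate point.
acceptRatio    = 0.5;    % Options(2)
rejectRatio    = 0.15;   % Options(3)
firstPotential = 10;     % potential of the first cluster center
candPotential  = 3;      % potential of the candidate point
minDistRatio   = 0.8;    % shortest distance to an existing center,
                         % divided by the range of influence

ratio = candPotential/firstPotential;
if ratio > acceptRatio
    decision = 'accept';             % strong candidate
elseif ratio < rejectRatio
    decision = 'reject';             % weak candidate, clustering ends
elseif minDistRatio + ratio >= 1
    decision = 'accept';             % borderline, but far from existing centers
else
    decision = 'retry';              % borderline and too close, try the next point
end
disp(decision)
```

With these values, the ratio is 0.3, which lies between the two thresholds, so the distance test decides the outcome.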
centers— Cluster centers
Cluster centers, returned as a J-by-N array, where J is the number of clusters and N is the number of data dimensions.
sigma— Range of influence of cluster centers
Range of influence of cluster centers for each data dimension, returned as an N-element row vector. All cluster centers have the same set of sigma values.
To generate a fuzzy inference system using subtractive clustering, use the genfis command. For example, suppose you cluster your data using the following syntax:
C = subclust(data,clusterInfluenceRange,'DataScale',dataScale,'Options',options);
where the first M columns of data correspond to input variables, and the remaining columns correspond to output variables.
You can generate a fuzzy system using the same training data and subtractive clustering configuration. To do so:
Configure clustering options.
opt = genfisOptions('SubtractiveClustering');
opt.ClusterInfluenceRange = clusterInfluenceRange;
opt.DataScale = dataScale;
opt.SquashFactor = options(1);
opt.AcceptRatio = options(2);
opt.RejectRatio = options(3);
opt.Verbose = options(4);
Extract input and output variable data.
inputData = data(:,1:M);
outputData = data(:,M+1:end);
Generate FIS structure.
fis = genfis(inputData,outputData,opt);
The fuzzy system, fis, contains one fuzzy rule for each cluster, and each input and output variable has one membership function per cluster. You can generate only Sugeno fuzzy systems using subtractive clustering. For more information, see genfis and genfisOptions.
Subtractive clustering assumes that each data point is a potential cluster center. The algorithm does the following:
1. Calculate the likelihood that each data point would define a cluster center, based on the density of surrounding data points.
2. Choose the data point with the highest potential to be the first cluster center.
3. Remove all data points near the first cluster center. The vicinity is determined using clusterInfluenceRange.
4. Choose the remaining point with the highest potential as the next cluster center.
5. Repeat steps 3 and 4 until all the data is within the influence range of a cluster center.
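The steps above can be sketched in base MATLAB as follows. This is a simplified illustration, not the subclust source code. It follows the Chiu paper listed in the references by lowering the potential of points near each new center instead of deleting them. It stops at a single rejection threshold and omits the borderline acceptance test. The data set and all variable names are hypothetical. The code relies on implicit expansion, so it assumes R2016b or later.

```matlab
rng(0);                                        % reproducible synthetic data
X = [0.2 + 0.03*randn(50,2);                   % hypothetical group 1
     0.8 + 0.03*randn(50,2)];                  % hypothetical group 2
radius      = 0.5;                             % range of influence
squash      = 1.25;                            % squash factor
rejectRatio = 0.15;                            % stopping threshold

% Normalize each dimension so the data falls within a unit hyperbox.
Xn = (X - min(X,[],1)) ./ (max(X,[],1) - min(X,[],1));

% Pairwise squared distances between normalized points.
sq = sum(Xn.^2,2);
D2 = max(sq + sq' - 2*(Xn*Xn'),0);

% Step 1: potential of each point from the density of its neighbors.
P = sum(exp(-4*D2/radius^2),2);

% Step 2: the point with the highest potential is the first center.
[Pmax,idx] = max(P);
P1 = Pmax;
centers = zeros(0,size(X,2));

% Steps 3 to 5: record the center, reduce nearby potentials, repeat.
while Pmax >= rejectRatio*P1
    centers(end+1,:) = X(idx,:);               %#ok<SAGROW>
    P = P - Pmax*exp(-4*D2(:,idx)/(squash*radius)^2);
    [Pmax,idx] = max(P);
end
disp(centers)
```

Because the squash factor widens the neighborhood whose potential is reduced, a larger value suppresses more points around each center and tends to yield fewer clusters.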
The subtractive clustering method is an extension of the mountain clustering method proposed in [2].
[1] Chiu, S., "Fuzzy Model Identification Based on Cluster Estimation," Journal of Intelligent & Fuzzy Systems, Vol. 2, No. 3, Sept. 1994.
[2] Yager, R. and D. Filev, "Generation of Fuzzy Rules by Mountain Clustering," Journal of Intelligent & Fuzzy Systems, Vol. 2, No. 3, pp. 209-219, 1994.