below a threshold. Although this method is simple and effective, the computation grows exponentially with the dimension of the problem, because the mountain function has to be evaluated at each grid point. Chiu [8] proposed an extension of Yager and Filev's mountain method, called subtractive clustering, which solves the computational problem associated with the mountain method. It uses the data points themselves as the candidates for cluster centers, instead of the grid points used in mountain clustering, so the computation is proportional to the problem size rather than the problem dimension. The drawback of this method is that the actual cluster centers are not necessarily located at data points. However, it provides a good approximation, especially given the reduced computation it offers. It also eliminates the need to specify a grid resolution, for which trade-offs between accuracy and computational complexity must be considered. The subtractive clustering method also extends the mountain method's criterion for accepting and rejecting cluster centers.

The parameters of subtractive clustering are: $\chi_i$, the normalized data vector over both input and output dimensions, defined as

$$\chi_i = \frac{x_i - \min\{x_i\}}{\max\{x_i\} - \min\{x_i\}};$$

$n$, the total number of data vectors; $r_a$, the hypersphere cluster radius in data space; $r_b$, the hypersphere penalty radius in data space; $P_i$, the potential value of data vector $i$; $\eta$, the squash factor, $\eta = r_b / r_a$; $\bar{\varepsilon}$, the accept ratio; and $\underline{\varepsilon}$, the reject ratio.

The subtractive clustering method works as follows. Consider a collection of $n$ data points $\{x_1, x_2, x_3, \ldots, x_n\}$ in an $M$-dimensional space. Without loss of generality, the data points are assumed to have been normalized in each dimension so that they are bounded by a unit hypercube. Each data point is considered as a potential cluster center. The potential of data point $x_i$ is defined as:

$$P_i = \sum_{j=1}^{n} e^{-\frac{4}{r_a^2}\left\| x_i - x_j \right\|^2} \qquad (4)$$

where the symbol $\|\cdot\|$ denotes the Euclidean distance and $r_a$ is a positive constant. Thus, the measure of the potential for a data point is a function of its distances to all other data points. A data point with many neighboring data points will have a high potential value. The constant $r_a$ is effectively the radius defining a neighborhood; data points outside this radius have little influence on the potential.

After the potential of every data point has been computed, we select the data point with the highest potential as the first cluster center. Let $x_1^*$ be the location of the first cluster center and $P_1^*$ be its potential value. We then revise the potential of each data point $x_i$ by the formula:

$$P_i \leftarrow P_i - P_1^* \, e^{-\frac{4}{r_b^2}\left\| x_i - x_1^* \right\|^2} \qquad (5)$$

where $r_b$ is a positive constant. Thus, we subtract an amount of potential from each data point as a function of its distance from the first cluster center. The data points near the first cluster center will have greatly reduced potential, and are therefore unlikely to be selected as the next cluster center. The constant $r_b$ is effectively the radius defining the neighborhood that will have measurable reductions in potential. When the potential of all data points has been revised, we select the data point with the highest remaining potential as the second cluster center. This process continues until a sufficient number of clusters is obtained.
In addition to these criteria for ending the clustering process, there are criteria for accepting and rejecting cluster centers that help avoid marginal cluster centers. Figure 2 demonstrates the procedure followed in determining the best clustering results output from the subtractive clustering algorithm.

Figure 2: Subtractive clustering algorithm procedure.
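To make the procedure concrete, the following is a minimal NumPy sketch of the loop described above, implementing equations (4) and (5). The parameter defaults and the simplified accept/reject test are illustrative assumptions, not the implementation used in this study.

```python
import numpy as np

def subtractive_clustering(X, ra=0.5, eta=1.5,
                           accept_ratio=0.5, reject_ratio=0.15):
    """Sketch of subtractive clustering on normalized data.

    X: (n, M) array, assumed already scaled to the unit hypercube.
    ra: cluster radius; rb = eta * ra is the penalty radius.
    accept_ratio / reject_ratio: simplified accept/reject thresholds.
    """
    rb = eta * ra

    # Pairwise squared Euclidean distances between all data points.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)

    # Equation (4): initial potential of every data point.
    P = np.exp(-4.0 * d2 / ra ** 2).sum(axis=1)

    p_first = P.max()  # potential of the first cluster center
    centers = []
    while True:
        k = int(np.argmax(P))
        if P[k] < reject_ratio * p_first:
            break  # remaining potential is marginal: stop
        # NOTE: the full method applies an extra distance-based test when
        # P[k] falls between the reject and accept thresholds; that
        # marginal-center check is omitted here for brevity.
        centers.append(X[k].copy())

        # Equation (5): subtract potential around the new center.
        P = P - P[k] * np.exp(-4.0 * d2[k] / rb ** 2)
    return np.array(centers)
```

Because the selected point's own potential drops to zero after the subtraction in equation (5), the loop always terminates once the remaining potential falls below the reject threshold.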

3. Validity Measurement

This section presents three widely accepted validity indices: the Bezdek, Dave, and Xie-Beni indices. These indices are used to evaluate the performance of each algorithm. Validity measures are scalar indices that assess the goodness of the obtained partition. Clustering algorithms generally aim to locate well-separated and compact clusters. When the number of clusters is chosen equal to the number of groups that actually exist in the data, it can be expected that the clustering algorithm will identify them correctly. When this is not the case, misclassifications appear, and the clusters are not likely to be well separated and compact. Hence, most cluster validity measures are designed to quantify the separation and the compactness of the clusters. However, as Bezdek [6] pointed out, the concept of cluster validity is open to interpretation and can be formulated in different ways. Consequently, many validity measures have been introduced in the literature: Bezdek [6], Dave [9], Gath and Geva [10], Pal and Bezdek [11].

3.1. Bezdek Index

Bezdek [6] proposed an index that is the normalized sum of the squared membership values assigned to each point in the output partition matrix $U$. Its value lies in the interval $[1/c, 1]$, and the higher the value, the more accurate the clustering result. The index is defined as:

$$V_{PC} = \frac{1}{n}\sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^2 \qquad (6)$$

In general, an optimal cluster number $c$ that produces the best clustering performance for the data is found by solving:

$$\max_{2 \le c \le n-1} V_{PC} \qquad (7)$$

3.2. Dave's Validity Index

Dave's validity index, presented in [9], has been used successfully by many researchers and is highly sensitive to hidden structures inside the data set. Dave [9] defined the validity measure as:

$$V_{MPC} = 1 - \frac{c}{c-1}\left(1 - V_{PC}\right) \qquad (8)$$

where $V_{PC}$ is defined in equation (6). This is a modified version of the validity measure $V_{PC}$ proposed by Bezdek [6]. Dave's index usually ranges between 0 and 1; the higher the value of the index, the better the results. The study by Wang et al. [12] reported that $V_{MPC}$ successfully discovered the optimal number of clusters in most of the testing benchmarks.

3.3. Xie and Beni Index

The Xie and Beni validity index [13] is defined as:

$$\chi(Z, U; V) = \frac{\sum_{i=1}^{c}\sum_{k=1}^{N} u_{ik}^m \left\| z_k - v_i \right\|^2}{c \, \min_{i \ne j} \left\| v_i - v_j \right\|^2} \qquad (9)$$

This index has been found to perform well in practice. It can be interpreted as the ratio of the total within-group variance to the separation of the cluster centers. The optimal $c$ that produces the best clustering performance for the data is found by solving:

$$\min_{2 \le c \le n-1} \chi(Z, U; V)$$
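All three indices can be computed directly from the clustering output. The sketch below assumes $U$ is a $c \times n$ membership matrix, $Z$ an $n \times M$ data set, $V$ a $c \times M$ matrix of cluster centers, and $m$ the fuzzy weighting exponent; the function names are illustrative, not from a standard library, and the Xie-Beni denominator follows equation (9) as given above.

```python
import numpy as np

def partition_coefficient(U):
    """Bezdek's V_PC (eq. 6): mean of the squared memberships.

    U is a (c, n) fuzzy partition matrix whose columns sum to 1.
    V_PC lies in [1/c, 1]; higher is better.
    """
    n = U.shape[1]
    return np.sum(U ** 2) / n

def modified_partition_coefficient(U):
    """Dave's V_MPC (eq. 8): rescales V_PC toward the range [0, 1]."""
    c = U.shape[0]
    return 1.0 - (c / (c - 1.0)) * (1.0 - partition_coefficient(U))

def xie_beni(Z, U, V, m=2.0):
    """Xie-Beni index (eq. 9): within-group variance over separation.

    Z is (n, M) data, U is (c, n) memberships, V is (c, M) centers.
    Lower is better.
    """
    c = V.shape[0]
    # Squared distances from every center to every point: (c, n).
    d2 = np.sum((Z[None, :, :] - V[:, None, :]) ** 2, axis=2)
    within = np.sum((U ** m) * d2)

    # Minimum squared distance between distinct cluster centers.
    vd2 = np.sum((V[:, None, :] - V[None, :, :]) ** 2, axis=2)
    sep = np.min(vd2[np.triu_indices(c, k=1)])
    return within / (c * sep)
```

In practice, equation (7) and the minimization of equation (9) amount to running the clustering algorithm for each candidate $c$ in $2, \ldots, n-1$ and selecting the $c$ that maximizes $V_{PC}$ (or $V_{MPC}$) or minimizes the Xie-Beni index.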

4. Results and Discussion