A hybrid data clustering algorithm based on an improved krill herd algorithm and KHM clustering
【Abstract】 K-means clustering is sensitive to the initial cluster centroids and is prone to falling into local optima. To address this problem, a hybrid data clustering algorithm based on an improved krill herd (KH) algorithm and K-harmonic means (KHM) clustering is proposed. First, an improved KH algorithm with Lévy flight and a crossover operator is proposed to overcome the standard KH algorithm's tendency to become trapped in local optima and its low search efficiency. After each standard krill herd position update, a new position-updating method is applied to further improve the search ability of the population. In addition, Lévy flight and the crossover operator are used alternately to perform a greedy search around the current herd positions, which enhances the global search ability of the algorithm. Experimental results on 20 benchmark functions show that the improved algorithm is less prone to falling into local optima, finds the global optimal solution in fewer iterations, and remains stable. Then, the improved KH algorithm is combined with the K-harmonic means clustering algorithm to solve the data clustering problem. After each iteration, the worst individual is replaced by the best individual or by a new individual generated by the K-harmonic means algorithm. Test results on five real data sets from UCI show that the hybrid clustering algorithm overcomes the sensitivity of K-means to the initial cluster centroids and has strong global convergence.
【Keywords】 krill herd algorithm; Lévy flight; crossover operator; K-harmonic means clustering; hybrid clustering
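As a rough illustration of the K-harmonic means component named in the abstract, the sketch below computes the commonly cited KHM performance function, KHM(X, C) = Σ_i k / Σ_j ||x_i − c_j||^(−p), and applies the usual fixed-point centre update. It is a minimal sketch, not the hybrid algorithm of this paper: the function names `khm_objective` and `khm_step`, the exponent `p = 3.5`, the clipping constant `eps`, and the toy data are all assumptions made for illustration, and the krill herd search, Lévy flight, and crossover steps are not shown.

```python
import numpy as np


def _distances(X, C, eps):
    # Pairwise Euclidean distances, shape (n_points, k); clipped away from
    # zero so that negative powers stay finite.
    d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
    return np.maximum(d, eps)


def khm_objective(X, C, p=3.5, eps=1e-12):
    """KHM performance function: sum over points of the harmonic
    average of d_ij^p over the k centres (p is an assumed default)."""
    d = _distances(X, C, eps)
    k = C.shape[0]
    return float(np.sum(k / np.sum(d ** (-p), axis=1)))


def khm_step(X, C, p=3.5, eps=1e-12):
    """One fixed-point update of all centres.

    Each centre becomes a weighted mean of the data with weights
    q_ij = d_ij^(-p-2) / (sum_l d_il^(-p))^2.
    """
    d = _distances(X, C, eps)
    q = d ** (-p - 2) / (np.sum(d ** (-p), axis=1, keepdims=True) ** 2)
    return (q.T @ X) / q.sum(axis=0)[:, None]


# Toy data (hypothetical): two square clusters of four points each.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
C0 = np.array([[2.0, 2.0], [8.0, 8.0]])  # deliberately offset start

C = C0.copy()
for _ in range(50):
    C = khm_step(X, C)

print(khm_objective(X, C0), khm_objective(X, C))
print(C)
```

In a hybrid scheme of the kind the abstract describes, a candidate solution would presumably encode the k centres as one vector and use `khm_objective` as its fitness, with `khm_step` supplying the KHM-generated individual; that wiring is an assumption here and is left out of the sketch.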