Journal of Shanghai Jiao Tong University ›› 2022, Vol. 56 ›› Issue (12): 1638-1648. doi: 10.16183/j.cnki.jsjtu.2021.292

Special topic: Journal of Shanghai Jiao Tong University, 2022 special topic "Electronic Information and Electrical Engineering"

• Electronic Information and Electrical Engineering •

K-means Hybrid Iterative Clustering Based on Memory Transfer Sailfish Optimization

HUANG He(a,b), XIONG Wu(a,b), WU Kun(a,b), WANG Huifeng(b), RU Feng(a,b), WANG Jun(a)

  1. a. Xi’an Key Laboratory of Intelligent Expressway Information Fusion and Control, Chang’an University, Xi’an 710064, China
    b. School of Electronics and Control Engineering, Chang’an University, Xi’an 710064, China
  • Received: 2021-08-05 Online: 2022-12-28 Published: 2023-01-05
  • Contact: WANG Jun E-mail: jwang@nwu.edu.cn
  • About the author: HUANG He (1979-), male, born in Nanyang, Henan Province; professor and doctoral supervisor; main research interests include information fusion and image processing.
  • Supported by:
    National Key Research and Development Program of China (2018YFB1600600); Key Research and Development Program of Shaanxi Province (2021SF-483); Natural Science Basic Research Program of Shaanxi Province (2021JM-184); Fundamental Research Funds for the Central Universities, Chang’an University (300102329401); Fundamental Research Funds for the Central Universities, Chang’an University (300102329501); Open Fund of the Xi’an Key Laboratory of Intelligent Expressway Information Fusion and Control (Chang’an University) (300102321502)

Abstract:

The existing K-means clustering (KMC) algorithm is strongly affected by initialization: randomly generated cluster centers easily cause the clustering result to fall into a local optimum and stop iterating, which leads to low clustering accuracy and poor robustness. To address this problem, a K-means hybrid iterative clustering algorithm based on memory transfer sailfish optimization (MTSFO-HIKMC) is proposed. First, drawing on existing improvement ideas, the maximum and minimum distance product is introduced to initialize the KMC cluster centers, which avoids the uncertainty caused by random initialization. Meanwhile, during the iterations, the current optimal solution undergoes a local adaptive memory transfer correction, which addresses the poor global optimization ability and insufficient search accuracy caused by the single search path of the sailfish algorithm. MTSFO-HIKMC is compared with the sailfish-optimized K-means hybrid iterative clustering (SFO-KMC) algorithm, the K-means cross iterative clustering algorithm with improved moth-flame optimization (IMFO-KMC), the KMC algorithm, and the fuzzy C-means (FCM) algorithm on the Iris, Seeds, CMC, and Wine international standard data sets. The convergence curves and performance indicators show that the proposed MTSFO-HIKMC algorithm converges faster than IMFO-KMC and achieves higher search accuracy than IMFO-KMC in high-dimensional spaces. It also achieves higher search accuracy than KMC and FCM. Compared with SFO-KMC, both its convergence speed and search accuracy improve markedly, especially on high-dimensional data sets.

Key words: sailfish algorithm, adaptive memory transfer correction strategy, K-means clustering (KMC), maximum and minimum distance product method, UCI standard data set
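
The abstract names a maximum and minimum distance product initialization but does not give its formula. As an illustration only, the sketch below assumes one plausible reading: seed with the two samples that are farthest apart, then repeatedly add the sample whose product of maximum and minimum distance to the centers chosen so far is largest. The function names `max_min_distance_product_init` and `kmeans` are hypothetical and do not come from the paper. The sailfish optimization and the memory transfer correction are not shown.

```python
import numpy as np


def max_min_distance_product_init(X, k):
    """Deterministically pick k initial centers (assumed rule, not the paper's exact one)."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all samples, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Seed with the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    chosen = [int(i), int(j)]
    while len(chosen) < k:
        d = dist[:, chosen]  # distances from every sample to the chosen centers
        # Assumption: score = (max distance) * (min distance) to the chosen centers.
        score = d.max(axis=1) * d.min(axis=1)
        score[chosen] = -1.0  # never pick an already chosen sample again
        chosen.append(int(np.argmax(score)))
    return X[chosen[:k]]


def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    X = np.asarray(X, dtype=float)
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assign each sample to its nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(d, axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

Because the seeding involves no random draw, repeated runs on the same data start from the same centers, which is the property the abstract attributes to replacing random initialization.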

CLC number: