A Review of Cluster Algorithms Based on Anchor Point Acceleration Mechanism

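  • 1. Pen-Tung Sah Institute of Micro-Nano Science and Technology, Xiamen University, Xiamen 361102, Fujian, China;

    2. School of Information Engineering, Jimei University, Xiamen 361021, Fujian, China;

    3. School of Mechanical and Automotive Engineering, Shanghai University of Engineering Science, Shanghai 201620, China;

    4. Third Institute of Oceanography, Ministry of Natural Resources, Xiamen 361005, Fujian, China;

    5. National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361102, Fujian, China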

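Wu Qinting (吴沁停, born 2001), master's student; research interests include pattern recognition and data mining.
Gao Yunlong (高云龙), associate professor and master's supervisor. E-mail: gaoyl@xmu.edu.cn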

Online publication date: 2025-02-26

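Funding

National Natural Science Foundation of China (42076058); Fujian Provincial Marine and Fisheries Special Fund (FJHYF-ZH-2023-05); Natural Science Foundation of Fujian Province (2020J01713, 2022J01061); Guangdong Basic and Applied Basic Research Foundation (2024A1515011682)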


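Abstract

With the advent of the big data era, clustering algorithms have become central to data mining and machine learning. However, the exponential growth in data size and dimensionality has driven up the time and space complexity of traditional clustering methods, limiting their practical use. The anchor point acceleration mechanism has emerged in response: it substantially reduces the computational burden and thereby improves the effectiveness of traditional clustering algorithms on large-scale datasets. This paper reviews clustering algorithms based on the anchor point acceleration mechanism and discusses techniques such as anchor point generation and similarity graph construction. It covers clustering methods that use fixed anchor points, including spectral clustering, fuzzy spectral clustering, multi-view clustering, and deep clustering. It also examines clustering strategies that use dynamic anchor points, including multi-view and incomplete multi-view clustering algorithms. Drawing on this analysis, the paper identifies current limitations and emerging challenges, and offers perspectives on future directions as a roadmap for research and practical applications in the field. The review aims to give researchers and practitioners useful guidance and to encourage continued innovation in clustering algorithms suited to contemporary data environments.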

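Cite this article

Wu Qinting1, Feng Yuzhe1, Pan Jinyan2, Zhang Haifeng3, Cao Chao4, Gao Yunlong1, 5.

A Review of Cluster Algorithms Based on Anchor Point Acceleration Mechanism[J]. Journal of Shanghai Jiao Tong University, 0 : 1. DOI: 10.16183/j.cnki.jsjtu.2024.425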
