Journal of Shanghai Jiaotong University (Natural Science Edition) ›› 2011, Vol. 45 ›› Issue (02): 149-153.

• Automation Technology, Computer Technology •

Fast Approximate Clustering Algorithm and Its Application in Image Retrieval

GU Wangyi, ZHU Lin, YANG Jie

  1. (Institute of Image Processing and Pattern Recognition; Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai Jiaotong University, Shanghai 200240, China)
  • Received: 2010-02-01 Revised: 1900-01-01 Online: 2011-02-28 Published: 2011-02-28

Abstract: To address the limitations of the traditional K-means algorithm on large-scale data, a fast approximate K-means algorithm (FAKM) is proposed. It builds on the approximate K-means algorithm (AKM) and adds the idea of classifying the cluster centers according to the clustering result. FAKM discards the cluster centers that attract only a few samples in the AKM result and makes full use of the centers whose member samples are dense and stable, so that the number of samples still to be clustered and the number of clusters both decrease from one iteration to the next. This speeds up the algorithm and yields a more compact clustering result. FAKM was applied to a practical image retrieval system, and the experimental results show that, compared with K-means and AKM, the system improves in retrieval accuracy, retrieval time, and clustering time.

Key words: fast clustering, approximate nearest neighbor, image retrieval, large-scale data
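
The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the function name `fakm_sketch`, the parameters `min_size` and `tol`, the rule that a center counts as "stable" when it moves less than `tol` between iterations, and the use of brute-force nearest-neighbor search in place of the approximate search that AKM relies on.

```python
import numpy as np

def fakm_sketch(X, k, n_iter=10, min_size=2, tol=1e-3, seed=0):
    """Illustrative K-means variant with center pruning (hypothetical sketch).

    Assumptions not taken from the paper: `min_size` defines a sparse
    center, `tol` defines a stable center, and exact nearest-neighbor
    search replaces approximate search.
    """
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    active = np.arange(len(X))  # indices of samples still being clustered
    frozen = []                 # centers accepted as final

    for _ in range(n_iter):
        if len(active) == 0 or len(centers) == 0:
            break
        pts = X[active]
        # Squared distances from every active sample to every live center.
        d2 = ((pts[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        counts = np.bincount(labels, minlength=len(centers))

        kept = []
        done = np.zeros(len(pts), dtype=bool)
        for j in range(len(centers)):
            if counts[j] < min_size:
                # Sparse center: drop it; its samples stay active and are
                # reassigned to other centers in the next iteration.
                continue
            members = labels == j
            new_center = pts[members].mean(axis=0)
            if np.linalg.norm(new_center - centers[j]) < tol:
                # Stable center: keep it as final and retire its samples.
                frozen.append(new_center)
                done |= members
            else:
                kept.append(new_center)

        centers = np.array(kept).reshape(-1, dim)
        active = active[~done]

    # Final result: frozen centers plus any centers still live at the end.
    return np.array(frozen + list(centers)).reshape(-1, dim)
```

Calling `fakm_sketch(X, k=6)` on an `(n, d)` array returns at most 6 centers. Both the set of live centers and the set of active samples can only shrink between iterations, which is the source of the speed-up the abstract describes.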
