Journal of Shanghai Jiao Tong University ›› 2021, Vol. 55 ›› Issue (5): 598-606. doi: 10.16183/j.cnki.jsjtu.2020.011

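Special Issue: Journal of Shanghai Jiao Tong University, 2021 Issue 12 special topic collection; Journal of Shanghai Jiao Tong University, 2021 special topic on "Automation Technology, Computer Technology"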

Collective Data Anomaly Detection Based on Reverse k-Nearest Neighbor Filtering

WU Jin’e1, WANG Ruoyu2, DUAN Qianqian1, LI Guoqiang1,2, JÜ Changjiang2

  1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201600, China
  2. School of Software, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2020-01-08 Online: 2021-05-28 Published: 2021-06-01
  • Contact: DUAN Qianqian E-mail: dqq1019@163.com

Abstract:

To address anomaly detection in group data without data labels, a k-nearest neighbor (kNN) algorithm is proposed to detect group data anomalies in an unsupervised manner. To reduce the false negatives and false positives caused by mutual interference between abnormal and normal values, a reverse k-nearest neighbor (RkNN) method is proposed to filter the abnormal group data in reverse. First, the RkNN algorithm uses a statistical distance as the similarity measure between different groups of data. Then, the anomaly score of each group and an initial set of anomalies are obtained with the kNN algorithm. Finally, the initial anomalies are filtered with the RkNN method. The experimental results show that the proposed algorithm not only effectively reduces false negatives and false positives, but also achieves a high anomaly detection rate and good stability.
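
The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which statistical distance, scoring rule, or filtering criterion is used, so the choices below are assumptions made only for illustration. The sketch assumes equal-sized one-dimensional groups, uses the 1-D Wasserstein-1 distance as the statistical distance, scores each group by its mean distance to its k nearest groups, and keeps a candidate as an anomaly only if at most `max_reverse` other groups list it among their own k nearest neighbors. The names `group_distance`, `detect_group_anomalies`, `n_candidates`, and `max_reverse` are hypothetical. The filter here only prunes false positives; recovering false negatives is not covered.

```python
def group_distance(a, b):
    """Assumed statistical distance: 1-D Wasserstein-1 between equal-sized samples."""
    xs, ys = sorted(a), sorted(b)
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal-sized groups")
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


def detect_group_anomalies(groups, k, n_candidates, max_reverse=0):
    """Return (scores, candidates, anomalies) as lists indexed by group position."""
    n = len(groups)

    # Step 1: pairwise statistical distance between groups.
    dist = [[group_distance(groups[i], groups[j]) for j in range(n)]
            for i in range(n)]

    # k nearest groups of each group (ties broken by index order).
    neighbors = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]

    # Step 2: kNN anomaly score (assumed: mean distance to the k nearest groups)
    # and the initial anomalies (assumed: the n_candidates highest scores).
    scores = [sum(dist[i][j] for j in neighbors[i]) / k for i in range(n)]
    candidates = sorted(range(n), key=lambda i: scores[i], reverse=True)[:n_candidates]

    # Step 3: reverse kNN count, i.e. how many groups list group i as a neighbor.
    reverse_count = [0] * n
    for i in range(n):
        for j in neighbors[i]:
            reverse_count[j] += 1

    # Assumed filter: a candidate that other groups regard as a neighbor is
    # treated as a false positive and dropped.
    anomalies = [i for i in candidates if reverse_count[i] <= max_reverse]
    return scores, candidates, anomalies


# Toy data: four evenly spaced groups and one group far from the rest.
groups = [[p, p + 1, p + 2, p + 3] for p in (0, 4, 8, 12, 100)]
scores, candidates, anomalies = detect_group_anomalies(groups, k=2, n_candidates=2)
print(scores, candidates, anomalies)
```

In this toy run, the kNN step flags the distant group and one edge group of the normal cluster as candidates. The edge group is listed as a neighbor by another normal group, so the reverse filter drops it and only the distant group remains.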

Key words: anomaly detection, unsupervised, k-nearest neighbor (kNN), reverse k-nearest neighbor (RkNN), statistical distance

CLC Number: