Journal of Shanghai Jiao Tong University ›› 2021, Vol. 55 ›› Issue (2): 131-140. doi: 10.16183/j.cnki.jsjtu.2020.082

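Special Issue: Journal of Shanghai Jiao Tong University, 2021 collected special topics (12 issues); Journal of Shanghai Jiao Tong University, 2021 special topic on "Automation Technology and Computer Technology"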


Data Splitting Method of Distance Metric Learning Based on Gaussian Mixed Model

ZHENG Dezhong1,2, YANG Yuanyuan1, XIE Zhe1,2, NI Yangfan1,2, LI Wentao3

  1. Laboratory for Medical Imaging Informatics, Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200080, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  3. Fudan University Shanghai Cancer Center, Shanghai 200032, China
  • Received: 2020-03-24   Online: 2021-02-01   Published: 2021-03-03
  • Contact: LI Wentao


To address the instability and bias of models trained repeatedly on limited samples, this paper proposes a data splitting method based on distance metric learning and the Gaussian mixture model, which divides the dataset in a more principled way. Distance metric learning uses the feature extraction capability of deep neural networks to embed the original data into a new metric space. A Gaussian mixture model is then fitted to these deep features to perform cluster analysis and estimate the sample distribution in that space. Finally, stratified sampling according to the estimated sample distribution is used to divide the data. The results show that the proposed method captures the characteristics of the data distribution better and yields a more reasonable data division, thereby improving the accuracy and generalization of the model.
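
The last two steps described above (mixture-model clustering in the embedded space, then stratified sampling by cluster) can be illustrated with a minimal sketch. It is not the authors' implementation. The deep metric-learning embedding is replaced by synthetic 2-D points, the mixture uses diagonal covariances for brevity, and the names `fit_diag_gmm` and `stratified_split`, the number of components, and the 20% test fraction are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def fit_diag_gmm(points, k, iters=50, seed=0):
    """Fit a diagonal-covariance Gaussian mixture by EM; return hard cluster labels."""
    rng = random.Random(seed)
    dim = len(points[0])
    means = [list(p) for p in rng.sample(points, k)]  # initialise means at random points
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: responsibilities from log-densities (shifted by the max for stability).
        resp = []
        for p in points:
            logs = []
            for j in range(k):
                ll = math.log(weights[j])
                for d in range(dim):
                    var = variances[j][d]
                    ll -= 0.5 * (math.log(2 * math.pi * var) + (p[d] - means[j][d]) ** 2 / var)
                logs.append(ll)
            top = max(logs)
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp.append([q / total for q in probs])
        # M-step: re-estimate mixing weights, means and per-dimension variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / len(points)
            for d in range(dim):
                mu = sum(r[j] * p[d] for r, p in zip(resp, points)) / nj
                var = sum(r[j] * (p[d] - mu) ** 2 for r, p in zip(resp, points)) / nj
                means[j][d] = mu
                variances[j][d] = max(var, 1e-6)  # floor to avoid degenerate components
    return [max(range(k), key=lambda j: r[j]) for r in resp]


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices so each cluster contributes the same fraction to the test set."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_cluster[lab].append(idx)
    train, test = [], []
    for lab in sorted(by_cluster):
        members = by_cluster[lab]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)


# Synthetic stand-in for deep features: two blobs in a 2-D "metric space".
rng = random.Random(1)
embeddings = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(30)]
embeddings += [[rng.gauss(5, 0.5), rng.gauss(5, 0.5)] for _ in range(30)]

labels = fit_diag_gmm(embeddings, k=2)
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
```

Because the split is drawn cluster by cluster, the training and test sets share approximately the same mixture proportions, which is the property the method relies on to reduce variation between splits. In practice the embeddings would come from the trained metric-learning network, and a library implementation with full covariances may be preferable to this hand-written EM loop.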

Key words: artificial intelligence training, dataset division, deep neural networks, Gaussian mixture model
