Multimodal Fusion Classification Network Based on Distance Confidence Score
Received date: 2020-06-18
Online published: 2022-01-21
Funding
Supported by the Technical Specification for Clinical Trials of Artificial Intelligence Medical Information System Software Project (2019YFC0118805)
ZHENG Dezhong, YANG Yuanyuan, HUANG Haozhe, XIE Zhe, LI Wentao. Multimodal fusion classification network based on distance confidence score[J]. Journal of Shanghai Jiao Tong University, 2022, 56(1): 89-100. DOI: 10.16183/j.cnki.jsjtu.2020.186
Multimodal data modeling can effectively overcome the insufficiency of information in a single modality and can greatly improve model performance. However, little progress has been made in quantifying the confidence of neural network models, especially multimodal fusion models. This paper proposes an embedding-based method that estimates the local density around a sample in the embedding space from its distances to other samples, and derives a confidence score for the model from that estimate. The proposed method is scalable: it can measure the confidence not only of a single-modality model but also of a multimodal fusion model. In addition, it can evaluate and quantify the influence of each modality's data on the multimodal fusion model.
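The core idea of the abstract — turning distances in the embedding space into a local density estimate and then into a confidence score — can be sketched as follows. This is an illustrative sketch only: the function name, the Gaussian-kernel density, and the `k` and `temperature` parameters are assumptions for exposition, not the paper's exact formulation.

```python
import numpy as np

def distance_confidence(query, ref_embeddings, ref_labels, k=5, temperature=1.0):
    """Predict a class and a confidence score for one embedded sample.

    For each class, estimate the local density around `query` with a
    Gaussian kernel over its k nearest same-class reference embeddings;
    normalized densities play the role of class confidences.
    """
    classes = np.unique(ref_labels)
    densities = []
    for c in classes:
        emb_c = ref_embeddings[ref_labels == c]
        d = np.linalg.norm(emb_c - query, axis=1)      # distances to class c
        knn = np.sort(d)[:k]                           # k nearest neighbours
        densities.append(np.exp(-knn ** 2 / temperature).mean())
    densities = np.array(densities)
    probs = densities / densities.sum()                # normalize to scores
    return classes[np.argmax(probs)], probs.max()
```

A query lying inside a dense same-class cluster receives a confidence near 1, while a query far from all reference embeddings yields a flat, low-confidence score distribution.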
Key words: confidence; multimodal fusion; neural network
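The abstract also claims the score extends to multimodal fusion and to quantifying each modality's influence. A minimal way to realize that claim is confidence-weighted late fusion, sketched below under assumed names; the normalized weights double as a per-modality influence measure. This is a generic sketch, not the paper's specific fusion network.

```python
import numpy as np

def fuse_by_confidence(probs_per_modality, conf_per_modality):
    """Late-fuse per-modality class probabilities, weighted by confidence.

    Returns the fused class distribution and the normalized weights,
    which quantify each modality's influence on the fused prediction.
    """
    w = np.asarray(conf_per_modality, dtype=float)
    w = w / w.sum()                                    # influence of each modality
    fused = sum(wi * np.asarray(p) for wi, p in zip(w, probs_per_modality))
    return fused / fused.sum(), w
```

For example, a high-confidence imaging branch outweighs a low-confidence clinical-text branch, and the returned weights make that trade-off explicit.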