Unbalanced Learning of Generative Adversarial Network Based on Latent Posterior
Received date: 2019-09-16
Online published: 2021-06-01
Funding
National Key R&D Program of China (2020YFC1512203); Foundation of the National Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System (CEMEE) (2019K0302A); Civil Aircraft Project (MJ-2017-S-38); National Natural Science Foundation of China (61673265)
To address the inability of existing oversampling methods for imbalanced classification to fully exploit the probability density distribution of the data, an oversampling algorithm based on a latent-posterior generative adversarial network (LGOS) is proposed. The method uses a variational auto-encoder to obtain the approximate posterior distribution of the latent variable, so that the generator can effectively estimate the true data distribution; sampling in the latent space removes the randomness of the GAN sampling process, and marginal-distribution and conditional-distribution adaptation losses are introduced to improve the quality of the generated data. In addition, the generated samples are treated as source-domain samples in a transfer-learning framework, and an improved instance-based transfer learning algorithm (TrWSBoost) is proposed: a weight scaling factor is introduced to keep the weights of source-domain samples from converging too quickly and being insufficiently learned. Experimental results show that the proposed method clearly outperforms existing methods on common classification metrics.
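The core oversampling idea described above (encode minority samples with a VAE, draw latent codes from the approximate posterior, then decode them into synthetic samples) can be sketched as follows. This is a minimal illustration, not the authors' released code: the linear `encode`/`decode` stand-ins, the fixed posterior variance, and the function name `lgos_oversample` are all assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a trained VAE (illustrative only): encode() returns the
# approximate posterior parameters (mu, log_var) of the latent variable;
# decode() maps latent codes back to feature space.
W_enc = rng.normal(size=(4, 2))       # feature dim 4 -> latent dim 2
W_dec = np.linalg.pinv(W_enc)         # latent dim 2 -> feature dim 4

def encode(x):
    mu = x @ W_enc
    log_var = np.full_like(mu, -2.0)  # fixed variance, for the sketch only
    return mu, log_var

def decode(z):
    return z @ W_dec

def lgos_oversample(x_minority, n_new):
    """Draw z ~ N(mu, sigma^2) around encoded minority samples and decode,
    so new samples follow the learned minority distribution rather than
    being random GAN draws."""
    mu, log_var = encode(x_minority)
    idx = rng.integers(0, len(x_minority), size=n_new)
    eps = rng.normal(size=(n_new, mu.shape[1]))
    z = mu[idx] + np.exp(0.5 * log_var[idx]) * eps  # reparameterized sample
    return decode(z)

x_min = rng.normal(size=(10, 4))      # 10 minority samples, 4 features
synthetic = lgos_oversample(x_min, n_new=50)
print(synthetic.shape)                # (50, 4)
```

Sampling in latent space rather than feature space is what lets the method respect the estimated density instead of interpolating blindly between neighbors, as SMOTE-style methods do.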
HE Xinlin, QI Zongfeng, LI Jianxun. Unbalanced learning of generative adversarial network based on latent posterior[J]. Journal of Shanghai Jiao Tong University, 2021, 55(5): 557-565. DOI: 10.16183/j.cnki.jsjtu.2019.264
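The transfer-learning half of the method builds on TrAdaBoost-style boosting, where source-domain sample weights decay multiplicatively each round. A sketch of one weight-update round with a scaling factor, in the spirit of TrWSBoost as described in the abstract, follows; the function name and the `scale` parameter are hypothetical, and the base factor is the classic TrAdaBoost formula.

```python
import numpy as np

def scaled_source_weight_update(w_src, errors_src, n_rounds_total, scale=0.5):
    """One round of a TrAdaBoost-style source-weight update with a weight
    scaling factor: the factor damps the exponential decay so source-domain
    samples are not down-weighted (and effectively discarded) too quickly.
    `scale` in (0, 1] is an assumed hyperparameter for this sketch."""
    # Classic TrAdaBoost source factor: beta = 1 / (1 + sqrt(2 ln n / N))
    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(len(w_src)) / n_rounds_total))
    # Plain TrAdaBoost multiplies by beta ** |error|; scaling the exponent
    # toward 0 slows the decay of source weights.
    return w_src * beta ** (scale * errors_src)

w = np.full(100, 1.0 / 100)                  # uniform source weights
errs = np.random.default_rng(1).random(100)  # per-sample errors in [0, 1)
w_fast = scaled_source_weight_update(w, errs, n_rounds_total=20, scale=1.0)
w_slow = scaled_source_weight_update(w, errs, n_rounds_total=20, scale=0.3)
# The scaled update retains more total source weight than the unscaled one.
print(w_slow.sum() > w_fast.sum())           # True
```

Since beta < 1, a smaller exponent keeps each source weight closer to its previous value, which is exactly the "slower convergence, fuller learning" behavior the abstract attributes to the weight scaling factor.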
[1] FOTOUHI S, ASADI S, KATTAN M W. A comprehensive data level analysis for cancer diagnosis on imbalanced data[J]. Journal of Biomedical Informatics, 2019, 90: 103089.
[2] NAMVAR A, SIAMI M, RABHI F, et al. Credit risk prediction in an imbalanced social lending environment[J]. International Journal of Computational Intelligence Systems, 2018, 11(1): 925-935.
[3] SOLEYMANI R, GRANGER E, FUMERA G. Progressive boosting for class imbalance and its application to face re-identification[J]. Expert Systems With Applications, 2018, 101: 271-291.
[4] LEE T, LEE K B, KIM C O. Performance of machine learning algorithms for class-imbalanced process fault detection problems[J]. IEEE Transactions on Semiconductor Manufacturing, 2016, 29(4): 436-445.
[5] CHAWLA N V, BOWYER K W, HALL L O, et al. SMOTE: Synthetic minority over-sampling technique[J]. Journal of Artificial Intelligence Research, 2002, 16: 321-357.
[6] HAN H, WANG W Y, MAO B H. Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning[C]//International Conference on Intelligent Computing. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005: 878-887.
[7] HE H B, BAI Y, GARCIA E A, et al. ADASYN: Adaptive synthetic sampling approach for imbalanced learning[C]//2008 IEEE International Joint Conference on Neural Networks. Piscataway, NJ, USA: IEEE, 2008: 1322-1328.
[8] BARUA S, ISLAM M M, YAO X, et al. MWMOTE: Majority weighted minority oversampling technique for imbalanced data set learning[J]. IEEE Transactions on Knowledge and Data Engineering, 2014, 26(2): 405-425.
[9] DOUZAS G, BACAO F. Effective data generation for imbalanced learning using conditional generative adversarial networks[J]. Expert Systems With Applications, 2018, 91: 464-471.
[10] HE H B, GARCIA E A. Learning from imbalanced data[J]. IEEE Transactions on Knowledge and Data Engineering, 2009, 21(9): 1263-1284.
[11] SUN Y M, KAMEL M S, WONG A K C, et al. Cost-sensitive boosting for classification of imbalanced data[J]. Pattern Recognition, 2007, 40(12): 3358-3378.
[12] CHAWLA N V, LAZAREVIC A, HALL L O, et al. SMOTEBoost: Improving prediction of the minority class in boosting[C]//European Conference on Principles of Data Mining and Knowledge Discovery. Berlin, Heidelberg: Springer Berlin Heidelberg, 2003: 107-119.
[13] CHEN S, HE H B, GARCIA E A. RAMOBoost: Ranked minority oversampling in boosting[J]. IEEE Transactions on Neural Networks, 2010, 21(10): 1624-1642.
[14] ZHU J Y, PARK T, ISOLA P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]//2017 IEEE International Conference on Computer Vision. Piscataway, NJ, USA: IEEE, 2017: 2242-2251.
[15] ZHANG H, XU T, LI H S, et al. StackGAN++: Realistic image synthesis with stacked generative adversarial networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(8): 1947-1962.
[16] GOODFELLOW I J, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]//NIPS'14: Proceedings of the 27th International Conference on Neural Information Processing Systems, Volume 2. Cambridge, MA, USA: MIT Press, 2014: 2672-2680.
[17] PAN S J, TSANG I W, KWOK J T, et al. Domain adaptation via transfer component analysis[J]. IEEE Transactions on Neural Networks, 2011, 22(2): 199-210.
[18] LONG M S, WANG J M, DING G G, et al. Transfer feature learning with joint distribution adaptation[C]//2013 IEEE International Conference on Computer Vision. Piscataway, NJ, USA: IEEE, 2013: 2200-2207.
[19] LONG M S, ZHU H, WANG J M, et al. Deep transfer learning with joint adaptation networks[C]//ICML'17: Proceedings of the 34th International Conference on Machine Learning, Volume 70. New York, NY, USA: ACM, 2017: 2208-2217.
[20] DAI W Y, YANG Q, XUE G R, et al. Boosting for transfer learning[C]//Proceedings of the 24th International Conference on Machine Learning (ICML '07). New York: ACM Press, 2007: 193-200.
[21] WANG Shengtao. Research on transfer-sampling based method for class-imbalance learning[D]. Nanjing: Southeast University, 2017. (in Chinese)
[22] YAO Susu, WANG Baoliang, HOU Yonghong. Ensemble transfer learning algorithm for absolute imbalanced data classification[J]. Journal of Frontiers of Computer Science and Technology, 2018, 12(7): 1145-1153. (in Chinese)