Computing & Computer Technologies

Boosting Unsupervised Domain Adaptation with Soft Pseudo-Label and Curriculum Learning

  • (School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Key Laboratory of Digital Media Processing and Transmission, Shanghai 200240, China)

Accepted date: 2021-07-23

Online published: 2023-12-04

Abstract

By leveraging data from a fully labeled source domain, unsupervised domain adaptation (UDA) improves classification performance on an unlabeled target domain through explicit minimization of data distribution discrepancy or through adversarial learning. As an enhancement, category alignment is involved during adaptation to reinforce target feature discrimination by utilizing model predictions. However, two problems remain largely unexplored: pseudo-label inaccuracy incurred by wrong category predictions on the target domain, and distribution deviation caused by overfitting on the source domain. In this paper, we propose a model-agnostic two-stage learning framework, which greatly reduces flawed model predictions through a soft pseudo-label strategy and avoids overfitting on the source domain with a curriculum learning strategy. Theoretically, it successfully decreases the combined risk in the upper bound of the expected error on the target domain. In the first stage, we train a model with a distribution alignment-based UDA method to obtain soft semantic labels on the target domain with relatively high confidence. To avoid overfitting on the source domain, in the second stage we propose a curriculum learning strategy that adaptively controls the weighting between the losses from the two domains, so that the focus of training gradually shifts from the source distribution to the target distribution as prediction confidence on the target domain is boosted. Extensive experiments on two well-known benchmark datasets validate the universal effectiveness of our proposed framework in promoting the performance of top-ranked UDA algorithms and demonstrate its consistently superior performance.
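To make the two-stage recipe more concrete, the sketch below, written in PyTorch-style Python, illustrates one plausible reading of the second stage: a frozen stage-1 model supplies soft pseudo-labels for target samples, and a curriculum weight gradually shifts the training objective from the supervised source loss to the target soft-label loss. Every name and constant in it (stage1_model, curriculum_weight, stage2_loss, the temperature, the schedule) is an illustrative assumption rather than the paper's exact formulation. As background, the "combined risk" mentioned above refers to the λ term in the classical target-error bound of Ben-David et al., ε_T(h) ≤ ε_S(h) + ½ d_HΔH(D_S, D_T) + λ, i.e., the error of the ideal joint hypothesis on both domains.

import math

import torch
import torch.nn.functional as F


def soft_pseudo_labels(stage1_model, target_x, temperature=2.0):
    """Soft semantic labels for target samples from the frozen stage-1 model."""
    with torch.no_grad():
        logits = stage1_model(target_x)
    return F.softmax(logits / temperature, dim=1)


def curriculum_weight(step, total_steps):
    """Weight in [0, 1) that grows with training progress; one simple schedule."""
    p = step / max(total_steps, 1)
    return 2.0 / (1.0 + math.exp(-10.0 * p)) - 1.0


def stage2_loss(model, source_x, source_y, target_x, target_soft_y, lam):
    """Curriculum-weighted sum of the source and target losses."""
    # Supervised cross-entropy on the labeled source batch.
    source_loss = F.cross_entropy(model(source_x), source_y)
    # Soft-label cross-entropy on the unlabeled target batch.
    target_log_prob = F.log_softmax(model(target_x), dim=1)
    target_loss = -(target_soft_y * target_log_prob).sum(dim=1).mean()
    # As lam goes from 0 to 1, the focus shifts from source to target.
    return (1.0 - lam) * source_loss + lam * target_loss

In a full training loop, lam = curriculum_weight(step, total_steps) would be recomputed each iteration before backpropagating stage2_loss. The sigmoid-like schedule shown here is the one popularized for gradient-reversal layers and is only one possible choice for the adaptive weighting the abstract describes.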

Cite this article

ZHANG Shengjia (张晟嘉), LIN Tiancheng (林天成), XU Yi (徐奕). Boosting Unsupervised Domain Adaptation with Soft Pseudo-Label and Curriculum Learning [J]. Journal of Shanghai Jiaotong University (Science), 2023, 28(6): 703-716. DOI: 10.1007/s12204-022-2487-5
