Journal of Shanghai Jiao Tong University ›› 2021, Vol. 55 ›› Issue (9): 1158-1168. doi: 10.16183/j.cnki.jsjtu.2019.307



A Domain Adaptive Semantic Segmentation Network Based on an Improved Transformation Network

ZHANG Junning1, SU Qunxing2, WANG Cheng3, XU Chao4, LI Yining5

  1. Army Engineering University, Shijiazhuang 050003, China
    2. Army Command College, Nanjing 210000, China
    3. Joint Operations Academy, National Defense University, Shijiazhuang 050003, China
    4. 32181 Troops, Xi'an 710032, China
    5. Research Institute of Chemical Defense, Academy of Military Sciences, Beijing 102205, China
  • Received: 2019-10-23   Online: 2021-09-28   Published: 2021-10-08
  • Corresponding author: SU Qunxing, E-mail: LPZ20101796@qq.com
  • About the author: ZHANG Junning (1992-), male, born in Bazhong, Sichuan Province, Ph.D. candidate; research interests include deep learning, SLAM, computer vision, and pattern recognition
  • Funding: National Natural Science Foundation of China (51205405, 51305454)

Abstract:

Manual annotation of semantic labels is costly and time-consuming, so unsupervised semantic segmentation based on domain adaptation is highly necessary. Scenes or pixels with a large domain gap tend to restrict model training and reduce semantic segmentation accuracy. To address this problem, a domain adaptive semantic segmentation network (DA-SSN) based on an improved transformation network is proposed, which eliminates the interference of large-gap images and pixels through staged training and an interpretable mask. First, to handle source images whose domain gap to the target images is large, which makes the network model difficult to train, a training-loss threshold is used to separate the large-gap source images from the dataset, and a staged training strategy for the transformation network is proposed; while preserving the semantic alignment of small-gap source images, it improves the transformation quality of large-gap source images. Then, to further narrow the gap between some pixels in the source images and the target domain, an interpretable mask is proposed: for each pixel, the confidence that the gap between the source and target domains can be narrowed is predicted, and the training loss of low-confidence pixels is ignored, which eliminates the influence of large-gap pixels on the semantic alignment of other pixels and makes model training focus only on the domain gap of high-confidence pixels. The results show that the proposed algorithm achieves higher segmentation accuracy than the original domain adaptive semantic segmentation network. Compared with other popular algorithms, the proposed method obtains higher-quality semantic alignment, which demonstrates its advantage in accuracy.
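As a rough illustration of the two mechanisms summarized in the abstract, the PyTorch sketch below shows (1) splitting the source images into small-gap and large-gap subsets with a training-loss threshold, as the staged training strategy requires, and (2) a per-pixel segmentation loss masked by the predicted gap-reduction confidence (the interpretable mask). The function names, the threshold tau, and the confidence input are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def split_by_loss_threshold(per_image_losses, tau):
    """Return indices of small-gap and large-gap source images.

    per_image_losses: 1-D tensor, one transformation-training loss per source image.
    tau: loss threshold separating small-gap from large-gap images (assumed hyperparameter).
    """
    small_gap = torch.nonzero(per_image_losses <= tau, as_tuple=True)[0]
    large_gap = torch.nonzero(per_image_losses > tau, as_tuple=True)[0]
    return small_gap, large_gap

def confidence_masked_seg_loss(logits, labels, confidence, conf_threshold=0.5):
    """Segmentation loss that ignores pixels with low gap-reduction confidence.

    logits:     (N, C, H, W) predictions on transformed source images.
    labels:     (N, H, W) source-image ground-truth class indices.
    confidence: (N, H, W) predicted confidence that each pixel's source-to-target
                domain gap can be narrowed (the "interpretable mask").
    """
    per_pixel = F.cross_entropy(logits, labels, reduction="none")  # shape (N, H, W)
    keep = (confidence >= conf_threshold).float()
    # Only high-confidence pixels contribute to the semantic-alignment loss.
    return (per_pixel * keep).sum() / keep.sum().clamp(min=1.0)

# Example: with tau = 1.0, images 1 and 3 are held out for the later training stage.
losses = torch.tensor([0.4, 1.7, 0.8, 2.3])
small, large = split_by_loss_threshold(losses, tau=1.0)  # small -> [0, 2], large -> [1, 3]

How the two training stages are scheduled and how the confidence is predicted follow the paper's own design and are only hinted at here.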

Key words: computer vision, domain adaptive semantic segmentation, domain gap, semantic information integration, interpretable mask, staged training

CLC number: