J Shanghai Jiaotong Univ Sci ›› 2026, Vol. 31 ›› Issue (2): 348-358. doi: 10.1007/s12204-024-2721-4

• Automation & Computer Technologies •

High Resolution Remote Sensing Image Segmentation Method with Improved DeepLabv3+


TAO Hongjie1,2, LI Zhaofei1,2, QI Fei3, CHEN Jingjue3, ZHOU Hao1,2

  1. School of Automation and Information Engineering, Sichuan University of Science and Engineering, Yibin 644000, Sichuan, China; 2. Artificial Intelligence Key Laboratory of Sichuan Province, Sichuan University of Science and Engineering, Yibin 644000, Sichuan, China; 3. Chengdu Tianxun Microsatellite Technology Co., Ltd., Chengdu 610200, China
  • Received: 2023-06-29  Accepted: 2023-08-27  Online: 2026-04-01  Published: 2024-04-22

Abstract: Classical semantic segmentation networks perform poorly on high-resolution remote sensing images: segmentation quality degrades in complex scenes, and their large parameter counts make training costly. To address these problems, this study proposes an efficient segmentation method for high-resolution remote sensing images based on an improved DeepLabv3+, designed around three goals: fewer network parameters, lower computation volume, and better performance. First, the computationally heavy Xception backbone of the original DeepLabv3+ is replaced with the lighter MobileNetV2 network for feature extraction, which reduces the number of network parameters while maintaining effective feature extraction. Second, a lightweight convolutional block attention module (CBAM) is added after the deep features produced by the feature extraction module, strengthening the network's feature extraction capability while further reducing the number of parameters. Last, coordinate attention is introduced after the shallow features obtained from the feature extraction module, so that the network attends to informative features in the image and disregards irrelevant background information. Experimental results demonstrate the effectiveness of the proposed method: in the segmentation task on the high-resolution image dataset, it achieves a mean intersection over union (mIoU) of 75.33%, surpassing mainstream semantic segmentation networks such as SegNet, PSPNet, and U-Net by 12.49%, 3.16%, and 1.62%, respectively. Furthermore, the model remains compact, with only 6.02 × 10⁶ parameters and a computation volume of 26.45 GFLOPs. This balance between computational efficiency and segmentation accuracy makes the model highly valuable for edge computing applications.
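The CBAM step summarized above can be sketched in a few lines. The following is a minimal NumPy illustration of the module's two stages as described in the original CBAM design: channel attention computed from average- and max-pooled descriptors passed through a shared MLP, followed by spatial attention from a 7×7 convolution over stacked channel-wise average and max maps. The weights here are randomly initialized purely for demonstration (in the paper's network they are learned end to end), and the function name `cbam`, the reduction ratio `r`, and the toy feature-map size are illustrative choices, not details from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cbam(x, r=4, rng=None):
    """Apply CBAM-style channel then spatial attention to x of shape (C, H, W).

    Weights are randomly initialized for illustration only; in a real
    network they would be trainable parameters.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    c, h, w = x.shape

    # Channel attention: shared MLP over avg- and max-pooled channel descriptors.
    w1 = rng.standard_normal((c // r, c)) * 0.1   # squeeze (reduction ratio r)
    w2 = rng.standard_normal((c, c // r)) * 0.1   # excite back to C channels
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)  # shared weights, ReLU
    avg_c = x.mean(axis=(1, 2))                   # (C,)
    max_c = x.max(axis=(1, 2))                    # (C,)
    m_c = sigmoid(mlp(avg_c) + mlp(max_c))        # (C,) channel weights in (0, 1)
    x = x * m_c[:, None, None]

    # Spatial attention: 7x7 conv over stacked channel-wise avg and max maps.
    avg_s = x.mean(axis=0)                        # (H, W)
    max_s = x.max(axis=0)                         # (H, W)
    stacked = np.stack([avg_s, max_s])            # (2, H, W)
    k = rng.standard_normal((2, 7, 7)) * 0.1      # conv kernel (illustrative)
    pad = np.pad(stacked, ((0, 0), (3, 3), (3, 3)))
    m_s = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            m_s[i, j] = np.sum(pad[:, i:i + 7, j:j + 7] * k)
    m_s = sigmoid(m_s)                            # (H, W) spatial weights in (0, 1)
    return x * m_s[None, :, :]

feat = np.random.default_rng(1).standard_normal((8, 16, 16))
out = cbam(feat)
print(out.shape)  # same shape as the input feature map: (8, 16, 16)
```

Because both attention maps lie in (0, 1), the module only rescales the feature map (suppressing uninformative channels and locations) and never changes its shape, which is what lets it be dropped after the backbone's deep features without altering the rest of the DeepLabv3+ decoder.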

Key words: remote sensing image, DeepLabv3+, MobileNetV2, attention mechanism, semantic segmentation

