Automation & Computer Technologies

High Resolution Remote Sensing Image Segmentation Method with Improved DeepLabv3+

  • 1. School of Automation and Information Engineering, Sichuan University of Science and Engineering, Yibin 643002, Sichuan, China; 2. Artificial Intelligence Key Laboratory of Sichuan Province, Sichuan University of Science and Engineering, Yibin 643002, Sichuan, China; 3. Chengdu Tianxun Microsatellite Technology Co., Ltd., Chengdu 610200, China

Received date: 2023-06-29

Accepted date: 2023-08-27

Online published: 2024-04-22

Abstract

Classical semantic segmentation networks produce poor results on high-resolution remote sensing images, perform weakly in complex scenes, and carry large parameter counts and high training costs. To address these problems, this study proposes an efficient segmentation method for high-resolution remote sensing images based on an improved DeepLabv3+. The method targets three goals: fewer network parameters, less computation, and better performance. First, the computationally heavy Xception backbone of the original DeepLabv3+ is replaced with the lighter MobileNetV2 network for feature extraction, which reduces the number of network parameters while maintaining effective feature extraction. Second, a lightweight convolutional block attention module (CBAM) is added after the feature extraction module to enhance the network’s feature extraction capability. The inclusion of CBAM further reduces the number of network parameters. Last, coordinate attention is applied to the shallow features obtained from the feature extraction module, so that the network focuses on relevant features in the image and disregards irrelevant background information. In the segmentation task on the high-resolution image dataset, the method achieves a mean intersection over union (mIoU) of 75.33%, surpassing the mainstream semantic segmentation networks SegNet, PSPNet, and U-Net by 12.49%, 3.16%, and 1.62%, respectively. The proposed model also has a relatively small number of network parameters, 6.02 × 10^6, and a computation volume of 26.45 GFLOPs. This balance between computational efficiency and segmentation accuracy makes the model valuable for edge computing applications.
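The mIoU figure reported above averages, over all classes, the ratio of correctly labelled pixels to the union of predicted and true pixels for that class. The following is a minimal sketch of that metric and not the authors' evaluation code; the function names and the toy labels are hypothetical, and classes absent from both prediction and ground truth are skipped by assumption.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Mean over classes of TP / (TP + FP + FN), skipping empty classes."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:                               # assumption: ignore absent classes
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with flattened per-pixel labels for three hypothetical classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(f"mIoU = {mean_iou(cm):.4f}")
```

In practice the confusion matrix would be accumulated over every pixel of every test image before the per-class ratios are taken.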
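One reason a MobileNetV2 backbone is lighter is that its inverted residual blocks build on depthwise separable convolutions, which factor a standard convolution into a per-channel spatial filter followed by a 1 × 1 pointwise convolution. The sketch below only compares weight counts for a single layer; the layer sizes are hypothetical, biases and batch normalization are ignored, and it does not reproduce the 6.02 × 10^6 total reported in the abstract.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias omitted)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise kernel x kernel conv plus a 1 x 1 pointwise conv."""
    depthwise = kernel * kernel * c_in   # one spatial filter per input channel
    pointwise = c_in * c_out             # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 32 input channels, 64 output channels.
std = standard_conv_params(3, 32, 64)
sep = depthwise_separable_params(3, 32, 64)
print(f"standard: {std}, depthwise separable: {sep}, ratio: {std / sep:.2f}")
```

For this illustrative layer the separable form needs roughly one eighth of the weights, which is the kind of saving that accumulates across a backbone.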

Cite this article

Tao Hongjie, Li Zhaofei, Qi Fei, Chen Jingjue, Zhou Hao. High Resolution Remote Sensing Image Segmentation Method with Improved DeepLabv3+[J]. Journal of Shanghai Jiaotong University (Science), 2026, 31(2): 348-358. DOI: 10.1007/s12204-024-2721-4
