Road Recognition Method of Photovoltaic Plant Based on Improved DeepLabv3+

doi:10.16183/j.cnki.jsjtu.2022.224

Journal of Shanghai Jiao Tong University, 2024, Vol. 58, Issue (5): 776-782.

• New Type Power System and Integrated Energy •

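LI Cuiming, WANG Hua, XU Longer, WANG Long

School of Mechanical and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050, China

Received: 2022-06-17  Revised: 2022-07-30  Accepted: 2022-10-17  Online: 2024-05-28  Published: 2024-06-17
About the author: LI Cuiming (1976-), associate professor, whose research focuses on scene understanding and navigation for mobile robots; E-mail: li_goddess@163.com.
Funding:
Natural Science Foundation of Gansu Province (18JR3RA139); National Natural Science Foundation of China (51765031)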

Abstract

To address the need for mobile cleaning robots to recognize roads accurately and quickly when operating in photovoltaic plants, a target recognition model based on an improved DeepLabv3+ is proposed for recognizing roads in photovoltaic plants. First, the backbone network of the original DeepLabv3+ model is replaced with an optimized MobileNetv2 network to reduce model complexity. Second, the atrous spatial pyramid pooling (ASPP) structure is improved by combining the fusion of different receptive fields with atrous depthwise separable convolution, which raises the information utilization of ASPP and the training efficiency of the model. Finally, an attention mechanism is introduced to improve the recognition accuracy of the model. The results show that the improved model achieves a mean pixel accuracy (MPA) of 98.06% and a mean intersection over union (MIoU) of 95.92%, which are 1.79 and 2.44 percentage points higher, respectively, than those of the baseline DeepLabv3+ model, and also higher than those of the SegNet and UNet models. In addition, the improved model has fewer parameters and good real-time performance, so it is better suited to road recognition for mobile cleaning robots in photovoltaic plants.

Key words: photovoltaic plants, road recognition, DeepLabv3+ model, attention mechanism, MobileNetv2
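The complexity reduction that the abstract attributes to depthwise separable convolution can be illustrated with a simple weight count. The following is a minimal sketch, not the authors' implementation: the function names and the channel sizes (320 in, 256 out) are assumptions chosen only for illustration. It compares a standard 3×3 convolution with its depthwise separable counterpart and shows how an atrous rate r enlarges the span of a k×k kernel to k + (k − 1)(r − 1) without adding weights.

```python
def standard_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k: int, rate: int) -> int:
    """Span covered by a k x k kernel with atrous (dilation) rate `rate`."""
    return k + (k - 1) * (rate - 1)


if __name__ == "__main__":
    c_in, c_out = 320, 256  # hypothetical channel sizes, not from the paper
    std = standard_conv_params(c_in, c_out)
    sep = depthwise_separable_params(c_in, c_out)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
    for rate in (1, 2, 4):
        print(f"rate {rate}: effective kernel size {effective_kernel_size(3, rate)}")
```

Under these assumed sizes the separable layer needs roughly one ninth of the weights, and the atrous rate widens the receptive field at no extra parameter cost.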

CLC number: TP242.6

LI Cuiming, WANG Hua, XU Longer, WANG Long. Road Recognition Method of Photovoltaic Plant Based on Improved DeepLabv3+[J]. Journal of Shanghai Jiao Tong University, 2024, 58(5): 776-782.

Figures and tables (9): Fig. 1, Fig. 2, Table 1, Fig. 3, Fig. 4, Fig. 5, Fig. 6, Fig. 7, Table 2

Table 1
Input	Layer	Output stride	t	c	n	s	r
224×224×3	conv2d	2	-	32	1	2	1
112×112×32	bottleneck	2	1	16	1	1	1
112×112×16	bottleneck	4	6	24	2	2	1
56×56×24	bottleneck	8	6	32	3	2	1
28×28×32	bottleneck	16	6	64	4	2	1
28×28×64	bottleneck	16	6	96	3	1	1
14×14×96	bottleneck	16	6	160	3	1	2
7×7×160	bottleneck	16	6	320	1	1	4
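The output stride column of the table above is the running product of the per-stage strides s. The sketch below reproduces that column; it is an illustration under stated assumptions, not the authors' code. The columns t, c, n, s follow the MobileNetV2 notation (expansion factor, output channels, number of repeats, stride of the first block in a stage), r is assumed here to be the atrous rate, and the names `TABLE_STRIDES`, `ORIGINAL_STRIDES`, and `output_strides` are hypothetical. The second stride list is that of the original MobileNetV2 classification network, included only for comparison.

```python
from itertools import accumulate
from operator import mul

# Per-stage stride s, read row by row from the table above.
TABLE_STRIDES = [2, 1, 2, 2, 2, 1, 1, 1]

# Strides of the original MobileNetV2 classifier (Sandler et al.),
# where the 160-channel stage still downsamples by 2.
ORIGINAL_STRIDES = [2, 1, 2, 2, 2, 1, 2, 1]


def output_strides(strides):
    """Cumulative downsampling factor after each stage."""
    return list(accumulate(strides, mul))


if __name__ == "__main__":
    print("table   :", output_strides(TABLE_STRIDES))
    print("original:", output_strides(ORIGINAL_STRIDES))
```

With the table's strides the backbone stops downsampling at an output stride of 16, whereas the original stride list reaches 32. Setting s = 1 in the last stages and raising r instead is the usual way to keep resolution while preserving receptive field.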

Table 2
Model	MPA/%	MIoU/%	Inference time per image/ms	Total parameters/10^6
SegNet	93.84	91.42	121	14.86
UNet	94.73	92.05	125	17.30
Original DeepLabv3+	96.27	93.48	156	41.25
Improved DeepLabv3+	98.06	95.92	112	2.28
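The MPA and MIoU columns are standard semantic segmentation metrics. The sketch below shows how they are commonly computed from a confusion matrix; it assumes the usual definitions (per-class pixel accuracy and per-class intersection over union, each averaged over classes), since the page does not reproduce the paper's own formulas. The function names and the toy labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[i][j] counts pixels of true class i predicted as class j."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm):
    """Average over classes of (correct pixels / true pixels of that class)."""
    accs = [row[i] / sum(row) for i, row in enumerate(cm) if sum(row)]
    return sum(accs) / len(accs)


def mean_iou(cm):
    """Average over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for i in range(n):
        tp = cm[i][i]
        true_total = sum(cm[i])                       # TP + FN
        pred_total = sum(cm[j][i] for j in range(n))  # TP + FP
        union = true_total + pred_total - tp
        if union:
            ious.append(tp / union)
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Toy flattened label maps: 0 = background, 1 = road (made-up data).
    y_true = [0, 0, 0, 0, 1, 1, 1, 1]
    y_pred = [0, 0, 0, 1, 1, 1, 1, 1]
    cm = confusion_matrix(y_true, y_pred, num_classes=2)
    print(f"MPA = {mean_pixel_accuracy(cm):.3f}, MIoU = {mean_iou(cm):.3f}")
```

In the toy example a single background pixel is mislabeled as road, which lowers the background accuracy and the IoU of both classes.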
