Automation & Computer Technologies

YOLO-SDD: An Improved YOLOv5 for Storm Drain Detection in Street-Level View

  • 1. College of Information Technology, Shanghai Ocean University, Shanghai 201306, China; 2. Shanghai SenseTime Intelligent Technology Co., Ltd., Shanghai 200030, China

Received date: 2023-10-13

Accepted date: 2024-01-25

Online published: 2024-07-04

Abstract

The urban drainage pipe system is an important part of city management. Automated detection of the status of storm drains in street-level images using current computer vision and AI technologies is an important aspect of smart city construction. In this paper, a framework based on YOLOv5s for storm drain detection (YOLO-SDD) in street view is proposed. By analyzing the characteristics of small-scale targets, YOLO-SDD focuses on optimizing the Backbone network and its loss function. A series of experiments demonstrated that, in the task of detecting different states of storm drains under various environmental conditions, the mean average precision (mAP@.5) of YOLO-SDD can reach 89.6%, an increase of 2% over the baseline model YOLOv5s. In the presence and absence of occlusion, the average precision of storm drain detection increased by 0.9% and 3.1%, respectively. In addition, the effectiveness and generalization ability of YOLO-SDD were further validated using the storm drain dataset of Urbana-Champaign (SDUC) from Illinois, USA, and the dataset for object detection in aerial images (DOTA). Finally, YOLO-SDD was deployed on the Android system, which verifies its ability to detect storm drains in different states in street scenes in real time.

Cite this article

Wang Jing, Fang Zhiqiang, Li Qianqian, Tang Zhiwei, Huang Zhangyang, Hong Zhonghua, He Haiyang. YOLO-SDD: An Improved YOLOv5 for Storm Drain Detection in Street-Level View[J]. Journal of Shanghai Jiaotong University (Science), 2026, 31(2): 359-374. DOI: 10.1007/s12204-024-2749-5

