Detection of Roadside Vehicle Parking Violations Under Random Horizontal Camera Condition

doi:10.16183/j.cnki.jsjtu.2023.578

Abstract

Investigating and penalizing parking violations is an important part of urban traffic management. Because manual enforcement is time-consuming and labor-intensive and fixed-camera monitoring covers only a limited area, more flexible and efficient automatic detection methods are of practical value. This paper proposes a cruise detection technique for mobile road carriers that requires no stopping and completes detection in a single pass. First, a vehicle parking violation image dataset, XMUT-VPI, is collected and constructed under approximately horizontal views and random shooting angles, providing the data foundation for the research. Then, a multitask parking network (MTPN) is built as an encoder to extract the key elements required for judging parking violations. With the self-designed deformable large kernel feature aggregation module (DLKA-C2f) and cross-task interaction attention mechanism (CTIAM), it achieves the highest mean object detection accuracy (90.3%), the lowest mean localization error for tire ground-contact points (4.4%), and the second-best mean intersection over union for parking space line segmentation (78.5%). Finally, an efficient decoder is designed to extract the skeleton features of the parking space lines, fit the visible area of the main parking space, match the target vehicle, and analyze the positional relationship between its tire ground-contact points and the parking space, so that three typical behaviors (illegal parking, improper parking, and standardized parking) can be distinguished. Experimental results show that the algorithm attains an overall accuracy of 98.1% under various complex interference conditions, outperforming existing mainstream methods, and can provide technical support for fully automated road-cruising management of parking violations.

Key words: deep learning, vehicle parking violation, object detection, key point localization, semantic segmentation
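
The last step described in the abstract, relating tire ground-contact points to the main parking space, can be illustrated with a point-in-polygon test. The sketch below is only an illustration under stated assumptions, not the authors' decoder: the parking space is assumed to be already fitted as a polygon, and the three-way rule in the hypothetical function `classify_parking` (all contact points inside, some inside, none inside) is a placeholder, since the abstract does not state the actual judgment criteria.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(p: Point, poly: List[Point]) -> bool:
    """Ray-casting test: toggle on each polygon edge crossed by a ray to +x."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # The edge straddles the horizontal line through p.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def classify_parking(contacts: List[Point], space: List[Point]) -> str:
    """Hypothetical rule (an assumption, not taken from the paper)."""
    if not contacts:
        raise ValueError("at least one tire ground-contact point is required")
    inside = sum(point_in_polygon(p, space) for p in contacts)
    if inside == len(contacts):
        return "standardized"
    if inside == 0:
        return "illegal"
    return "improper"


# Toy parking space (image coordinates are made up for the example).
space = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
print(classify_parking([(1.0, 0.5), (3.0, 0.5)], space))
print(classify_parking([(1.0, 0.5), (5.0, 0.5)], space))
print(classify_parking([(5.0, 3.0), (6.0, 3.0)], space))
```

In practice the polygon would come from the fitted visible area of the parking space and the points from the key point head; both are outside the scope of this sketch.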

CLC number: TP391.4

ZHAN Zehui, ZHONG Ming’en, YUAN Bingan, TAN Jiawei, FAN Kang. Detection of Roadside Vehicle Parking Violations Under Random Horizontal Camera Condition[J]. Journal of Shanghai Jiao Tong University, 2025, 59(10): 1568-1580.

References (26)

[1]	TIAN Aijun, CAI Xuyang, CHEN Wei, et al. Vehicle illegal parking detection and evidence collection system on UAV[J]. Measurement & Control Technology, 2021, 40(5): 67-74 (in Chinese).
[2]	TANG H R, PENG A M, ZHANG D M, et al. SSD real-time illegal parking detection based on contextual information transmission[J]. Computers, Materials & Continua, 2020, 62(1): 293-307.
[3]	PENG X G, SONG R, CAO Q, et al. Real-time illegal parking detection algorithm in urban environments[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(11): 20572-20587.
[4]	WU Zhihua. Research of a deep learning-based illegal parking vehicle detection and license plate recognition algorithm[D]. Xiamen: Xiamen University of Technology, 2023 (in Chinese).
[5]	ZHAO Yiru, LIU Zhengxi, XIONG Yunyu, et al. Detection of illegal sidewalk parking based on object detection and semantic segmentation[J]. Modern Computer, 2020(9): 82-88 (in Chinese).
[6]	YANG Q, YU L F. Recognition of taxi violations based on semantic segmentation of PSPNet and improved YOLOv3[J]. Scientific Programming, 2021, 2021: 4520190.
[7]	LIANG X, WU Y, HAN J, et al. Effective adaptation in multi-task co-training for unified autonomous driving[J]. Advances in Neural Information Processing Systems, 2022, 35: 19645-19658.
[8]	GUO M H, LU C Z, LIU Z N, et al. Visual attention network[J]. Computational Visual Media, 2023, 9(4): 733-752.
[9]	LI Qing, HUANGFU Yubin, LI Jiangyun, et al. UConvTrans: A dual-flow cardiac image segmentation network by global and local information integration[J]. Journal of Shanghai Jiao Tong University, 2023, 57(5): 570-581 (in Chinese). doi: 10.16183/j.cnki.jsjtu.2022.088
[10]	WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE, 2023: 7464-7475.
[11]	WAN Anping, YANG Jie, MIAO Xu, et al. Boiler load forecasting of CHP plant based on attention mechanism and deep neural network[J]. Journal of Shanghai Jiao Tong University, 2023, 57(3): 316-325 (in Chinese). doi: 10.16183/j.cnki.jsjtu.2021.346
[12]	WANG W H, DAI J F, CHEN Z, et al. InternImage: Exploring large-scale vision foundation models with deformable convolutions[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE, 2023: 14408-14419.
[13]	ZHANG X H, CHEN Y, ZHANG H F, et al. When visual disparity generation meets semantic segmentation: A mutual encouragement approach[J]. IEEE Transactions on Intelligent Transportation Systems, 2021, 22(3): 1853-1867.
[14]	FENG Z H, KITTLER J, AWAIS M, et al. Rectified wing loss for efficient and robust facial landmark localisation with convolutional neural networks[J]. International Journal of Computer Vision, 2020, 128(8): 2126-2145.
[15]	GE Z, LIU S T, WANG F, et al. YOLOX: Exceeding YOLO series in 2021[DB/OL]. (2021-07-18) [2023-11-01]. http://arxiv.org/abs/2107.08430.
[16]	CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[M]//Lecture notes in computer science. Cham: Springer International Publishing, 2020: 213-229.
[17]	SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019: 5686-5696.
[18]	XU Y F, ZHANG J, ZHANG Q M, et al. ViTPose: Simple vision transformer baselines for human pose estimation[DB/OL]. (2022-04-26) [2023-11-01]. http://arxiv.org/abs/2204.12484.
[19]	QIN X B, ZHANG Z C, HUANG C Y, et al. U²-Net: Going deeper with nested U-structure for salient object detection[J]. Pattern Recognition, 2020, 106: 107404.
[20]	YANG Z, PENG X B, YIN Z J, et al. Deeplab_v3_plus-net for image semantic segmentation with channel compression[C]// 2020 IEEE 20th International Conference on Communication Technology. Nanning, China: IEEE, 2020: 1320-1324.
[21]	LI H X, SU F L. A multi-target ISAR imaging method based on Zhang-Suen thinning and radon transform[C]// 2022 15th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics. Beijing, China: IEEE, 2022: 1-5.
[22]	GEIGER A, LENZ P, URTASUN R. Are we ready for autonomous driving? The KITTI vision benchmark suite[C]// 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA: IEEE, 2012: 3354-3361.
[23]	CORDTS M, OMRAN M, RAMOS S, et al. The cityscapes dataset for semantic urban scene understanding[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016: 3213-3223.
[24]	YU F, CHEN H F, WANG X, et al. BDD100K: A diverse driving dataset for heterogeneous multitask learning[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020: 2633-2642.
[25]	ZHUANG Jianjun, XU Ziheng, ZHANG Ruoyu. Vehicle violation detection based on improved YOLOv5 model and radiometric method[J]. Journal of Nanjing University of Information Science & Technology, 2024, 16(3): 341-351 (in Chinese).
[26]	SHAO Yiwen. Research on optimization of automatic detection system of illegal vehicles based on image processing technology[D]. Qingdao: Qingdao University of Technology, 2021 (in Chinese).

Group	α	β	γ	Accuracy/%
1	0.11	0.69	0.20	96.9
2	0.13	0.67	0.20	97.5
3	0.15	0.65	0.20	98.1
4	0.15	0.55	0.30	97.3
5	0.16	0.55	0.29	96.6
6	0.18	0.57	0.25	96.1

Algorithm	mAP/%	R/%	NME/%	mIoU/%	Inference speed/(frames·s^-1)
YOLOX^[15]	86.5	83.7	-	-	85.3
DETR^[16]	80.2	88.1	-	-	28.9
HRNet^[17]	-	-	8.1	-	88.7
ViTPose^[18]	-	-	5.5	-	82.5
U²-Net^[19]	-	-	-	77.9	40.2
DeepLabv3+^[20]	-	-	-	79.6	32.2
YOLOv8^[10]	87.2	86.2	7.2	75.8	78.2
MTPN-1	89.7	85.5	5.1	77.3	75.4
MTPN-2	88.5	84.9	5.6	77.8	76.2
MTPN	90.3	86.7	4.4	78.5	73.6
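
For readers unfamiliar with the mIoU and NME columns, the following is a minimal pure-Python sketch of how such metrics are commonly computed. It is not the authors' evaluation code. In particular, the normalizing length `norm` used for NME is an assumption (the page does not say what the error is normalized by), and the function names are illustrative.

```python
from math import hypot
from typing import List, Sequence, Tuple

LabelMap = Sequence[Sequence[int]]
Point = Tuple[float, float]


def class_iou(pred: LabelMap, truth: LabelMap, cls: int) -> float:
    """Intersection over union of one class between two equal-shape label maps."""
    inter = union = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            inter += int(p == cls and t == cls)
            union += int(p == cls or t == cls)
    if union == 0:
        raise ValueError(f"class {cls} is absent from both maps")
    return inter / union


def mean_iou(pred: LabelMap, truth: LabelMap, classes: Sequence[int]) -> float:
    """Unweighted mean of the per-class IoU values."""
    scores = [class_iou(pred, truth, c) for c in classes]
    return sum(scores) / len(scores)


def nme(pred_pts: List[Point], true_pts: List[Point], norm: float) -> float:
    """Mean Euclidean key point error divided by a normalizing length.

    `norm` is an assumed normalizer (for example a bounding-box size).
    """
    errs = [hypot(px - tx, py - ty)
            for (px, py), (tx, ty) in zip(pred_pts, true_pts)]
    return sum(errs) / (len(errs) * norm)


# Toy data: a 2x2 label map (0 = background, 1 = parking line) and one key point.
pred = [[0, 1], [1, 1]]
truth = [[0, 1], [0, 1]]
print(mean_iou(pred, truth, classes=[0, 1]))
print(nme([(3.0, 4.0)], [(0.0, 0.0)], norm=100.0))
```

Both functions return fractions; multiplying by 100 gives percentages comparable in form to the table columns.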

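Environment	Category	Samples	Accuracy/%
Daytime, sunny	Standardized parking	1457	98.8
	Improper parking	1315	97.4
	Illegal parking	1253	98.0
	Subtotal	4025	98.1
Daytime, rainy	Standardized parking	1309	98.4
	Improper parking	1283	97.0
	Illegal parking	1390	97.7
	Subtotal	3982	97.7
Night, clear	Standardized parking	1465	99.3
	Improper parking	1217	97.6
	Illegal parking	1219	98.8
	Subtotal	3901	98.6
Night, rainy	Standardized parking	1185	98.6
	Improper parking	1047	97.3
	Illegal parking	1346	98.1
	Subtotal	3578	98.0
Total		15486	98.1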
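
The subtotal and total rows of the table above are consistent with sample-weighted means of the per-category accuracies. The short check below assumes that this is how they were aggregated (the page does not say so explicitly); the numbers are copied from the table and the environment names are English renderings of the row labels.

```python
from typing import Dict, List, Tuple

# (sample count, accuracy in %) for standardized, improper, and illegal parking.
groups: Dict[str, List[Tuple[int, float]]] = {
    "daytime, sunny": [(1457, 98.8), (1315, 97.4), (1253, 98.0)],
    "daytime, rainy": [(1309, 98.4), (1283, 97.0), (1390, 97.7)],
    "night, clear": [(1465, 99.3), (1217, 97.6), (1219, 98.8)],
    "night, rainy": [(1185, 98.6), (1047, 97.3), (1346, 98.1)],
}


def weighted_accuracy(rows: List[Tuple[int, float]]) -> float:
    """Sample-weighted mean accuracy of the given rows."""
    total = sum(n for n, _ in rows)
    return sum(n * acc for n, acc in rows) / total


for name, rows in groups.items():
    count = sum(n for n, _ in rows)
    print(f"{name}: {count} samples, {weighted_accuracy(rows):.1f}%")

all_rows = [row for rows in groups.values() for row in rows]
print(f"total: {sum(n for n, _ in all_rows)} samples, "
      f"{weighted_accuracy(all_rows):.1f}%")
```

Under this assumption the recomputed values round to the printed subtotals (98.1, 97.7, 98.6, 98.0) and to the overall 98.1% over 15486 samples.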

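Interference type	Category	Samples	Accuracy/%
Occlusion	Standardized parking	2246	97.2
	Improper parking	2170	97.0
	Subtotal	4416	97.1
Damage	Standardized parking	1986	98.1
	Improper parking	2072	98.0
	Subtotal	4058	98.0
Dirt	Standardized parking	1962	97.8
	Improper parking	1898	97.6
	Subtotal	3860	97.7
Total		12334	97.6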

Algorithm	XMUT-VPI		KITTI		Cityscapes		BDD100K		Average frame rate/(frames·s^-1)	Average time/(ms·frame^-1)
	Accuracy/%	Frame rate/(frames·s^-1)	Accuracy/%	Frame rate/(frames·s^-1)	Accuracy/%	Frame rate/(frames·s^-1)	Accuracy/%	Frame rate/(frames·s^-1)		
YOLOv5+Transformer^[3]	92.3	36.2	90.2	34.1	91.6	36.3	90.4	36.5	35.7	28.0
PP-YOLOE+Transformer^[4]	95.7	37.1	93.6	35.2	93.5	37.0	93.1	37.2	36.6	27.3
YOLOv3+DeepLabv3+^[5]	88.4	24.9	88.3	22.7	86.4	24.5	87.7	25.0	24.4	41.0
PSPNet^[6]	90.3	37.6	88.9	35.3	90.7	37.5	87.8	37.7	37.0	27.0
YOLOv5+ray-casting method^[25]	86.7	49.5	85.5	47.5	84.9	49.6	84.5	49.5	49.0	20.4
HOG+LBP+PCA+SVM^[26]	87.2	54.7	85.4	52.7	84.6	54.9	85.1	54.8	54.2	18.5
MTPN+Decoder	98.1	42.3	97.5	40.2	96.7	42.2	97.2	42.5	41.8	23.9
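
The last two columns of the table above are related by a simple reciprocal: the average time per frame in milliseconds equals 1000 divided by the average frame rate. The sketch below checks this against the printed values; the dictionary keys are shortened labels chosen for the example.

```python
from typing import Dict

# Average frame rate (frames per second), copied from the table above.
avg_fps: Dict[str, float] = {
    "YOLOv5+Transformer": 35.7,
    "PP-YOLOE+Transformer": 36.6,
    "YOLOv3+DeepLabv3+": 24.4,
    "PSPNet": 37.0,
    "YOLOv5+ray-casting": 49.0,
    "HOG+LBP+PCA+SVM": 54.2,
    "MTPN+Decoder": 41.8,
}


def ms_per_frame(fps: float) -> float:
    """Convert a frame rate in frames/s to a per-frame latency in ms."""
    return 1000.0 / fps


for name, fps in avg_fps.items():
    print(f"{name}: {ms_per_frame(fps):.1f} ms/frame")
```

For example, 41.8 frames/s for MTPN+Decoder corresponds to about 23.9 ms per frame, matching the table.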