Automation & Computer Technologies

Recognition of Pedestrians’ Street-Crossing Intentions Based on Skeleton Features

  • 1. School of Mechanical Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; 2. Shandong Technician College of Transportation, Linyi 276002, Shandong, China; 3. School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200030, China

Received date: 2023-08-17

Accepted date: 2023-09-07

Online published: 2024-01-16

Abstract

An integrated method is proposed to address the frequent conflicts between autonomous vehicles and pedestrians in street-crossing scenes. The method covers pedestrian detection, tracking, and intention recognition. First, an enhanced YOLOv8 that incorporates the C2f CA module is introduced to achieve accurate pedestrian detection, tracking, and pose estimation. Second, a set of intention recognition features is proposed to characterize the position and pose of pedestrians in the spatial and time domains. Finally, an intention classification model is built on a Stacking ensemble that takes the feature data as input, with SVM, KNN, and random forest as the base learners and XGBoost as the meta learner. The experimental results show that the enhanced YOLOv8 improves detection accuracy by 5.4% compared with the original model, and that intention recognition based on the Stacking model achieves 94.0% accuracy on the JAAD dataset, an improvement of more than 3.4% over existing intention recognition models. Furthermore, when different body parts of a pedestrian are occluded, the accuracy of the Stacking model still reaches 65.8% to 73.3%, which verifies the robustness of the proposed model. The proposed model provides reliable inputs for the decision planning of autonomous vehicles, which helps improve the safety of self-driving.
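The abstract does not list the individual skeleton features, so the following is only a minimal sketch of what spatial and time-domain features computed from pose keypoints might look like. It assumes the COCO 17-keypoint layout and uses three hypothetical features (left-knee angle, mean hip displacement per frame, and net knee-angle change); the function names and the feature choice are illustrative assumptions, not the authors' feature set.

```python
import math

# COCO-17 keypoint indices (assumed layout; the paper's exact joints are not given here).
L_HIP, R_HIP, L_KNEE, L_ANKLE = 11, 12, 13, 15


def joint_angle(a, b, c):
    """Return the angle at point b (degrees) between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0.0 or n2 == 0.0:
        return 0.0  # degenerate skeleton (coincident joints)
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def hip_center(kpts):
    """Midpoint of the two hips, used as a simple body-position proxy."""
    (lx, ly), (rx, ry) = kpts[L_HIP], kpts[R_HIP]
    return ((lx + rx) / 2.0, (ly + ry) / 2.0)


def sequence_features(frames):
    """Build a small feature vector from a list of per-frame keypoint lists.

    Spatial part: left-knee angle in the last frame.
    Temporal part: mean per-frame hip displacement and net knee-angle change.
    """
    angles = [joint_angle(k[L_HIP], k[L_KNEE], k[L_ANKLE]) for k in frames]
    centers = [hip_center(k) for k in frames]
    steps = [math.dist(p, q) for p, q in zip(centers, centers[1:])]
    mean_speed = sum(steps) / len(steps) if steps else 0.0
    return [angles[-1], mean_speed, angles[-1] - angles[0]]
```

Each element of `frames` is expected to be a list of 17 `(x, y)` pixel coordinates for one tracked pedestrian. A vector like the one returned by `sequence_features` is the kind of input that the base learners of a Stacking classifier would consume.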

Cite this article

Lu Jushou, Chen Hao, Bai Yuchuan, Hu Chuan, Zhang Xi. Recognition of Pedestrians’ Street-Crossing Intentions Based on Skeleton Features[J]. Journal of Shanghai Jiaotong University (Science), 2026, 31(2): 305-318. DOI: 10.1007/s12204-024-2700-9

