Computing & Computer Technologies

CenterLineFormer: Road Centerlines Graph Generation with Single Onboard Camera

  • School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

Received date: 2023-01-06

Accepted date: 2023-03-01

Online published: 2024-01-05

Abstract

As autonomous driving systems advance rapidly, demand is surging for high-definition (HD) maps that provide accurate and dependable prior information about the static environment around a vehicle. As one of the main high-level elements in HD maps, the road lane centerline is essential for downstream tasks such as autonomous navigation and planning. Because road centerlines have complex topology and overlap significantly, previous studies have rarely examined the centerline HD mapping problem. Recent learning-based pipelines rely on heuristic post-processing of predictions to generate structured centerline output, which lacks instance information. To address this, we propose a novel end-to-end pipeline for generating vectorized road centerline graphs, termed CenterLineFormer. CenterLineFormer takes a single onboard camera image as input and predicts a directed graph representing the lane-layer map in the bird’s-eye view (BEV). We propose an improved view transformation strategy that uses a cross-attention mechanism to generate a dense BEV feature map. With our pipeline, we can describe the connectivity between different centerlines and generate structured lane graphs for downstream modules such as planning and control. Qualitatively, our experiments show that our pipeline outperforms previous baselines on the nuScenes dataset. We also show that CenterLineFormer can generate accurate centerline graph topologies in night driving and complex traffic intersection scenes.

Cite this article

QIN Minghui, LIU Yuanzhi, LÜ Na, TAO Wei, ZHAO Hui. CenterLineFormer: Road Centerlines Graph Generation with Single Onboard Camera[J]. Journal of Shanghai Jiaotong University (Science), 2025, 30(5): 1009-1017. DOI: 10.1007/s12204-024-2696-1
