Journal of Shanghai Jiao Tong University ›› 2024, Vol. 58 ›› Issue (11): 1826-1834. doi: 10.16183/j.cnki.jsjtu.2024.239

• Guidance, Navigation and Control •

Vehicle-Road Collaborative Perception Method Based on Dual-Stream Feature Extraction

NIU Guochen, SUN Xiangyu, YUAN Zhengyan

  1. Robotics Institute, Civil Aviation University of China, Tianjin 300300, China
  • Received: 2024-06-21 Revised: 2024-07-16 Accepted: 2024-07-18 Online: 2024-11-28 Published: 2024-12-02
  • About the author: NIU Guochen (b. 1981), Associate Professor; research interest: environment perception for intelligent robots. E-mail: niu_guochen@139.com.
  • Funding: National Natural Science Foundation of China (U2333205); Fundamental Research Funds for the Central Universities (3122023PY04)

Abstract:

To address inadequate perception of autonomous driving in occluded and beyond-line-of-sight scenarios, a feature-level vehicle-road collaborative perception method based on a dual-stream feature extraction network is proposed to enhance the 3D object detection of traffic participants. Feature extraction networks are designed separately for the roadside and the vehicle side according to the characteristics of each scene. Because the roadside has abundant sensing data and computational resources, a Transformer structure is used to extract richer, higher-level feature representations. Because the vehicle side has limited computational capability and high real-time demands, partial convolution (PConv) is employed to improve computational efficiency, and the Mamba-VSS module is introduced for efficient perception of complex environments. Collaborative perception between the vehicle side and the roadside is accomplished through the selective sharing and fusion of key perceptual information guided by confidence maps. Trained and tested on the DAIR-V2X dataset, the vehicle-side feature extraction network has a model size of 8.1 MB and achieves average precision of 67.67% and 53.74% at IoU thresholds of 0.5 and 0.7, respectively. The experiments verify the advantages of the method in detection accuracy and model size, and provide a low-configuration detection scheme for vehicle-road collaboration.
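The abstract names three concrete mechanisms: a Transformer branch on the roadside, partial convolution (PConv) with a Mamba-VSS module on the vehicle side, and confidence-map-guided sharing and fusion. The sketch below illustrates, in PyTorch, the two mechanisms most specific to the method, PConv and confidence-guided fusion of bird's-eye-view (BEV) features. It is a minimal sketch under assumed tensor shapes; all class names, the sharing threshold, and the single shared confidence head are illustrative assumptions rather than the authors' released implementation.

```python
# Minimal PyTorch sketch; names, shapes, and thresholds are assumptions,
# not the authors' released code.
import torch
import torch.nn as nn

class PConv(nn.Module):
    """Partial convolution: convolve only the first dim // n_div channels
    and pass the remaining channels through untouched, reducing FLOPs on
    the resource-limited vehicle side."""
    def __init__(self, dim: int, n_div: int = 4, kernel_size: int = 3):
        super().__init__()
        self.dim_conv = dim // n_div          # channels that are convolved
        self.dim_keep = dim - self.dim_conv   # channels passed through
        self.conv = nn.Conv2d(self.dim_conv, self.dim_conv, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = torch.split(x, [self.dim_conv, self.dim_keep], dim=1)
        return torch.cat((self.conv(x1), x2), dim=1)

class ConfidenceGuidedFusion(nn.Module):
    """Fuse roadside and vehicle-side BEV features: a 1x1 conv predicts a
    per-location confidence map for each stream, only high-confidence
    roadside locations are shared (a stand-in for bandwidth-aware
    selection), and the streams are blended by normalized confidence."""
    def __init__(self, channels: int, share_threshold: float = 0.5):
        super().__init__()
        self.conf_head = nn.Conv2d(channels, 1, kernel_size=1)
        self.share_threshold = share_threshold

    def forward(self, feat_veh: torch.Tensor, feat_road: torch.Tensor):
        conf_v = torch.sigmoid(self.conf_head(feat_veh))    # (B, 1, H, W)
        conf_r = torch.sigmoid(self.conf_head(feat_road))
        mask = (conf_r > self.share_threshold).float()      # share key regions only
        w = conf_r * mask / (conf_v + conf_r * mask + 1e-6) # roadside weight
        return w * (feat_road * mask) + (1 - w) * feat_veh

if __name__ == "__main__":
    bev_veh = torch.randn(1, 64, 128, 128)   # vehicle-side BEV features
    bev_road = torch.randn(1, 64, 128, 128)  # roadside BEV features, spatially aligned
    fused = ConfidenceGuidedFusion(64)(PConv(64)(bev_veh), bev_road)
    print(fused.shape)                       # torch.Size([1, 64, 128, 128])
```

The split-then-concatenate structure is what makes PConv cheap: only dim / n_div of the channels pass through the k x k convolution, so its FLOPs are roughly 1 / n_div of a full convolution at the same width, which is consistent with the small (8.1 MB) vehicle-side model reported above.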

Key words: autonomous driving, collaborative perception, feature extraction, 3D object detection, information sharing and fusion

CLC number: