J Shanghai Jiaotong Univ Sci, 2026, Vol. 31, Issue (1): 117-129. doi: 10.1007/s12204-025-2841-5

• Intelligent Robots

SDA-Loc: A Semantic-Driven Alignment Algorithm for Cross-Modal Localization in Point Cloud Maps

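Zeng Yuxuan (曾宇烜) [1,2,3], Zhao Wentao (赵文韬) [1,2,3], Chen Yongtao (陈永涛) [1,2,3], Xiao Peng (肖鹏) [4], Wang Jingchuan (王景川) [1,2,3], Guo Rui (郭锐) [4]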

  1. Department of Automation; Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai 200030, China
  2. Key Laboratory of System Control and Information Processing of Ministry of Education, Shanghai 200030, China
  3. Shanghai Engineering Research Center of Intelligent Control and Management, Shanghai 200030, China
  4. State Grid Intelligence Technology Co., Ltd., Jinan 250101, China
  • Received: 2024-11-26; Revised: 2025-01-23; Accepted: 2025-02-17; Online: 2026-02-28; Published: 2025-08-26

Abstract: Cross-modal localization, which uses only cameras and a prior light detection and ranging (LiDAR) point cloud map, achieves high localization accuracy at low cost. Integrating semantic information can significantly improve accuracy, but at the cost of a heavy computational load during optimization and extensive semantic annotation of the LiDAR point cloud map. In this paper, we propose SDA-Loc, a semantic cross-modal localization system that relies solely on visual semantic information, making it more streamlined than existing methods. We design a semantic-driven alignment algorithm that uses visual semantic labels to perform different types of iterative closest point (ICP) alignment, allowing the system to better exploit the structural information represented by object semantics and thereby localize accurately without the additional burden of point cloud annotation. Coupled with a dynamic error rejection mechanism, our approach balances accuracy and speed. Experiments on the KITTI dataset demonstrate the competitive localization performance of our approach. Moreover, experiments on an outdoor campus dataset confirm that the proposed system effectively mitigates drift in visual localization under challenging lighting conditions and remains robust when the LiDAR point cloud map is of poor quality. Runtime analysis also shows that SDA-Loc strikes a good balance between localization accuracy and computational efficiency.

Key words: cross-modal localization, map-based localization, semantic-driven alignment algorithm, simultaneous localization and mapping
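
The abstract does not say which ICP variants SDA-Loc uses, which semantic classes map to which variant, or how the rejection threshold is computed. The following is therefore only a minimal sketch of the general idea, under loudly stated assumptions: points labeled with a hypothetical planar class (`PLANAR_LABELS`) contribute point-to-plane residuals, all other points contribute point-to-point residuals, and correspondences farther than `reject_scale` times the median distance are rejected. For brevity it estimates a translation only (no rotation) in a single linear step. The names `align_step`, `solve3`, `PLANAR_LABELS`, and `reject_scale` are illustrative and do not come from the paper.

```python
import math
from statistics import median

# Hypothetical label set: classes assumed (not stated in the paper) to be locally planar.
PLANAR_LABELS = {"road", "building"}

def solve3(M, v):
    """Solve the 3x3 system M x = v by Gauss-Jordan elimination with partial pivoting."""
    a = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(3):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return [a[i][3] / a[i][i] for i in range(3)]

def align_step(src, labels, map_pts, map_normals, reject_scale=3.0):
    """Estimate a translation t such that src + t matches the map (one linear step)."""
    # Nearest-neighbour correspondences (brute force, fine for a toy example).
    pairs = []
    for p in src:
        j = min(range(len(map_pts)), key=lambda k: math.dist(p, map_pts[k]))
        pairs.append((j, math.dist(p, map_pts[j])))
    # Assumed rejection rule: the threshold follows the current median distance.
    thresh = reject_scale * max(median(d for _, d in pairs), 1e-9)
    keep = [d <= thresh for _, d in pairs]
    # Accumulate the normal equations H t = g over the kept correspondences.
    H = [[0.0] * 3 for _ in range(3)]
    g = [0.0] * 3
    for p, lab, (j, _), ok in zip(src, labels, pairs, keep):
        if not ok:
            continue
        d = [m - s for m, s in zip(map_pts[j], p)]
        if lab in PLANAR_LABELS:
            # Point-to-plane: only the offset along the map normal is penalized.
            n = map_normals[j]
            nd = sum(ni * di for ni, di in zip(n, d))
            for r in range(3):
                g[r] += n[r] * nd
                for c in range(3):
                    H[r][c] += n[r] * n[c]
        else:
            # Point-to-point: the full 3D offset is penalized.
            for r in range(3):
                g[r] += d[r]
                H[r][r] += 1.0
    return solve3(H, g), keep
```

In this sketch, planar points constrain the translation only along their normals, so at least some non-planar points (or planes with differing normals) are needed for the 3x3 system to be solvable; a full pipeline would also estimate rotation and iterate until convergence.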
