J Shanghai Jiaotong Univ Sci, 2023, Vol. 28, Issue (1): 20-27. doi: 10.1007/s12204-023-2565-3

• Intelligent Transportation Systems •

Action-aware Encoder-Decoder Network for Pedestrian Trajectory Prediction


FU Jiawei∗ (傅家威), ZHAO Xu (赵 旭)   

  (Department of Automation, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China)
  • Received: 2022-02-28 Online: 2023-01-28 Published: 2023-02-10

Abstract: Accurate pedestrian trajectory prediction is critical in self-driving systems because it underpins the response and decision-making of ego vehicles. In this study, we focus on predicting the future trajectories of pedestrians from a first-person perspective. Most existing first-person trajectory prediction methods are carried over from methods designed for the bird’s-eye view and neglect the differences between the two views. To address this, we clarify the differences between the two views and highlight the importance of action-aware trajectory prediction in the first-person view. We propose a new action-aware network based on an encoder-decoder framework, with an action prediction branch and a goal estimation branch at the end of the encoder. In the decoder, bidirectional long short-term memory (Bi-LSTM) blocks generate the final prediction of pedestrians’ future trajectories. Evaluated on a public dataset, our method achieves performance competitive with other approaches. An ablation study demonstrates the effectiveness of the action prediction branch.

Key words: pedestrian trajectory prediction, first-person view, action prediction, encoder-decoder, bidirectional long short-term memory (Bi-LSTM)
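
The data flow described in the abstract can be pictured with a minimal PyTorch sketch. This is not the authors' implementation. The class name `ActionAwareEncoderDecoder`, the GRU encoder, the layer sizes, the 4-value per-frame input (for example a bounding box), the number of action classes, the sequence lengths, and the way the branch outputs are fed to the decoder are all assumptions made for illustration. Only the overall layout comes from the abstract: an encoder, an action prediction branch and a goal estimation branch at the end of the encoder, and a Bi-LSTM decoder.

```python
import torch
import torch.nn as nn


class ActionAwareEncoderDecoder(nn.Module):
    """Illustrative sketch only; layer types and sizes are assumptions."""

    def __init__(self, input_dim=4, hidden_dim=64, num_actions=2, pred_len=45):
        super().__init__()
        self.pred_len = pred_len
        # Encoder over the observed sequence (a GRU is an assumption here).
        self.encoder = nn.GRU(input_dim, hidden_dim, batch_first=True)
        # Two branches attached to the end of the encoder.
        self.action_head = nn.Linear(hidden_dim, num_actions)  # action logits
        self.goal_head = nn.Linear(hidden_dim, input_dim)      # estimated goal
        # Bi-LSTM decoder, conditioned on encoder state, action, and goal.
        dec_in_dim = hidden_dim + num_actions + input_dim
        self.decoder = nn.LSTM(dec_in_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        # Forward and backward hidden states are concatenated: 2 * hidden_dim.
        self.out = nn.Linear(2 * hidden_dim, input_dim)

    def forward(self, observed):
        # observed: (batch, obs_len, input_dim)
        _, h_n = self.encoder(observed)        # h_n: (1, batch, hidden_dim)
        summary = h_n[-1]                      # (batch, hidden_dim)
        action_logits = self.action_head(summary)
        goal = self.goal_head(summary)
        cond = torch.cat([summary, action_logits, goal], dim=-1)
        # Feed the same conditioning vector at every future time step.
        dec_in = cond.unsqueeze(1).repeat(1, self.pred_len, 1)
        dec_out, _ = self.decoder(dec_in)      # (batch, pred_len, 2*hidden_dim)
        trajectory = self.out(dec_out)         # (batch, pred_len, input_dim)
        return trajectory, action_logits, goal


# Usage with random data: 8 pedestrians, 15 observed frames (placeholder sizes).
model = ActionAwareEncoderDecoder()
observed = torch.randn(8, 15, 4)
trajectory, action_logits, goal = model(observed)
```

In a setup like this, the action logits and goal estimate would typically be trained with their own auxiliary losses alongside the trajectory loss. The abstract does not specify the losses, so that part is left out of the sketch.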
