Pedestrian Intention Prediction: A Multi-task Perspective

Abstract:

To be deployed globally, autonomous cars must guarantee the safety of pedestrians. This is why forecasting pedestrians' intentions sufficiently in advance is one of the most critical and challenging tasks for autonomous vehicles. This work addresses the problem by jointly predicting the intention and visual states of pedestrians. In terms of visual states, whereas previous work focused on x-y coordinates, we also predict the size and indeed the whole bounding box of the pedestrian. The method is a recurrent neural network trained in a multi-task learning approach. It has one head that predicts the intention of the pedestrian for each of their future positions and another that predicts the visual states of the pedestrian. Experiments on the JAAD dataset show that our method outperforms previous works for intention prediction. Moreover, despite its simple architecture (more than 2 times faster), its bounding box prediction performance is comparable to that of much more complex architectures.

Introduction:

This is the official code for the paper "Pedestrian Intention Prediction: A Multi-task Perspective", accepted and published in hEART 2021 (the 9th Symposium of the European Association for Research in Transportation).

Repository Structure
Proposed Method
Results
Installation
Dataset
Training/Testing
Tested Environments

Repository structure:

├── bounding-box-prediction         : Project repository
        ├── prepare_data.py         : Script for processing raw JAAD data.
        ├── train.py                : Script for training PV-LSTM.  
        ├── test.py                 : Script for testing PV-LSTM.  
        ├── DataLoader.py           : Script for data pre-processing and loader. 
        ├── networks.py             : Script containing the implementation of the network.
        ├── utils.py                : Script containing necessary math and transformation functions.

Proposed method

Results

Installation:

Start by cloning this repository:

git clone https://github.com/vita-epfl/bounding-box-prediction.git
cd bounding-box-prediction

Create a new conda environment (Python 3.7):

conda create -n pv-lstm python=3.7
conda activate pv-lstm

And install the dependencies:

pip install -r requirements.txt

Dataset:

Clone the dataset's repository.

git clone https://github.com/ykotseruba/JAAD

Run the prepare_data.py script, providing the path to the JAAD repository and the train/val/test ratios (each ratio must be in [0,1] and the three must sum to 1):

python3 prepare_data.py |path/to/JAAD/repo| |train_ratio| |val_ratio| |test_ratio|
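The constraint on the ratios can be checked before launching the script. The following is a minimal sketch, not part of the repository: the function name validate_split_ratios and the tolerance value are assumptions introduced here for illustration, and prepare_data.py may perform its own checks.

```python
import math


def validate_split_ratios(train_ratio, val_ratio, test_ratio, tol=1e-6):
    """Hypothetical helper: return the ratios as a tuple, or raise ValueError.

    Checks the two conditions stated above: each ratio lies in [0, 1]
    and the three ratios sum to 1 (within a small floating-point tolerance).
    """
    ratios = (train_ratio, val_ratio, test_ratio)
    for r in ratios:
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"ratio {r} is outside [0, 1]")
    total = sum(ratios)
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"ratios sum to {total}, expected 1")
    return ratios
```

For example, validate_split_ratios(0.7, 0.1, 0.2) passes, while validate_split_ratios(0.5, 0.5, 0.5) raises a ValueError because the sum is 1.5. A tolerance is used because sums of decimal fractions are rarely exactly 1 in floating point.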

Download the JAAD clips (UNRESIZED) and unzip them in the videos folder.
Run the script split_clips_to_frames.sh to convert the JAAD videos into frames. Each frame will be placed in a folder under the scene folder. Note that this requires 169 GB of disk space.
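Given the disk space noted above, it may be worth checking the free space on the target drive before extracting the frames. The sketch below is an illustration only and not part of the repository: the helper name has_enough_space is an assumption, and the figure is interpreted as 169 decimal gigabytes, which may differ slightly from how it was originally measured.

```python
import shutil

# Space quoted above for the extracted frames, read here as decimal gigabytes (assumption).
REQUIRED_GB = 169


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """Hypothetical helper: True if the drive holding `path` has enough free space."""
    free_gb = shutil.disk_usage(path).free / 1e9  # bytes -> decimal GB
    return free_gb >= required_gb


if __name__ == "__main__":
    if has_enough_space("."):
        print("Enough free space to extract the frames.")
    else:
        print(f"Less than {REQUIRED_GB} GB free; free up space before extracting.")
```

Pass the directory where the frames will be written as the path argument so that the check applies to the correct drive.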

Training/Testing:

Open train.py and test.py and change the parameters in the args class depending on the paths of your files. Start training the network by running the command:

python3 train.py

Test the trained network by running the command:

python3 test.py

Tested Environments:

Ubuntu 18.04, CUDA 10.1
Windows 10, CUDA 10.1

Citation

@inproceedings{bouhsain2020pedestrian,
  title={Pedestrian Intention Prediction: A Multi-task Perspective},
  author={Bouhsain, Smail and Saadatnejad, Saeed and Alahi, Alexandre},
  booktitle={European Association for Research in Transportation (hEART)},
  year={2020},
}
