GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking with 2D-3D Multi-Feature Learning

Xinshuo Weng, Yongxin Wang, Yunze Man, Kris Kitani

Robotics Institute, Carnegie Mellon University

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020


One-Sentence Summary

We propose the first 3D multi-object tracking method that leverages a Graph Neural Network to model object interactions.


Demo Video (1 minute short presentation at CVPR 2020)


Demo Video (5 minute spotlight presentation at ECCVW 2020)


Abstract

3D multi-object tracking (MOT) is crucial to autonomous systems. Recent work follows a standard tracking-by-detection pipeline, where features are first extracted independently for each object in order to compute an affinity matrix. The affinity matrix is then passed to the Hungarian algorithm for data association. A key step in this standard pipeline is learning discriminative features for different objects in order to reduce confusion during data association. In this work, we propose two techniques to improve discriminative feature learning for MOT: (1) instead of obtaining features for each object independently, we propose a novel feature interaction mechanism by introducing a Graph Neural Network. As a result, the feature of one object is informed of the features of other objects, so that each object's feature can lean towards objects with similar features (i.e., objects likely to share the same ID) and deviate from objects with dissimilar features (i.e., objects likely to have different IDs), leading to a more discriminative feature for each object; (2) instead of obtaining features from either 2D or 3D space as in prior work, we propose a novel joint feature extractor that learns appearance and motion features from 2D and 3D space simultaneously. As features from different modalities often carry complementary information, the joint feature can be more discriminative than features from each individual modality. To ensure that the joint feature extractor does not rely heavily on one modality, we also propose an ensemble training paradigm. Through extensive evaluation, our proposed method achieves state-of-the-art performance on the KITTI and nuScenes 3D MOT benchmarks.
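To make the feature interaction idea concrete, here is a minimal NumPy sketch, not the paper's exact architecture: each object's feature is refined by an affinity-weighted aggregation of the other objects' features, so features of likely-same-ID objects pull together while dissimilar ones contribute little. The function name and layer structure are illustrative assumptions.

```python
import numpy as np

def gnn_feature_interaction(features, num_layers=2):
    """Illustrative message passing over a fully connected object graph.

    features: (N, D) array, one row per detected object.
    Each layer: compute pairwise cosine affinities, softmax over the
    other objects (self-edges excluded), then add the affinity-weighted
    sum of neighbor features and renormalize. This is a hypothetical
    simplification of the paper's learned GNN layers.
    """
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    for _ in range(num_layers):
        affinity = x @ x.T                      # pairwise cosine affinity
        np.fill_diagonal(affinity, -np.inf)     # exclude self-edges
        weights = np.exp(affinity)
        weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
        x = x + weights @ x                     # aggregate neighbor messages
        x /= np.linalg.norm(x, axis=1, keepdims=True)  # keep unit norm
    return x

# Example: three objects, two with similar features
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
refined = gnn_feature_interaction(feats)
```

In the actual method the aggregation weights and feature transforms are learned end-to-end together with the affinity loss, rather than fixed as above.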


Approach
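As noted in the abstract, the final step of the pipeline passes the affinity matrix to the Hungarian algorithm for data association. The sketch below is a stdlib-only stand-in that brute-forces the same optimal assignment over permutations; it matches the Hungarian result on small square affinity matrices and is for illustration only (function and variable names are our own).

```python
from itertools import permutations

def associate(affinity):
    """Return the (track, detection) pairs maximizing total affinity.

    Brute-force search over all permutations; equivalent to the
    Hungarian algorithm's optimal assignment for small square
    matrices, used here purely as an illustrative stand-in.
    """
    n = len(affinity)
    best_score, best_match = float("-inf"), None
    for perm in permutations(range(n)):
        score = sum(affinity[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_match = score, list(enumerate(perm))
    return best_match

# Example: 3 existing tracks vs. 3 new detections
affinity = [[0.9, 0.1, 0.0],
            [0.2, 0.8, 0.1],
            [0.0, 0.3, 0.7]]
print(associate(affinity))  # -> [(0, 0), (1, 1), (2, 2)]
```

In practice one would use an O(n^3) Hungarian implementation (e.g., `scipy.optimize.linear_sum_assignment`) rather than this exponential search.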




BibTex

@inproceedings{Weng2020_gnn3dmot,
  author = {Weng, Xinshuo and Wang, Yongxin and Man, Yunze and Kitani, Kris},
  booktitle = {CVPR},
  title = {{GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking with 2D-3D Multi-Feature Learning}},
  year = {2020}
}
@inproceedings{Weng2020_GNN3DMOT_eccvw,
  author = {Weng, Xinshuo and Wang, Yongxin and Man, Yunze and Kitani, Kris},
  booktitle = {ECCVW},
  title = {{Graph Neural Network for 3D Multi-Object Tracking}},
  year = {2020}
}
