A Benchmark Dataset for 6DoF Object Pose Tracking
Abstract
Accurately tracking the six degree-of-freedom (6DoF) pose of an object in real scenes is an important task in computer vision and augmented reality with numerous applications. Although a variety of algorithms for this task have been proposed, it remains difficult to evaluate existing methods in the literature, as different sequences are often used for evaluation and no large benchmark dataset close to real-world scenarios is available. In this paper, we present a large object pose tracking benchmark dataset consisting of RGB-D video sequences of 2D and 3D targets with ground-truth information. The videos are recorded under various lighting conditions, motion patterns, and speeds with the help of a programmable robotic arm. We present extensive quantitative evaluations of state-of-the-art methods on this benchmark dataset and discuss potential research directions in this field.
Paper
Paper (7.58 MB)
Supplementary Material (67.1 MB)
Poster (2.50 MB)
Intro
Citation
Po-Chen Wu, Yueh-Ying Lee, Hung-Yu Tseng, Hsuan-I Ho, Ming-Hsuan Yang, and Shao-Yi Chien, "A Benchmark Dataset for 6DoF Object Pose Tracking." In Proceedings of the IEEE International Symposium on Mixed and Augmented Reality (ISMAR Adjunct), 2017.
Bibtex
@inproceedings{OPT2017,
author = {Wu, Po-Chen and Lee, Yueh-Ying and Tseng, Hung-Yu and Ho, Hsuan-I and Yang, Ming-Hsuan and Chien, Shao-Yi},
title = {A Benchmark Dataset for 6DoF Object Pose Tracking},
booktitle = {IEEE International Symposium on Mixed and Augmented Reality (ISMAR Adjunct)},
year = {2017}
}
Notes
- (2017/11/07) The runtimes of the IPPE method (0.044s → 0.001s) and the OPnP method (0.156s → 0.008s) have been corrected.
Download
Notes
- This dataset can also be downloaded from FTP:
  Host: 140.112.48.121
  Port: 25253
  Username: opt
  Password: dataset
- The dataset contains color, depth, and mask image sequences (lossless PNG) for both the 2D and 3D targets.
- All images are rectified according to the corresponding radial and tangential distortion coefficients.
- The transformation matrix between the depth camera coordinate system and the color camera coordinate system is shown below (see also the sketch after this list).
- We provide a pose viewer (written in MATLAB) for inspecting the poses. Its GUI is shown below (1080p case).
- The folder structure is shown below (1080p case).
- The coordinate system is shown below (see also the sketch after this list).
- The evaluated motion patterns are shown below.
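To make the conventions above concrete, here is a minimal MATLAB sketch (MATLAB being the language of the provided pose viewer) of how the calibration and ground-truth data are typically used: projecting model points into the rectified color image with a ground-truth pose, and mapping a point from the depth camera coordinate system into the color camera coordinate system with the 4x4 transformation matrix. All numeric values below (intrinsics K, pose R and t, and the depth-to-color extrinsic T_d2c) are placeholders for illustration only, not the calibration shipped with the dataset.

% Placeholder calibration -- NOT the values shipped with the dataset.
K = [1000 0 960; 0 1000 540; 0 0 1];       % color camera intrinsics (1080p example)
R = eye(3);  t = [0; 0; 0.5];              % ground-truth pose: model -> color camera
T_d2c = [eye(3), [0.025; 0; 0]; 0 0 0 1];  % depth -> color 4x4 transformation (placeholder)

% Project 3D model points (defined in the model coordinate system) into the
% rectified color image: x ~ K * (R * X + t).
X_model = [0 0 0; 0.05 0 0; 0 0.05 0]';    % 3xN model points in meters
X_cam   = R * X_model + t;                 % model -> color camera coordinates
x_hom   = K * X_cam;
uv      = x_hom(1:2, :) ./ x_hom(3, :);    % pixel coordinates (u, v)

% Map a point from the depth camera coordinate system into the color camera
% coordinate system with the 4x4 transformation matrix.
P_depth = [0.10; -0.05; 0.80; 1];          % homogeneous point in the depth camera frame
P_color = T_d2c * P_depth;                 % the same point in the color camera frame

Since the released images are already rectified, no distortion coefficients are applied in this projection step.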
Results
References
- Lowe, David G., "Distinctive Image Features from Scale-Invariant Keypoints." IJCV, 2004.
- Yu, Guoshen, and Jean-Michel Morel, "ASIFT: An Algorithm for Fully Affine Invariant Comparison." IPOL, 2011.
- Zheng, Yinqiang, et al., "Revisiting the PnP Problem: A Fast, General and Optimal Solution." ICCV, 2013.
- Collins, Toby, and Adrien Bartoli, "Infinitesimal Plane-Based Pose Estimation." IJCV, 2014.
- Tseng, Hung-Yu, et al., "Direct 3D Pose Estimation of a Planar Target." WACV, 2016.
- Brachmann, Eric, et al., "Uncertainty-Driven 6D Pose Estimation of Objects and Scenes from a Single RGB Image." CVPR, 2016.
- Prisacariu, Victor A., and Ian D. Reid, "PWP3D: Real-Time Segmentation and Tracking of 3D Objects." IJCV, 2012.
- Mur-Artal, Raúl, and Juan D. Tardós, "ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras." IEEE T-RO, 2017.
- Whelan, Thomas, et al., "ElasticFusion: Dense SLAM Without a Pose Graph." RSS, 2015.
Acknowledgement
The authors wish to thank Professor Shih-Chung Kang and Ci-Jyun Liang from RLab, NTUCE, for providing their programmable robotic arm and for the fruitful discussions. We would also like to thank Po-Hao Hsu for sharing the photos used in this work.