Kohei Aso, Dong-Hyun Hwang, and Hideki Koike
Tokyo Institute of Technology
We propose a system that estimates the 3D body pose of other parties using a single RGB chest-mounted ultra-wide fisheye camera. Although the fisheye camera can capture a wide field of view, it is difficult to apply image processing for perspective images because of its strong distortion. In our method, the input fisheye image is converted to an equirectangular image to detect another person and their 2D keypoints, and then convert them to a 3D pose. In order to adapt to the distortion of equirectangular images, we generate a synthetic dataset and fine-tune the model. We also estimate the location of the other person so that we can reconstruct the absolute camera-centered global pose. We evaluate the accuracy on real-world data and show that the fine-tuned model performs best.