TransPose: Real-time 3D Human Translation and Pose Estimation with Six
Inertial Sensors
- URL: http://arxiv.org/abs/2105.04605v1
- Date: Mon, 10 May 2021 18:41:42 GMT
- Title: TransPose: Real-time 3D Human Translation and Pose Estimation with Six
Inertial Sensors
- Authors: Xinyu Yi, Yuxiao Zhou, Feng Xu
- Abstract summary: We present TransPose, a DNN-based approach to perform full motion capture from only 6 Inertial Measurement Units (IMUs) at over 90 fps.
For body pose estimation, we propose a multi-stage network that estimates leaf-to-full joint positions as intermediate results.
For global translation estimation, we propose a supporting-foot-based method and an RNN-based method to robustly solve for the global translations.
- Score: 7.565581566766422
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motion capture is facing some new possibilities brought by the inertial
sensing technologies which do not suffer from occlusion or wide-range
recordings as vision-based solutions do. However, as the recorded signals are
sparse and quite noisy, online performance and global translation estimation
turn out to be two key difficulties. In this paper, we present TransPose, a
DNN-based approach to perform full motion capture (with both global
translations and body poses) from only 6 Inertial Measurement Units (IMUs) at
over 90 fps. For body pose estimation, we propose a multi-stage network that
estimates leaf-to-full joint positions as intermediate results. This design
makes the pose estimation much easier, and thus achieves both better accuracy
and lower computation cost. For global translation estimation, we propose a
supporting-foot-based method and an RNN-based method to robustly solve for the
global translations with a confidence-based fusion technique. Quantitative and
qualitative comparisons show that our method outperforms the state-of-the-art
learning- and optimization-based methods by a large margin in both accuracy
and efficiency. As a purely inertial sensor-based approach, our method is not
limited by environmental settings (e.g., fixed cameras), making the capture
free from common difficulties such as wide-range motion space and strong
occlusion.
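The confidence-based fusion mentioned in the abstract can be illustrated with a minimal sketch. It assumes each frame provides two root-velocity estimates (one from the supporting-foot method, one from the RNN) and a scalar confidence in the foot-based estimate. The function names, the linear ramp, and the thresholds 0.5 and 0.9 are hypothetical choices for illustration and are not taken from the text above.

```python
from typing import Iterable, Tuple

Vec3 = Tuple[float, float, float]


def fusion_weight(confidence: float, lower: float = 0.5, upper: float = 0.9) -> float:
    """Map a confidence in [0, 1] to a blend weight in [0, 1].

    Assumption: a linear ramp between two illustrative thresholds.
    Below `lower` the foot-based estimate is ignored; above `upper`
    it is used exclusively.
    """
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    t = (confidence - lower) / (upper - lower)
    return min(1.0, max(0.0, t))


def fuse_velocity(v_foot: Vec3, v_rnn: Vec3, confidence: float) -> Vec3:
    """Blend the two per-frame velocity estimates by the confidence weight."""
    w = fusion_weight(confidence)
    return tuple(w * f + (1.0 - w) * r for f, r in zip(v_foot, v_rnn))


def integrate(velocities: Iterable[Vec3], dt: float) -> Vec3:
    """Accumulate fused velocities into a global translation (forward Euler)."""
    x = y = z = 0.0
    for vx, vy, vz in velocities:
        x += vx * dt
        y += vy * dt
        z += vz * dt
    return (x, y, z)


if __name__ == "__main__":
    # Two frames at 90 fps with differing confidence in the foot-based estimate.
    frames = [
        ((1.0, 0.0, 0.0), (0.8, 0.0, 0.1), 0.95),  # confident foot contact
        ((1.0, 0.0, 0.0), (0.8, 0.0, 0.1), 0.30),  # low confidence, RNN only
    ]
    fused = [fuse_velocity(vf, vr, c) for vf, vr, c in frames]
    print(integrate(fused, dt=1.0 / 90.0))
```

A ramp between two thresholds avoids abrupt switching between the two estimators when the confidence hovers near a single cutoff; other blending schemes would serve the same purpose.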
Related papers
- ALOcc: Adaptive Lifting-based 3D Semantic Occupancy and Cost Volume-based Flow Prediction [89.89610257714006]
Existing methods prioritize higher accuracy to cater to the demands of these tasks.
We introduce a series of targeted improvements for 3D semantic occupancy prediction and flow estimation.
Our purely temporal architecture framework, named ALOcc, achieves an optimal tradeoff between speed and accuracy.
arXiv Detail & Related papers (2024-11-12T11:32:56Z)
- SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning [17.99904937160487]
We introduce SCIPaD, a novel approach that incorporates spatial clues for unsupervised depth-pose joint learning.
SCIPaD achieves a reduction of 22.2% in average translation error and 34.8% in average angular error for camera pose estimation task on the KITTI Odometry dataset.
arXiv Detail & Related papers (2024-07-07T06:52:51Z)
- VICAN: Very Efficient Calibration Algorithm for Large Camera Networks [49.17165360280794]
We introduce a novel methodology that extends Pose Graph Optimization techniques.
We consider the bipartite graph encompassing cameras, object poses evolving dynamically, and camera-object relative transformations at each time step.
Our framework retains compatibility with traditional PGO solvers, but its efficacy benefits from a custom-tailored optimization scheme.
arXiv Detail & Related papers (2024-03-25T17:47:03Z)
- FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation [30.710296843150832]
Estimating relative camera poses between images has been a central problem in computer vision.
We show how to combine the best of both methods; our approach yields results that are both precise and robust.
A comprehensive analysis supports our design choices and demonstrates that our method adapts flexibly to various feature extractors and correspondence estimators.
arXiv Detail & Related papers (2024-03-05T18:59:51Z)
- Match and Locate: low-frequency monocular odometry based on deep feature matching [0.65268245109828]
We introduce a novel approach for the robotic odometry which only requires a single camera.
The approach is based on matching image features between the consecutive frames of the video stream using deep feature matching models.
We evaluate the performance of the approach in the AISG-SLA Visual Localisation Challenge and find that while being computationally efficient and easy to implement our method shows competitive results.
arXiv Detail & Related papers (2023-11-16T17:32:58Z)
- View Consistent Purification for Accurate Cross-View Localization [59.48131378244399]
This paper proposes a fine-grained self-localization method for outdoor robotics.
The proposed method addresses limitations in existing cross-view localization methods.
It is the first sparse visual-only method that enhances perception in dynamic environments.
arXiv Detail & Related papers (2023-08-16T02:51:52Z)
- Occlusion-Robust Object Pose Estimation with Holistic Representation [42.27081423489484]
State-of-the-art (SOTA) object pose estimators take a two-stage approach.
We develop a novel occlude-and-blackout batch augmentation technique.
We also develop a multi-precision supervision architecture to encourage holistic pose representation learning.
arXiv Detail & Related papers (2021-10-22T08:00:26Z)
- Improving Robustness and Accuracy via Relative Information Encoding in 3D Human Pose Estimation [59.94032196768748]
We propose a relative information encoding method that yields positional and temporal enhanced representations.
Our method outperforms state-of-the-art methods on two public datasets.
arXiv Detail & Related papers (2021-07-29T14:12:19Z)
- Uncertainty-Aware Camera Pose Estimation from Points and Lines [101.03675842534415]
Perspective-n-Point-and-Line (PnPL) aims at fast, accurate and robust camera localization with respect to a 3D model from 2D-3D feature coordinates.
arXiv Detail & Related papers (2021-07-08T15:19:36Z)
- Learning to Estimate Hidden Motions with Global Motion Aggregation [71.12650817490318]
Occlusions pose a significant challenge to optical flow algorithms that rely on local evidence.
We introduce a global motion aggregation module to find long-range dependencies between pixels in the first image.
We demonstrate that the optical flow estimates in the occluded regions can be significantly improved without damaging the performance in non-occluded regions.
arXiv Detail & Related papers (2021-04-06T10:32:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.