TransPose: Real-time 3D Human Translation and Pose Estimation with Six Inertial Sensors
- URL: http://arxiv.org/abs/2105.04605v1
- Date: Mon, 10 May 2021 18:41:42 GMT
- Title: TransPose: Real-time 3D Human Translation and Pose Estimation with Six Inertial Sensors
- Authors: Xinyu Yi, Yuxiao Zhou, Feng Xu
- Abstract summary: We present TransPose, a DNN-based approach to perform full motion capture from only 6 Inertial Measurement Units (IMUs) at over 90 fps.
For body pose estimation, we propose a multi-stage network that estimates leaf-to-full joint positions as intermediate results.
For global translation estimation, we propose a supporting-foot-based method and an RNN-based method to robustly solve for the global translations.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motion capture is facing new possibilities brought by inertial sensing technologies, which, unlike vision-based solutions, suffer from neither occlusion nor restrictions on the recording volume. However, as the recorded signals are sparse and quite noisy, online performance and global translation estimation turn out to be two key difficulties. In this paper, we present TransPose, a DNN-based approach to perform full motion capture (with both global translations and body poses) from only 6 Inertial Measurement Units (IMUs) at over 90 fps. For body pose estimation, we propose a multi-stage network that estimates leaf-to-full joint positions as intermediate results. This design makes the pose estimation much easier, achieving both better accuracy and lower computational cost. For global translation estimation, we propose a supporting-foot-based method and an RNN-based method, and robustly solve for the global translations with a confidence-based fusion technique. Quantitative and qualitative comparisons show that our method outperforms state-of-the-art learning- and optimization-based methods by a large margin in both accuracy and efficiency. As a purely inertial sensor-based approach, our method is not limited by environmental settings (e.g., fixed cameras), keeping the capture free from common difficulties such as wide-range motion spaces and strong occlusion.
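The two ideas in the abstract lend themselves to a compact sketch. Below is a minimal, hypothetical PyTorch illustration of (1) a multi-stage network that predicts leaf-joint positions, then all-joint positions, then the pose, and (2) a confidence-based fusion of a supporting-foot-based translation estimate with an RNN-based one. All module choices, dimensions, thresholds, and the fusion rule here are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class StagewisePoseNet(nn.Module):
    """Multi-stage pose estimator: IMU signals -> leaf-joint positions ->
    all-joint positions -> joint rotations (all sizes are assumptions)."""

    def __init__(self, imu_dim=72, n_leaf=5, n_joint=23, pose_dim=90, hidden=256):
        super().__init__()
        # One recurrent stage per intermediate result; the paper uses a
        # multi-stage network, but the cell types and widths here are guesses.
        self.stage1 = nn.LSTM(imu_dim, hidden, batch_first=True)
        self.head1 = nn.Linear(hidden, n_leaf * 3)        # leaf-joint positions
        self.stage2 = nn.LSTM(imu_dim + n_leaf * 3, hidden, batch_first=True)
        self.head2 = nn.Linear(hidden, n_joint * 3)       # all-joint positions
        self.stage3 = nn.LSTM(imu_dim + n_joint * 3, hidden, batch_first=True)
        self.head3 = nn.Linear(hidden, pose_dim)          # joint rotations

    def forward(self, imu):  # imu: (batch, time, imu_dim)
        leaf = self.head1(self.stage1(imu)[0])                          # stage 1
        full = self.head2(self.stage2(torch.cat([imu, leaf], -1))[0])   # stage 2
        pose = self.head3(self.stage3(torch.cat([imu, full], -1))[0])   # stage 3
        return leaf, full, pose


def fuse_translation(v_foot, v_rnn, contact_prob, lo=0.5, hi=0.9):
    """Blend a supporting-foot-based velocity estimate with an RNN-based one,
    weighted by foot-contact confidence (thresholds are assumptions)."""
    w = ((contact_prob - lo) / (hi - lo)).clamp(0.0, 1.0)  # confidence weight
    return w * v_foot + (1.0 - w) * v_rnn
```

In this sketch, the fusion degenerates to the foot-based estimate when a foot is confidently planted and to the learned estimate otherwise, which matches the robustness argument in the abstract.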
Related papers
- SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning
We introduce SCIPaD, a novel approach that incorporates spatial clues for unsupervised depth-pose joint learning.
SCIPaD achieves a reduction of 22.2% in average translation error and 34.8% in average angular error on the camera pose estimation task on the KITTI Odometry dataset.
arXiv Detail & Related papers (2024-07-07T06:52:51Z)
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues
UPose3D is a novel approach for multi-view 3D human pose estimation.
It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z)
- VICAN: Very Efficient Calibration Algorithm for Large Camera Networks
We introduce a novel methodology that extends Pose Graph Optimization techniques.
We consider the bipartite graph encompassing cameras, dynamically evolving object poses, and camera-object relative transformations at each time step.
Our framework retains compatibility with traditional PGO solvers, but its efficacy benefits from a custom-tailored optimization scheme.
arXiv Detail & Related papers (2024-03-25T17:47:03Z)
- FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation
Estimating relative camera poses between images has been a central problem in computer vision.
We show how to combine the best of both classes of methods; our approach yields results that are both precise and robust.
A comprehensive analysis supports our design choices and demonstrates that our method adapts flexibly to various feature extractors and correspondence estimators.
arXiv Detail & Related papers (2024-03-05T18:59:51Z)
- Match and Locate: low-frequency monocular odometry based on deep feature matching
We introduce a novel approach to robotic odometry that requires only a single camera.
The approach is based on matching image features between consecutive frames of the video stream using deep feature matching models.
We evaluate the performance of the approach in the AISG-SLA Visual Localisation Challenge and find that, while being computationally efficient and easy to implement, our method shows competitive results.
arXiv Detail & Related papers (2023-11-16T17:32:58Z)
- View Consistent Purification for Accurate Cross-View Localization
This paper proposes a fine-grained self-localization method for outdoor robotics.
The proposed method addresses limitations in existing cross-view localization methods.
It is the first sparse visual-only method that enhances perception in dynamic environments.
arXiv Detail & Related papers (2023-08-16T02:51:52Z)
- Occlusion-Robust Object Pose Estimation with Holistic Representation
State-of-the-art (SOTA) object pose estimators take a two-stage approach.
We develop a novel occlude-and-blackout batch augmentation technique.
We also develop a multi-precision supervision architecture to encourage holistic pose representation learning.
arXiv Detail & Related papers (2021-10-22T08:00:26Z)
- Improving Robustness and Accuracy via Relative Information Encoding in 3D Human Pose Estimation
We propose a relative information encoding method that yields positional and temporal enhanced representations.
Our method outperforms state-of-the-art methods on two public datasets.
arXiv Detail & Related papers (2021-07-29T14:12:19Z)
- Uncertainty-Aware Camera Pose Estimation from Points and Lines
Perspective-n-Point-and-Line (PnPL) aims at fast, accurate, and robust camera localization with respect to a 3D model from 2D-3D feature coordinates.
arXiv Detail & Related papers (2021-07-08T15:19:36Z)
- Learning to Estimate Hidden Motions with Global Motion Aggregation
Occlusions pose a significant challenge to optical flow algorithms that rely on local evidence.
We introduce a global motion aggregation module to find long-range dependencies between pixels in the first image.
We demonstrate that optical flow estimates in occluded regions can be significantly improved without harming performance in non-occluded regions.
arXiv Detail & Related papers (2021-04-06T10:32:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.