Related papers: Joint angle model based learning to refine kinematic human pose estimation

Joint angle model based learning to refine kinematic human pose estimation

URL: http://arxiv.org/abs/2507.11075v1
Date: Tue, 15 Jul 2025 08:16:39 GMT
Title: Joint angle model based learning to refine kinematic human pose estimation
Authors: Chang Peng, Yifei Zhou, Huifeng Xi, Shiqing Huang, Chuangye Chen, Jianming Yang, Bao Yang, Zhenyu Jiang,
Abstract summary: Current human pose estimation (HPE) suffers from occasional errors in keypoint recognition and random fluctuation in keypoint trajectories.<n>This paper proposed a method to overcome the difficulty through joint angle-based modeling.<n>A bidirectional recurrent network is designed as a post-processing module to refine the estimation of well-established HRNet.
Score: 8.6527127612359
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Marker-free human pose estimation (HPE) has found increasing applications in various fields. Current HPE suffers from occasional errors in keypoint recognition and random fluctuation in keypoint trajectories when analyzing kinematic human poses. The performance of existing deep learning-based models for HPE refinement is considerably limited by inaccurate training datasets in which the keypoints are manually annotated. This paper proposed a novel method to overcome the difficulty through joint angle-based modeling. The key techniques include: (i) A joint angle-based model of human pose, which is robust to describe kinematic human poses; (ii) Approximating temporal variation of joint angles through high order Fourier series to get reliable "ground truth"; (iii) A bidirectional recurrent network is designed as a post-processing module to refine the estimation of well-established HRNet. Trained with the high-quality dataset constructed using our method, the network demonstrates outstanding performance to correct wrongly recognized joints and smooth their spatiotemporal trajectories. Tests show that joint angle-based refinement (JAR) outperforms the state-of-the-art HPE refinement network in challenging cases like figure skating and breaking.

Related papers

NLML-HPE: Head Pose Estimation with Limited Data via Manifold Learning [0.8716913598251385]
Head pose estimation (HPE) plays a critical role in various computer vision applications such as human-computer interaction and facial recognition.<n>We propose a novel deep learning approach for head pose estimation with limited training data via non-linear manifold learning.<n>We achieve real-time performance with limited training data as our method accurately captures the nature of rotation of an object from facial landmarks.
arXiv Detail & Related papers (2025-07-24T14:08:33Z)
Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment [49.36799270585947]
No-reference point cloud quality assessment (NR-PCQA) aims to automatically evaluate the perceptual quality of distorted point clouds without available reference. We propose a novel contrastive pre-training framework tailored for PCQA (CoPA) Our method outperforms the state-of-the-art PCQA methods on popular benchmarks.
arXiv Detail & Related papers (2024-03-15T07:16:07Z)
FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [59.13757801286343]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.<n>We introduce the FILP-3D framework with two novel components: the Redundant Feature Eliminator (RFE) for feature space misalignment and the Spatial Noise Compensator (SNC) for significant noise.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
Exploiting Point-Wise Attention in 6D Object Pose Estimation Based on Bidirectional Prediction [22.894810893732416]
The paper proposes a bidirectional correspondence prediction network with a point-wise attention-aware mechanism. Our key insight is that the correlations between each model point and scene point provide essential information for learning point-pair matches. Experimental results on the public datasets of LineMOD, YCB-Video, and Occ-LineMOD show that the proposed method achieves better performance than other state-of-the-art methods.
arXiv Detail & Related papers (2023-08-16T17:13:45Z)
Discover, Explanation, Improvement: An Automatic Slice Detection Framework for Natural Language Processing [72.14557106085284]
slice detection models (SDM) automatically identify underperforming groups of datapoints. This paper proposes a benchmark named "Discover, Explain, improve (DEIM)" for classification NLP tasks. Our evaluation shows that Edisa can accurately select error-prone datapoints with informative semantic features.
arXiv Detail & Related papers (2022-11-08T19:00:00Z)
A Model for Multi-View Residual Covariances based on Perspective Deformation [88.21738020902411]
We derive a model for the covariance of the visual residuals in multi-view SfM, odometry and SLAM setups. We validate our model with synthetic and real data and integrate it into photometric and feature-based Bundle Adjustment.
arXiv Detail & Related papers (2022-02-01T21:21:56Z)
Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation [79.78017059539526]
We propose a new heatmap-free keypoint estimation method in which individual keypoints and sets of spatially related keypoints (i.e., poses) are modeled as objects within a dense single-stage anchor-based detection framework. In experiments, we observe that KAPAO is significantly faster and more accurate than previous methods, which suffer greatly from heatmap post-processing. Our large model, KAPAO-L, achieves an AP of 70.6 on the Microsoft COCO Keypoints validation set without test-time augmentation.
arXiv Detail & Related papers (2021-11-16T15:36:44Z)
Revisit Geophysical Imaging in A New View of Physics-informed Generative Adversarial Learning [2.12121796606941]
Full waveform inversion produces high-resolution subsurface models. FWI with least-squares function suffers from many drawbacks such as the local-minima problem. Recent works relying on partial differential equations and neural networks show promising performance for two-dimensional FWI. We propose an unsupervised learning paradigm that integrates wave equation with a discriminate network to accurately estimate the physically consistent models.
arXiv Detail & Related papers (2021-09-23T15:54:40Z)
Real-time Pose and Shape Reconstruction of Two Interacting Hands With a Single Depth Camera [79.41374930171469]
We present a novel method for real-time pose and shape reconstruction of two strongly interacting hands. Our approach combines an extensive list of favorable properties, namely it is marker-less. We show state-of-the-art results in scenes that exceed the complexity level demonstrated by previous work.
arXiv Detail & Related papers (2021-06-15T11:39:49Z)
Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration [67.69257782645789]
We propose piecewise transformation fields that learn 3D translation vectors to map any query point in posed space to its correspond position in rest-pose space. We show that fitting parametric models with poses by our network results in much better registration quality, especially for extreme poses.
arXiv Detail & Related papers (2021-04-16T15:16:09Z)
SIMPLE: SIngle-network with Mimicking and Point Learning for Bottom-up Human Pose Estimation [81.03485688525133]
We propose a novel multi-person pose estimation framework, SIngle-network with Mimicking and Point Learning for Bottom-up Human Pose Estimation (SIMPLE) Specifically, in the training process, we enable SIMPLE to mimic the pose knowledge from the high-performance top-down pipeline. Besides, SIMPLE formulates human detection and pose estimation as a unified point learning framework to complement each other in single-network.
arXiv Detail & Related papers (2021-04-06T13:12:51Z)
Self-Regression Learning for Blind Hyperspectral Image Fusion Without Label [11.291055330647977]
We propose a self-regression learning method that reconstructs hyperspectral image (HSI) and estimate the observation model. In particular, we adopt an invertible neural network (INN) for restoring the HSI, and two fully-connected networks (FCN) for estimating the observation model. Our model can outperform the state-of-the-art methods in experiments on both synthetic and real-world dataset.
arXiv Detail & Related papers (2021-03-31T04:48:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.