Adaptive Graphical Model Network for 2D Handpose Estimation
- URL: http://arxiv.org/abs/1909.08205v2
- Date: Fri, 28 Apr 2023 04:08:29 GMT
- Title: Adaptive Graphical Model Network for 2D Handpose Estimation
- Authors: Deying Kong, Yifei Chen, Haoyu Ma, Xiangyi Yan, Xiaohui Xie
- Abstract summary: We propose a new architecture to tackle the task of 2D hand pose estimation from a monocular RGB image.
The Adaptive Graphical Model Network (AGMN) consists of two branches of deep convolutional neural networks for calculating unary and pairwise potential functions.
Our approach outperforms the state-of-the-art method for 2D hand keypoint estimation by a notable margin on two public datasets.
- Score: 19.592024471753025
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we propose a new architecture called Adaptive Graphical Model
Network (AGMN) to tackle the task of 2D hand pose estimation from a monocular
RGB image. The AGMN consists of two branches of deep convolutional neural
networks for calculating unary and pairwise potential functions, followed by a
graphical model inference module for integrating unary and pairwise potentials.
Unlike existing architectures proposed to combine DCNNs with graphical models,
our AGMN is novel in that the parameters of its graphical model are conditioned
on and fully adaptive to individual input images. Experiments show that our
approach outperforms the state-of-the-art method for 2D hand keypoint
estimation by a notable margin on two public datasets. Code can be found at
https://github.com/deyingk/agmn.
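The abstract describes two DCNN branches (unary and pairwise potentials) whose outputs are combined by a graphical-model inference module, with the pairwise parameters predicted per input image. The official code is at the repository above; as a rough illustration of the inference step only, here is a minimal NumPy sketch (all names and shapes are hypothetical, not the authors' implementation): unary score maps are refined by correlating a parent keypoint's belief with an image-conditioned spatial offset kernel and multiplying the resulting message into the child's map.

```python
import numpy as np

def correlate2d_same(a, kernel):
    """'Same'-size 2D correlation with zero padding (avoids a SciPy dependency)."""
    H, W = a.shape
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(a, ((ph, ph), (pw, pw)))
    out = np.empty_like(a, dtype=float)
    for y in range(H):
        for x in range(W):
            out[y, x] = (padded[y:y + kh, x:x + kw] * kernel).sum()
    return out

def refine_keypoints(unary, pairwise, edges):
    """One root-to-leaf round of message passing on a kinematic tree.

    unary:    (K, H, W) per-keypoint score maps from the unary branch.
    pairwise: dict mapping edge (i, j) -> (kh, kw) offset kernel; in AGMN the
              analogous parameters are predicted per input image ("adaptive").
    edges:    list of (parent, child) pairs in topological order.
    Returns refined (K, H, W) beliefs.
    """
    beliefs = unary.astype(float).copy()
    for (i, j) in edges:
        # Message from parent i to child j: parent belief spread by the
        # expected spatial offset, then fused multiplicatively.
        msg = correlate2d_same(beliefs[i], pairwise[(i, j)])
        beliefs[j] = beliefs[j] * msg
        s = beliefs[j].sum()
        if s > 0:
            beliefs[j] /= s  # keep the score map normalized
    return beliefs
```

In this toy setup, a confident parent keypoint plus a kernel encoding the expected parent-to-child offset can suppress a spurious peak in an ambiguous child heatmap, which is the intuition behind combining unary and pairwise potentials.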
Related papers
- Scalable Weibull Graph Attention Autoencoder for Modeling Document Networks [50.42343781348247]
We develop a graph Poisson factor analysis (GPFA) which provides analytic conditional posteriors to improve the inference accuracy.
We also extend GPFA to a multi-stochastic-layer version named graph Poisson gamma belief network (GPGBN) to capture the hierarchical document relationships at multiple semantic levels.
Our models can extract high-quality hierarchical latent document representations and achieve promising performance on various graph analytic tasks.
arXiv Detail & Related papers (2024-10-13T02:22:14Z)
- DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features [65.8738034806085]
DistillNeRF is a self-supervised learning framework for understanding 3D environments in autonomous driving scenes.
Our method is a generalizable feedforward model that predicts a rich neural scene representation from sparse, single-frame multi-view camera inputs.
arXiv Detail & Related papers (2024-06-17T21:15:13Z)
- Iterative Graph Filtering Network for 3D Human Pose Estimation [5.177947445379688]
Graph convolutional networks (GCNs) have proven to be an effective approach for 3D human pose estimation.
In this paper, we introduce an iterative graph filtering framework for 3D human pose estimation.
Our approach builds upon the idea of iteratively solving graph filtering with Laplacian regularization.
arXiv Detail & Related papers (2023-07-29T20:46:44Z)
- Interpretable 2D Vision Models for 3D Medical Images [47.75089895500738]
This study proposes a simple approach of adapting 2D networks with an intermediate feature representation for processing 3D images.
On all 3D MedMNIST benchmark datasets and two real-world datasets of several hundred high-resolution CT or MRI scans, we show that our approach performs on par with existing methods.
arXiv Detail & Related papers (2023-07-13T08:27:09Z)
- Deep Graph Reprogramming [112.34663053130073]
"Deep graph reprogramming" is a model-reusing task tailored for graph neural networks (GNNs).
We propose an innovative Data Reprogramming paradigm alongside a Model Reprogramming paradigm.
arXiv Detail & Related papers (2023-04-28T02:04:29Z)
- K-Order Graph-oriented Transformer with GraAttention for 3D Pose and Shape Estimation [20.711789781518753]
We propose a novel attention-based 2D-to-3D pose estimation network for graph-structured data, named KOG-Transformer.
We also propose a 3D pose-to-shape estimation network for hand data, named GASE-Net.
arXiv Detail & Related papers (2022-08-24T06:54:03Z)
- Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images [79.70127290464514]
We decompose the task into two stages, i.e. person localization and pose estimation, and propose three task-specific graph neural networks for effective message passing.
Our approach achieves state-of-the-art performance on CMU Panoptic and Shelf datasets.
arXiv Detail & Related papers (2021-09-13T11:44:07Z)
- Pose-GNN : Camera Pose Estimation System Using Graph Neural Networks [12.12580095956898]
We propose a novel image-based localization system using graph neural networks (GNNs).
The pretrained ResNet50 convolutional neural network (CNN) architecture is used to extract the important features for each image.
We show that using GNN leads to enhanced performance for both indoor and outdoor environments.
arXiv Detail & Related papers (2021-03-17T04:40:02Z)
- Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
- Two-shot Spatially-varying BRDF and Shape Estimation [89.29020624201708]
We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF.
We create a large-scale synthetic training dataset with domain-randomized geometry and realistic materials.
Experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
arXiv Detail & Related papers (2020-04-01T12:56:13Z)
- Rotation-invariant Mixed Graphical Model Network for 2D Hand Pose Estimation [21.19641797725211]
We propose a new architecture named Rotation-invariant Mixed Graphical Model Network (R-MGMN).
By integrating a rotation net, the R-MGMN is invariant to rotations of the hand in the image.
We evaluate the R-MGMN on two public hand pose datasets.
arXiv Detail & Related papers (2020-02-05T23:05:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.