Monocular Human Shape and Pose with Dense Mesh-borne Local Image Features
- URL: http://arxiv.org/abs/2111.05319v3
- Date: Thu, 11 Nov 2021 08:38:08 GMT
- Title: Monocular Human Shape and Pose with Dense Mesh-borne Local Image Features
- Authors: Shubhendu Jena, Franck Multon, Adnane Boukhayma
- Abstract summary: We propose to improve on graph convolution based approaches for human shape and pose estimation using pixel-aligned local image features.
Our results on standard benchmarks show that using local features improves on global ones and leads to competitive performances with respect to the state-of-the-art.
- Score: 8.422257363944295
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose to improve on graph convolution based approaches for human shape
and pose estimation from monocular input, using pixel-aligned local image
features. Given a single input color image, existing graph convolutional
network (GCN) based techniques for human shape and pose estimation use a single
convolutional neural network (CNN) generated global image feature appended to
all mesh vertices equally to initialize the GCN stage, which transforms a
template T-posed mesh into the target pose. In contrast, we propose for the
first time the idea of using local image features per vertex. These features
are sampled from the CNN image feature maps by utilizing pixel-to-mesh
correspondences generated with DensePose. Our quantitative and qualitative
results on standard benchmarks show that using local features improves on
global ones and leads to competitive performances with respect to the
state-of-the-art.
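As a rough illustration of the per-vertex sampling idea described in the abstract, the sketch below bilinearly samples a CNN feature map at the image location matched to each mesh vertex. It is not the authors' implementation: the names `bilinear_sample` and `per_vertex_features`, the nested-list feature map layout, and the zero vector used for vertices with no pixel correspondence are all assumptions made for this example. The pixel-to-vertex correspondences are taken as given here (in the paper they come from DensePose).

```python
from typing import List, Optional, Tuple

# Hypothetical layout: feature map indexed as [row][col][channel].
FeatureMap = List[List[List[float]]]


def bilinear_sample(fmap: FeatureMap, x: float, y: float) -> List[float]:
    """Bilinearly interpolate the feature vector at continuous pixel (x, y)."""
    h, w = len(fmap), len(fmap[0])
    # Clamp the query so the 2x2 neighbourhood stays inside the map.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    return [
        (1 - ay) * ((1 - ax) * fmap[y0][x0][c] + ax * fmap[y0][x1][c])
        + ay * ((1 - ax) * fmap[y1][x0][c] + ax * fmap[y1][x1][c])
        for c in range(len(fmap[0][0]))
    ]


def per_vertex_features(
    fmap: FeatureMap,
    vertex_pixels: List[Optional[Tuple[float, float]]],
) -> List[List[float]]:
    """Return one local feature vector per mesh vertex.

    vertex_pixels[i] is the (x, y) image location matched to vertex i,
    or None when the vertex has no correspondence (assumption: such
    vertices receive a zero feature vector).
    """
    channels = len(fmap[0][0])
    return [
        bilinear_sample(fmap, p[0], p[1]) if p is not None else [0.0] * channels
        for p in vertex_pixels
    ]


# Tiny usage example: a 2x2 feature map with a single channel.
fmap = [[[0.0], [1.0]], [[2.0], [3.0]]]
features = per_vertex_features(fmap, [(0.5, 0.5), (1.0, 0.0), None])
```

In a global-feature baseline, every vertex would instead receive the same pooled vector; here each vertex gets a feature that depends on where it projects in the image.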
Related papers
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers that are well trained on massive image collections.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- PoNQ: a Neural QEM-based Mesh Representation [33.81124790808585]
We introduce a learnable mesh representation through a set of local 3D sample Points and their associated Normals and Quadric error metrics (QEM).
A global mesh is directly derived from PoNQ by efficiently leveraging the knowledge of the local quadric errors.
We demonstrate the efficacy of PoNQ through a learning-based mesh prediction from SDF grids.
arXiv Detail & Related papers (2024-03-19T16:15:08Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
- Learning Self-Prior for Mesh Inpainting Using Self-Supervised Graph Convolutional Networks [4.424836140281846]
We present a self-prior-based mesh inpainting framework that requires only an incomplete mesh as input.
Our method maintains the polygonal mesh format throughout the inpainting process.
We demonstrate that our method outperforms traditional dataset-independent approaches.
arXiv Detail & Related papers (2023-05-01T02:51:38Z)
- Shape Preserving Facial Landmarks with Graph Attention Networks [3.996275177789895]
We propose a model based on the combination of a CNN with a cascade of Graph Attention Network regressors.
We introduce an encoding that jointly represents the appearance and location of facial landmarks and an attention mechanism to weigh the information according to its reliability.
Experiments confirm that the proposed model learns a global representation of the structure of the face, achieving top performance in popular benchmarks on head pose and landmark estimation.
arXiv Detail & Related papers (2022-10-13T17:58:02Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
- Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We further improve the shape quality by leveraging cross-view information with a graph convolution network.
Our model is robust to the quality of the initial mesh and the error of camera pose, and can be combined with a differentiable function for test-time optimization.
arXiv Detail & Related papers (2022-04-21T03:42:31Z)
- Learning Spatial Context with Graph Neural Network for Multi-Person Pose Grouping [71.59494156155309]
Bottom-up approaches for image-based multi-person pose estimation consist of two stages: keypoint detection and grouping.
In this work, we formulate the grouping task as a graph partitioning problem, where we learn the affinity matrix with a Graph Neural Network (GNN).
The learned geometry-based affinity is further fused with appearance-based affinity to achieve robust keypoint association.
arXiv Detail & Related papers (2021-04-06T09:21:14Z)
- Pose-GNN : Camera Pose Estimation System Using Graph Neural Networks [12.12580095956898]
We propose a novel image-based localization system using graph neural networks (GNN).
The pretrained ResNet50 convolutional neural network (CNN) architecture is used to extract the important features for each image.
We show that using GNN leads to enhanced performance for both indoor and outdoor environments.
arXiv Detail & Related papers (2021-03-17T04:40:02Z)
- Locality Preserving Dense Graph Convolutional Networks with Graph Context-Aware Node Representations [19.623379678611744]
Graph convolutional networks (GCNs) have been widely used for representation learning on graph data.
In many graph classification applications, GCN-based approaches have outperformed traditional methods.
We propose a locality-preserving dense GCN with graph context-aware node representations.
arXiv Detail & Related papers (2020-10-12T02:12:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.