Dynamic Hyperbolic Attention Network for Fine Hand-object Reconstruction
- URL: http://arxiv.org/abs/2309.02965v1
- Date: Wed, 6 Sep 2023 13:00:10 GMT
- Title: Dynamic Hyperbolic Attention Network for Fine Hand-object Reconstruction
- Authors: Zhiying Leng, Shun-Cheng Wu, Mahdi Saleh, Antonio Montanaro, Hao Yu,
Yin Wang, Nassir Navab, Xiaohui Liang, Federico Tombari
- Abstract summary: We propose the first precise hand-object reconstruction method in hyperbolic space, namely Dynamic Hyperbolic Attention Network (DHANet)
Our method learns mesh features with rich geometry-image multi-modal information and models better hand-object interaction.
- Score: 76.5549647815413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reconstructing both objects and hands in 3D from a single RGB image is
complex. Existing methods rely on manually defined hand-object constraints in
Euclidean space, leading to suboptimal feature learning. Compared with
Euclidean space, hyperbolic space better preserves the geometric properties of
meshes thanks to its exponentially-growing space distance, which amplifies the
differences between the features based on similarity. In this work, we propose
the first precise hand-object reconstruction method in hyperbolic space, namely
Dynamic Hyperbolic Attention Network (DHANet), which leverages intrinsic
properties of hyperbolic space to learn representative features. Our method,
which projects mesh and image features into a unified hyperbolic space, comprises
two modules: dynamic hyperbolic graph convolution and image-attention
hyperbolic graph convolution. With these two modules, our method learns mesh
features with rich geometry-image multi-modal information and models better
hand-object interaction. Our method provides a promising alternative for fine
hand-object reconstruction in hyperbolic space. Extensive experiments on three
public datasets demonstrate that our method outperforms most state-of-the-art
methods.
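The abstract's claim about "exponentially-growing space distance" can be made concrete with the geodesic distance on the Poincaré ball, a standard model of hyperbolic space. The sketch below is illustrative only (it assumes the curvature -1 Poincaré ball, a common choice, not necessarily the exact formulation used by DHANet): two point pairs with the same Euclidean separation are much farther apart, hyperbolically, near the boundary of the ball than near its origin, which is what amplifies differences between features.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit ball
    (Poincare ball model, curvature -1)."""
    sq_norm = lambda x: sum(c * c for c in x)
    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff / denom)

# Both pairs are 0.1 apart in Euclidean terms, but the pair near the
# boundary is an order of magnitude farther apart hyperbolically.
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_boundary = poincare_distance([0.89, 0.0], [0.99, 0.0])
```

Here `near_origin` is roughly 0.2 while `near_boundary` exceeds 2, illustrating why hyperbolic embeddings separate hierarchically distant features more sharply than Euclidean ones.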
Related papers
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Object-centric voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
arXiv Detail & Related papers (2024-07-30T15:33:58Z) - Ghost on the Shell: An Expressive Representation of General 3D Shapes [97.76840585617907]
Meshes are appealing since they enable fast physics-based rendering with realistic material and lighting.
Recent work on reconstructing and statistically modeling 3D shapes has critiqued meshes as being topologically inflexible.
We parameterize open surfaces by defining a manifold signed distance field on watertight surfaces.
G-Shell achieves state-of-the-art performance on non-watertight mesh reconstruction and generation tasks.
arXiv Detail & Related papers (2023-10-23T17:59:52Z) - HMSN: Hyperbolic Self-Supervised Learning by Clustering with Ideal
Prototypes [7.665392786787577]
We use hyperbolic representation space for self-supervised representation learning for prototype-based clustering approaches.
We extend the Masked Siamese Networks to operate on the Poincaré ball model of hyperbolic space.
Unlike previous methods we project to the hyperbolic space at the output of the encoder network and utilise a hyperbolic projection head to ensure that the representations used for downstream tasks remain hyperbolic.
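Projecting encoder outputs into hyperbolic space, as the HMSN summary describes, is typically done with the exponential map at the origin of the Poincaré ball. The following is a minimal sketch of that standard map (with curvature -1), not the paper's actual projection head: it takes an arbitrary Euclidean feature vector and returns a point strictly inside the unit ball.

```python
import math

def exp_map_origin(x):
    """Exponential map at the origin of the Poincare ball:
    sends a Euclidean (tangent) vector into the open unit ball."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm == 0.0:
        return list(x)
    # tanh(norm) < 1 for all finite inputs, so the image lies in the ball.
    scale = math.tanh(norm) / norm
    return [scale * c for c in x]

embedded = exp_map_origin([3.0, 4.0])  # input norm 5 -> output norm tanh(5) < 1
```

Because `tanh` saturates, features of any magnitude land inside the ball, with large-norm features pushed toward the boundary where hyperbolic distances grow fastest.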
arXiv Detail & Related papers (2023-05-18T12:38:40Z) - Learning Pose Image Manifolds Using Geometry-Preserving GANs and
Elasticae [13.202747831999414]
Geometric Style-GAN (Geom-SGAN) maps images to low-dimensional latent representations.
Euler's elasticae smoothly interpolate between directed points (points plus tangent directions) in the low-dimensional latent space.
arXiv Detail & Related papers (2023-05-17T18:45:56Z) - Decoupled Iterative Refinement Framework for Interacting Hands
Reconstruction from a Single RGB Image [30.24438569170251]
We propose a decoupled iterative refinement framework to achieve pixel-aligned hand reconstruction.
Our method outperforms all existing two-hand reconstruction methods by a large margin on the InterHand2.6M dataset.
arXiv Detail & Related papers (2023-02-05T15:46:57Z) - MvDeCor: Multi-view Dense Correspondence Learning for Fine-grained 3D
Segmentation [91.6658845016214]
We propose to utilize self-supervised techniques in the 2D domain for fine-grained 3D shape segmentation tasks.
We render a 3D shape from multiple views, and set up a dense correspondence learning task within the contrastive learning framework.
As a result, the learned 2D representations are view-invariant and geometrically consistent.
arXiv Detail & Related papers (2022-08-18T00:48:15Z) - HRCF: Enhancing Collaborative Filtering via Hyperbolic Geometric
Regularization [52.369435664689995]
We introduce Hyperbolic Regularization powered Collaborative Filtering (HRCF) and design a geometry-aware hyperbolic regularizer.
Specifically, the proposal boosts the optimization procedure via root alignment and an origin-aware penalty.
Our proposal tackles the over-smoothing problem caused by hyperbolic aggregation and also gives the models better discriminative ability.
arXiv Detail & Related papers (2022-04-18T06:11:44Z) - Enhancing Hyperbolic Graph Embeddings via Contrastive Learning [7.901082408569372]
We propose a novel Hyperbolic Graph Contrastive Learning (HGCL) framework which learns node representations through multiple hyperbolic spaces.
Experimental results on multiple real-world datasets demonstrate the superiority of the proposed HGCL.
arXiv Detail & Related papers (2022-01-21T06:10:05Z) - Disentangling and Unifying Graph Convolutions for Skeleton-Based Action
Recognition [79.33539539956186]
We propose a simple method to disentangle multi-scale graph convolutions and a unified spatial-temporal graph convolutional operator named G3D.
By coupling these proposals, we develop a powerful feature extractor named MS-G3D based on which our model outperforms previous state-of-the-art methods on three large-scale datasets.
arXiv Detail & Related papers (2020-03-31T11:28:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.