Related papers: PartHOI: Part-based Hand-Object Interaction Transfer via Generalized Cylinders

URL: http://arxiv.org/abs/2504.20599v1
Date: Tue, 29 Apr 2025 09:56:29 GMT
Title: PartHOI: Part-based Hand-Object Interaction Transfer via Generalized Cylinders
Authors: Qiaochu Wang, Chufeng Xiao, Manfred Lau, Hongbo Fu
Abstract summary: Learning-based methods to understand and model hand-object interactions (HOI) require a large amount of high-quality HOI data. One way to create HOI data is to transfer hand poses from a source object to another based on the objects' geometry. PartHOI establishes a robust geometric correspondence between object parts, and enables the transfer of contact points.
Score: 15.1049019475729
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Learning-based methods to understand and model hand-object interactions (HOI) require a large amount of high-quality HOI data. One way to create HOI data is to transfer hand poses from a source object to another based on the objects' geometry. However, current methods for transferring hand poses between objects rely on shape matching, which limits their ability to transfer poses across different categories due to differences in shape and size. We observe that HOI often involves specific semantic parts of objects, which often have more consistent shapes across categories. In addition, constructing size-invariant correspondences between these parts is important for cross-category transfer. Based on these insights, we introduce a novel method, PartHOI, for part-based HOI transfer. Using a generalized cylinder representation to parameterize an object part's geometry, PartHOI establishes a robust geometric correspondence between object parts, and enables the transfer of contact points. Given the transferred points, we optimize a hand pose to fit the target object well. Qualitative and quantitative results demonstrate that our method can generalize HOI transfers well even for cross-category objects, and produce high-fidelity results that are superior to those of existing methods.
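
Illustrative sketch (not from the paper): the abstract's idea of a size-invariant correspondence between parts can be pictured with a much simpler stand-in than a true generalized cylinder. The Python below assumes a straight axis and circular cross-sections, and maps each contact point to a normalized height t in [0, 1] and an angle theta on the source part, then re-evaluates the same (t, theta) on the target part. The class StraightCylinderPart, the function transfer_contacts, and all parameter values are hypothetical names chosen for this example; the paper's actual parameterization and its hand-pose optimization step are not reproduced here.

```python
import numpy as np


class StraightCylinderPart:
    """Hypothetical simplified part: straight axis, circular cross-sections.

    radius_fn maps a normalized height t in [0, 1] to a radius.
    """

    def __init__(self, base, axis_dir, length, radius_fn):
        self.base = np.asarray(base, dtype=float)
        d = np.asarray(axis_dir, dtype=float)
        self.d = d / np.linalg.norm(d)
        self.length = float(length)
        self.radius_fn = radius_fn
        # Build an orthonormal frame (u, v, d) around the axis.
        helper = np.array([1.0, 0.0, 0.0])
        if abs(self.d @ helper) > 0.9:
            helper = np.array([0.0, 1.0, 0.0])
        u = np.cross(self.d, helper)
        self.u = u / np.linalg.norm(u)
        self.v = np.cross(self.d, self.u)

    def to_params(self, point):
        """Map a 3D point to (t, theta); the radial distance is discarded."""
        rel = np.asarray(point, dtype=float) - self.base
        height = rel @ self.d
        t = float(np.clip(height / self.length, 0.0, 1.0))
        radial = rel - height * self.d
        theta = float(np.arctan2(radial @ self.v, radial @ self.u))
        return t, theta

    def from_params(self, t, theta):
        """Map (t, theta) to the corresponding point on this part's surface."""
        r = self.radius_fn(t)
        ring = np.cos(theta) * self.u + np.sin(theta) * self.v
        return self.base + t * self.length * self.d + r * ring


def transfer_contacts(source, target, contacts):
    """Re-express each source contact in normalized coordinates on the target."""
    return np.array([target.from_params(*source.to_params(p)) for p in contacts])


# Assumed toy parts: a thin short part and a thicker, longer one.
source = StraightCylinderPart([0, 0, 0], [0, 0, 1], 0.1, lambda t: 0.02)
target = StraightCylinderPart([1, 0, 0], [0, 0, 1], 0.3, lambda t: 0.05)
moved = transfer_contacts(source, target, [[0.02, 0.0, 0.05]])
```

Because both height and angle are normalized, a contact halfway along the thin source part lands halfway along the thicker target part at the same angular position, regardless of the difference in size.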

Related papers

Detection Based Part-level Articulated Object Reconstruction from Single RGBD Image [52.11275397911693]
We propose an end-to-end trainable, cross-category method for reconstructing multiple man-made articulated objects from a single RGBD image. We depart from previous works that rely on learning instance-level latent space, focusing on man-made articulated objects with predefined part counts. Our method successfully reconstructs variously structured multiple instances that previous works cannot handle, and outperforms prior works in shape reconstruction and kinematics estimation.
arXiv Detail & Related papers (2025-04-04T05:08:04Z)
Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model [72.90370736032115]
We present a novel video Reenactment framework focusing on Human-Object Interaction (HOI) via an adaptive layout-instructed Diffusion model (Re-HOLD). Our key insight is to employ specialized layout representation for hands and objects, respectively. To further improve the generation quality of HOI, we design an interactive textural enhancement module for both hands and objects.
arXiv Detail & Related papers (2025-03-21T08:40:35Z)
Interior Object Geometry via Fitted Frames [17.891216185367398]
We describe a representation targeted for anatomic objects which is designed to enable strong locational correspondence within object populations. We show notably improved classification performance by this new representation, which we call the evolutionary s-rep. The geometric features that are derived from each of the representations, especially via fitted frames, are discussed.
arXiv Detail & Related papers (2024-07-19T14:38:47Z)
Implicit Modeling of Non-rigid Objects with Cross-Category Signals [28.956412015920936]
MODIF is a multi-object deep implicit function that jointly learns the deformation fields and instance-specific latent codes for multiple objects at once. We show that MODIF can proficiently learn the shape representation of each organ and their relations to others, to the point that shapes missing from unseen instances can be consistently recovered.
arXiv Detail & Related papers (2023-12-15T22:34:17Z)
CHORD: Category-level Hand-held Object Reconstruction via Shape Deformation [40.58622555407404]
In daily life, humans utilize hands to manipulate objects. Previous approaches have encountered difficulties in reconstructing the precise shapes of hand-held objects. We propose a new method, CHORD, for Category-level Hand-held Object Reconstruction via shape Deformation.
arXiv Detail & Related papers (2023-08-21T09:14:18Z)
Pairwise-Constrained Implicit Functions for 3D Human Heart Modelling [60.56741715207466]
We introduce a pairwise-constrained SDF approach that models the heart as a set of interdependent SDFs. Our method significantly improves inner structure accuracy over single-SDF, UDF-based, voxel-based, and segmentation-based reconstructions.
arXiv Detail & Related papers (2023-07-16T10:07:15Z)
LocPoseNet: Robust Location Prior for Unseen Object Pose Estimation [69.70498875887611]
LocPoseNet is able to robustly learn location prior for unseen objects. Our method outperforms existing works by a large margin on LINEMOD and GenMOP.
arXiv Detail & Related papers (2022-11-29T15:21:34Z)
Interacting Hand-Object Pose Estimation via Dense Mutual Attention [97.26400229871888]
3D hand-object pose estimation is the key to the success of many computer vision applications. We propose a novel dense mutual attention mechanism that is able to model fine-grained dependencies between the hand and the object. Our method is able to produce physically plausible poses with high quality and real-time inference speed.
arXiv Detail & Related papers (2022-11-16T10:01:33Z)
Generative Category-Level Shape and Pose Estimation with Semantic Primitives [27.692997522812615]
We propose a novel framework for category-level object shape and pose estimation from a single RGB-D image. To handle the intra-category variation, we adopt a semantic primitive representation that encodes diverse shapes into a unified latent space. We show that the proposed method achieves SOTA pose estimation performance and better generalization in the real-world dataset.
arXiv Detail & Related papers (2022-10-03T17:51:54Z)
Collaborative Learning for Hand and Object Reconstruction with Attention-guided Graph Convolution [49.10497573378427]
Estimating the pose and shape of hands and objects under interaction finds numerous applications including augmented and virtual reality. Our algorithm is agnostic to object models, and it learns the physical rules governing hand-object interaction. Experiments using four widely-used benchmarks show that our framework achieves beyond state-of-the-art accuracy in 3D pose estimation, as well as recovers dense 3D hand and object shapes.
arXiv Detail & Related papers (2022-04-27T17:00:54Z)
Continuous Surface Embeddings [76.86259029442624]
We focus on the task of learning and representing dense correspondences in deformable object categories. We propose a new, learnable image-based representation of dense correspondences. We demonstrate that the proposed approach performs on par with or better than the state-of-the-art methods for dense pose estimation for humans.
arXiv Detail & Related papers (2020-11-24T22:52:15Z)
Joint Hand-object 3D Reconstruction from a Single Image with Cross-branch Feature Fusion [78.98074380040838]
We propose to consider hand and object jointly in feature space and explore the reciprocity of the two branches. We employ an auxiliary depth estimation module to augment the input RGB image with the estimated depth map. Our approach significantly outperforms existing approaches in terms of the reconstruction accuracy of objects.
arXiv Detail & Related papers (2020-06-28T09:50:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.