Spatial and Surface Correspondence Field for Interaction Transfer
- URL: http://arxiv.org/abs/2405.03221v1
- Date: Mon, 6 May 2024 07:30:31 GMT
- Title: Spatial and Surface Correspondence Field for Interaction Transfer
- Authors: Zeyu Huang, Honghao Xu, Haibin Huang, Chongyang Ma, Hui Huang, Ruizhen Hu
- Abstract summary: We introduce a new method for the task of interaction transfer.
Our method characterizes the example interaction using a combined spatial and surface representation.
Experiments conducted on human-chair and hand-mug interaction transfer tasks show that our approach can handle larger geometry and topology variations.
- Score: 27.250373252507547
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce a new method for the task of interaction transfer. Given an example interaction between a source object and an agent, our method can automatically infer both surface and spatial relationships for the agent and target objects within the same category, yielding more accurate and valid transfers. Specifically, our method characterizes the example interaction using a combined spatial and surface representation. We correspond the agent points and object points related to the representation to the target object space using a learned spatial and surface correspondence field, which represents objects as deformed and rotated signed distance fields. With the corresponded points, an optimization is performed under the constraints of our spatial and surface interaction representation and additional regularization. Experiments conducted on human-chair and hand-mug interaction transfer tasks show that our approach can handle larger geometry and topology variations between source and target shapes, significantly outperforming state-of-the-art methods.
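The core mechanism in the abstract — corresponding interaction points into the target object's space and then optimizing them against a signed distance field — can be illustrated with a minimal sketch. Everything here is an assumption for illustration: an analytic sphere stands in for the learned, deformed SDF, and the corresponded points are placeholders for the output of the paper's correspondence field.

```python
import math

def sdf_sphere(p, radius=1.0):
    """Signed distance to a sphere centered at the origin (toy stand-in
    for the learned, deformed signed distance field)."""
    return math.sqrt(sum(c * c for c in p)) - radius

def sdf_gradient(sdf, p, eps=1e-5):
    """Central finite-difference gradient of an SDF at point p."""
    grad = []
    for i in range(3):
        hi = list(p); hi[i] += eps
        lo = list(p); lo[i] -= eps
        grad.append((sdf(hi) - sdf(lo)) / (2 * eps))
    return grad

def project_to_surface(sdf, p, iters=10):
    """Move p along the SDF gradient toward the zero level set -- a toy
    analogue of optimizing corresponded points onto the target surface."""
    for _ in range(iters):
        d = sdf(p)
        g = sdf_gradient(sdf, p)
        p = [c - d * gc for c, gc in zip(p, g)]
    return p

# Hypothetical contact points, already mapped into the target object's
# space by a correspondence field; the optimization step snaps them
# onto the target surface.
corresponded = [[1.4, 0.2, -0.3], [0.5, 0.9, 0.4]]
on_surface = [project_to_surface(sdf_sphere, p) for p in corresponded]
```

The real method additionally enforces the spatial and surface interaction constraints and regularization during this optimization; the sketch shows only the surface-projection step.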
Related papers
- GEARS: Local Geometry-aware Hand-object Interaction Synthesis [38.75942505771009]
We introduce a novel joint-centered sensor designed to reason about local object geometry near potential interaction regions.
As an important step towards mitigating the learning complexity, we transform the points from global frame to template hand frame and use a shared module to process sensor features of each individual joint.
This is followed by a perceptual-temporal transformer network aimed at capturing correlation among the joints in different dimensions.
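The frame change described above — re-expressing globally observed points in a template hand frame so one shared module can serve every joint — amounts to an inverse rigid transform. A minimal sketch, with the hand pose reduced to an origin and a yaw angle purely for illustration (the paper's actual hand frame is a full 3D pose):

```python
import math

def rotate_z(p, angle):
    """Rotate a 3D point about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = p
    return [c * x - s * y, s * x + c * y, z]

def global_to_hand_frame(p, hand_origin, hand_yaw):
    """Inverse rigid transform: subtract the hand origin, then undo
    the hand's rotation, yielding coordinates in the hand frame."""
    q = [a - b for a, b in zip(p, hand_origin)]
    return rotate_z(q, -hand_yaw)
```

A point coinciding with the hand origin maps to the frame's origin, which is what makes the downstream module pose-invariant.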
arXiv Detail & Related papers (2024-04-02T09:18:52Z)
- THOR: Text to Human-Object Interaction Diffusion via Relation Intervention [51.02435289160616]
We propose a novel Text-guided Human-Object Interaction diffusion model with Relation Intervention (THOR)
In each diffusion step, we initiate text-guided human and object motion and then leverage human-object relations to intervene in object motion.
We construct Text-BEHAVE, a Text2HOI dataset that seamlessly integrates textual descriptions with the currently largest publicly available 3D HOI dataset.
arXiv Detail & Related papers (2024-03-17T13:17:25Z)
- Towards a Unified Transformer-based Framework for Scene Graph Generation and Human-object Interaction Detection [116.21529970404653]
We introduce SG2HOI+, a unified one-step model based on the Transformer architecture.
Our approach employs two interactive hierarchical Transformers to seamlessly unify the tasks of SGG and HOI detection.
Our approach achieves competitive performance when compared to state-of-the-art HOI methods.
arXiv Detail & Related papers (2023-11-03T07:25:57Z)
- Grasp Transfer based on Self-Aligning Implicit Representations of Local Surfaces [10.602143478315861]
This work addresses the problem of transferring a grasp experience or a demonstration to a novel object that shares shape similarities with objects the robot has previously encountered.
We employ a single expert grasp demonstration to learn an implicit local surface representation model from a small dataset of object meshes.
At inference time, this model is used to transfer grasps to novel objects by identifying the most geometrically similar surfaces to the one on which the expert grasp is demonstrated.
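The matching step described above — finding the surface on the novel object most geometrically similar to the demonstrated grasp region — can be sketched with a hand-crafted descriptor standing in for the learned implicit local surface model. The descriptor and patch format here are illustrative assumptions, not the paper's design:

```python
def descriptor(patch):
    """Toy local-surface descriptor: patch centroid plus the average
    distance of points from it (a crude shape-and-position summary)."""
    n = len(patch)
    centroid = [sum(p[i] for p in patch) / n for i in range(3)]
    spread = sum(
        sum((p[i] - centroid[i]) ** 2 for i in range(3)) ** 0.5 for p in patch
    ) / n
    return centroid + [spread]

def most_similar_patch(demo_patch, candidate_patches):
    """Index of the candidate patch whose descriptor is closest to the
    demonstrated grasp region's descriptor."""
    d0 = descriptor(demo_patch)

    def dist(patch):
        d = descriptor(patch)
        return sum((a - b) ** 2 for a, b in zip(d0, d)) ** 0.5

    return min(range(len(candidate_patches)),
               key=lambda i: dist(candidate_patches[i]))
```

In the actual work, the self-aligning implicit representation replaces this hand-crafted descriptor and provides the alignment needed to transfer the grasp pose itself.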
arXiv Detail & Related papers (2023-08-15T14:33:17Z)
- Enforcing Topological Interaction between Implicit Surfaces via Uniform Sampling [54.545963674457575]
We propose a novel method to refine 3D object representations, ensuring that their surfaces adhere to a topological prior.
Our proposed method enables accurate 3D reconstruction of human hearts, ensuring proper topological connectivity between components.
arXiv Detail & Related papers (2023-07-16T10:07:15Z)
- Interaction Transformer for Human Reaction Generation [61.22481606720487]
We propose a novel interaction Transformer (InterFormer) consisting of a Transformer network with both temporal and spatial attentions.
Our method is general and can be used to generate more complex and long-term interactions.
arXiv Detail & Related papers (2022-07-04T19:30:41Z)
- Bi-Dimensional Feature Alignment for Cross-Domain Object Detection [71.85594342357815]
We propose a novel unsupervised cross-domain detection model.
It exploits the annotated data in a source domain to train an object detector for a different target domain.
The proposed model mitigates the cross-domain representation divergence for object detection.
arXiv Detail & Related papers (2020-11-14T03:03:11Z)
- Understanding Spatial Relations through Multiple Modalities [78.07328342973611]
Spatial relations between objects can be either explicit, expressed as spatial prepositions, or implicit, expressed by spatial verbs such as moving, walking, and shifting.
We introduce the task of inferring implicit and explicit spatial relations between two entities in an image.
We design a model that uses both textual and visual information to predict the spatial relations, making use of both positional and size information of objects and image embeddings.
arXiv Detail & Related papers (2020-07-19T01:35:08Z)
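The positional and size information that the multimodal model above consumes can be sketched as a simple feature construction over two objects' bounding boxes. The box format and feature choices are illustrative assumptions, not the paper's exact design; in the real model these features are fused with image and text embeddings:

```python
import math

def box_features(box_a, box_b):
    """Joint features for two axis-aligned boxes given as (cx, cy, w, h):
    relative offset, log size ratio, and a coarse overlap flag."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    dx, dy = bx - ax, by - ay
    size_ratio = math.log((bw * bh) / (aw * ah))
    overlap = 1.0 if abs(dx) < (aw + bw) / 2 and abs(dy) < (ah + bh) / 2 else 0.0
    return [dx, dy, size_ratio, overlap]
```

A relation classifier (e.g. predicting "on", "next to", or an implicit verb) could take these features alongside the learned embeddings.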
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.