Grounded Affordance from Exocentric View
- URL: http://arxiv.org/abs/2208.13196v2
- Date: Thu, 25 May 2023 05:47:29 GMT
- Title: Grounded Affordance from Exocentric View
- Authors: Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, Dacheng Tao
- Abstract summary: Affordance grounding aims to locate objects' "action possibilities" regions.
Due to the diversity of interactive affordances, different individuals interact with objects in diverse ways.
Humans have the ability to transform these varied exocentric interactions into invariant egocentric affordances.
- Score: 79.64064711636975
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Affordance grounding aims to locate objects' "action possibilities" regions,
which is an essential step toward embodied intelligence. Due to the diversity
of interactive affordances, different individuals interact with objects in
diverse ways, which makes it difficult to establish an explicit link between
object parts and affordance labels. Humans have the ability to transform these
various exocentric interactions into invariant egocentric affordances,
countering the impact of interactive diversity. To empower an agent with such
an ability, this paper proposes the task of affordance grounding from the
exocentric view, i.e., given exocentric human-object interaction and egocentric
object images, learning the affordance knowledge of the object and transferring
it to the egocentric image using only the affordance label as supervision.
However, there is some "interaction bias" between individuals, mainly regarding
different regions and different views. To address this, we devise a cross-view
affordance knowledge transfer framework that extracts affordance-specific
features from exocentric interactions and transfers them to the egocentric
view. Specifically, the perception of affordance regions is enhanced by
preserving affordance co-relations. In addition, an affordance grounding
dataset named AGD20K is constructed by collecting and labeling over 20K images
from 36 affordance categories. Experimental results demonstrate that our
method outperforms the representative models regarding objective metrics and
visual quality. Code is released at
https://github.com/lhc1224/Cross-view-affordance-grounding.
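As a rough illustration of the weakly supervised setup described above (image-level affordance labels as the only supervision, with exocentric interaction knowledge transferred to the egocentric view), the following PyTorch sketch shows one minimal way such a pipeline could be wired. The shared ResNet-18 backbone, the cosine feature-alignment term, and the CAM-style localization head are illustrative assumptions, not the authors' released architecture (see the repository linked above for that).

```python
# Hypothetical sketch: weakly supervised cross-view affordance grounding.
# Only image-level affordance labels supervise both branches; the alignment
# term below stands in for the paper's affordance-specific knowledge transfer.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

NUM_AFFORDANCES = 36  # AGD20K covers 36 affordance categories


class CrossViewAffordanceNet(nn.Module):
    def __init__(self, num_classes: int = NUM_AFFORDANCES):
        super().__init__()
        backbone = resnet18(weights=None)
        # Convolutional trunk only (drop avgpool/fc); both views share it.
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        self.classifier = nn.Conv2d(512, num_classes, kernel_size=1)

    def forward(self, images):
        feats = self.encoder(images)        # (B, 512, H', W')
        cam = self.classifier(feats)        # per-class activation maps
        logits = cam.mean(dim=(2, 3))       # image-level scores via GAP
        return logits, cam, feats


def training_step(model, exo_imgs, ego_imgs, labels):
    """labels: (B,) affordance indices -- the only supervision signal."""
    exo_logits, _, exo_feats = model(exo_imgs)
    ego_logits, _, ego_feats = model(ego_imgs)

    # 1) Affordance classification on both views from the shared label.
    cls_loss = F.cross_entropy(exo_logits, labels) + F.cross_entropy(ego_logits, labels)

    # 2) Toy transfer term: pull the egocentric global feature toward the
    #    exocentric one (a crude proxy for cross-view knowledge transfer).
    exo_vec = F.normalize(exo_feats.mean(dim=(2, 3)), dim=1)
    ego_vec = F.normalize(ego_feats.mean(dim=(2, 3)), dim=1)
    align_loss = (1.0 - (exo_vec * ego_vec).sum(dim=1)).mean()

    return cls_loss + 0.5 * align_loss


@torch.no_grad()
def ground_affordance(model, ego_img, affordance_id):
    """Normalized heatmap for one affordance on a single egocentric image."""
    _, cam, _ = model(ego_img.unsqueeze(0))
    heat = F.relu(cam[0, affordance_id])
    heat = F.interpolate(heat[None, None], size=ego_img.shape[-2:],
                         mode="bilinear", align_corners=False)[0, 0]
    return heat / (heat.max() + 1e-6)
```

In the paper's framework the transfer is affordance-specific and preserves affordance co-relations; the simple cosine alignment above is only a placeholder for that mechanism.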
Related papers
- Visual-Geometric Collaborative Guidance for Affordance Learning [63.038406948791454]
We propose a visual-geometric collaborative guided affordance learning network that incorporates visual and geometric cues.
Our method outperforms the representative models regarding objective metrics and visual quality.
arXiv Detail & Related papers (2024-10-15T07:35:51Z)
- INTRA: Interaction Relationship-aware Weakly Supervised Affordance Grounding [10.787807888885888]
We propose INTeraction Relationship-aware weakly supervised Affordance grounding (INTRA)
Unlike prior arts, INTRA recasts this problem as representation learning, identifying unique features of interactions through contrastive learning with exocentric images only (a generic contrastive-loss sketch appears after this list).
Our method outperformed prior arts on diverse datasets such as AGD20K, IIT-AFF, CAD and UMD.
arXiv Detail & Related papers (2024-09-10T04:31:51Z)
- EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views [51.53089073920215]
Understanding egocentric human-object interaction (HOI) is a fundamental aspect of human-centric perception.
Existing methods primarily leverage observations of HOI to capture interaction regions from an exocentric view.
We present EgoChoir, which links object structures with interaction contexts inherent in appearance and head motion to reveal object affordance.
arXiv Detail & Related papers (2024-05-22T14:03:48Z)
- Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection [70.96299509159981]
Human-Object Interaction (HOI) detection is a core task for human-centric image understanding.
Recent one-stage methods adopt a transformer decoder to collect image-wide cues that are useful for interaction prediction.
Traditional two-stage methods benefit significantly from their ability to compose interaction features in a disentangled and explainable manner.
arXiv Detail & Related papers (2023-12-04T08:02:59Z)
- Grounding 3D Object Affordance from 2D Interactions in Images [128.6316708679246]
Grounding 3D object affordance seeks to locate objects' "action possibilities" regions in 3D space.
Humans possess the ability to perceive object affordances in the physical world through demonstration images or videos.
We devise an Interaction-driven 3D Affordance Grounding Network (IAG), which aligns the region features of objects from different sources.
arXiv Detail & Related papers (2023-03-18T15:37:35Z)
- Learning Affordance Grounding from Exocentric Images [79.64064711636975]
Affordance grounding is the task of grounding (i.e., localizing) action-possibility regions in objects.
Humans have the ability to transform the various exocentric interactions into invariant egocentric affordances.
This paper proposes the task of affordance grounding from the exocentric view, i.e., learning object affordances from exocentric human-object interaction images and transferring them to egocentric object images.
arXiv Detail & Related papers (2022-03-18T12:29:06Z)
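The INTRA entry above mentions learning interaction representations through contrastive learning on exocentric images alone. For reference, here is a generic InfoNCE-style contrastive loss; the batch construction (same-affordance exocentric pairs as positives, other affordances as negatives), the temperature, and the function names are assumptions for illustration, not the INTRA implementation.

```python
# Generic InfoNCE contrastive loss over exocentric interaction features.
# Illustrative only: the positive/negative pairing is an assumed scheme, not INTRA's.
import torch
import torch.nn.functional as F


def info_nce(anchor, positive, negatives, temperature=0.07):
    """anchor, positive: (B, D); negatives: (B, K, D); returns a scalar loss."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    pos_sim = (anchor * positive).sum(dim=-1, keepdim=True)      # (B, 1)
    neg_sim = torch.einsum("bd,bkd->bk", anchor, negatives)      # (B, K)
    logits = torch.cat([pos_sim, neg_sim], dim=1) / temperature  # (B, 1+K)
    targets = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, targets)  # the positive sits at index 0


if __name__ == "__main__":
    # Features from two exocentric images of the same affordance act as
    # anchor/positive; images of other affordances supply the negatives.
    B, K, D = 8, 16, 128
    loss = info_nce(torch.randn(B, D), torch.randn(B, D), torch.randn(B, K, D))
    print(f"contrastive loss: {loss.item():.4f}")
```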