Precise Affordance Annotation for Egocentric Action Video Datasets
- URL: http://arxiv.org/abs/2206.05424v1
- Date: Sat, 11 Jun 2022 05:13:19 GMT
- Title: Precise Affordance Annotation for Egocentric Action Video Datasets
- Authors: Zecheng Yu, Yifei Huang, Ryosuke Furuta, Takuma Yagi, Yusuke Goutsu,
Yoichi Sato
- Abstract summary: Object affordance is an important concept in human-object interaction.
Existing datasets often mix up affordance with object functionality.
We introduce the concept of mechanical action to represent the action possibilities between two objects.
- Score: 27.90643693526274
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object affordance is an important concept in human-object interaction,
providing information on action possibilities based on human motor capacity and
objects' physical properties, thus benefiting tasks such as action anticipation
and robot imitation learning. However, existing datasets often: 1) mix up
affordance with object functionality; 2) confuse affordance with goal-related
action; and 3) ignore human motor capacity. This paper proposes an efficient
annotation scheme to address these issues by combining goal-irrelevant motor
actions and grasp types as affordance labels and introducing the concept of
mechanical action to represent the action possibilities between two objects. We
provide new annotations by applying this scheme to the EPIC-KITCHENS dataset
and test our annotation with tasks such as affordance recognition. We
qualitatively verify that models trained with our annotation can distinguish
affordance and mechanical actions.
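To make the proposed labeling scheme concrete, the sketch below shows what a single annotation record could look like: the affordance label combines a goal-irrelevant motor action with a grasp type, while a mechanical action (e.g., cut) is recorded separately as an action possibility between two objects. This is a minimal illustration under assumed field names and vocabularies, and an EPIC-KITCHENS-style example segment; it is not the authors' released annotation format.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative vocabularies (assumptions for this sketch; the paper defines
# its own taxonomies of motor actions and grasp types).
MOTOR_ACTIONS = {"hold", "rotate", "pull", "push"}       # goal-irrelevant motor actions
GRASP_TYPES = {"power-cylindrical", "precision-pinch"}   # grasp taxonomy entries

@dataclass
class AffordanceAnnotation:
    """One hand-object interaction segment, annotated per the described scheme."""
    video_id: str
    start_frame: int
    end_frame: int
    active_object: str
    motor_action: str                    # goal-irrelevant motor action (human motor capacity)
    grasp_type: str                      # grasp type used on the active object
    mechanical_action: Optional[str]     # action possibility between two objects, e.g. "cut"
    target_object: Optional[str] = None  # second object involved in the mechanical action

    def __post_init__(self):
        # Basic sanity checks against the illustrative vocabularies above.
        assert self.motor_action in MOTOR_ACTIONS, f"unknown motor action: {self.motor_action}"
        assert self.grasp_type in GRASP_TYPES, f"unknown grasp type: {self.grasp_type}"

    @property
    def affordance_label(self) -> str:
        """Affordance = goal-irrelevant motor action + grasp type,
        deliberately excluding goal-related verbs such as 'open' or 'wash'."""
        return f"{self.motor_action}+{self.grasp_type}"

# Example: the knife affords "hold+power-cylindrical" to the hand, while "cut"
# is a mechanical action between knife and carrot, not an affordance of the knife.
ann = AffordanceAnnotation(
    video_id="P01_01", start_frame=120, end_frame=210,
    active_object="knife", motor_action="hold", grasp_type="power-cylindrical",
    mechanical_action="cut", target_object="carrot",
)
print(ann.affordance_label)  # hold+power-cylindrical
```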
Related papers
- Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition [21.655278000690686]
We propose an end-to-end object-centric action recognition framework.
It simultaneously performs Detection And Interaction Reasoning in one stage.
We conduct experiments on two datasets, Something-Else and Ikea-Assembly.
arXiv Detail & Related papers (2024-04-18T05:06:12Z)
- Self-Explainable Affordance Learning with Embodied Caption [63.88435741872204]
We introduce Self-Explainable Affordance learning (SEA) with embodied caption.
SEA enables robots to articulate their intentions and bridge the gap between explainable vision-language captioning and visual affordance learning.
We propose a novel model to effectively combine affordance grounding with self-explanation in a simple but efficient manner.
arXiv Detail & Related papers (2024-04-08T15:22:38Z)
- Localizing Active Objects from Egocentric Vision with Symbolic World Knowledge [62.981429762309226]
The ability to actively ground task instructions from an egocentric view is crucial for AI agents to accomplish tasks or assist humans virtually.
We propose to improve phrase grounding models' ability to localize active objects by learning the role of objects undergoing change and extracting them accurately from the instructions.
We evaluate our framework on Ego4D and Epic-Kitchens datasets.
arXiv Detail & Related papers (2023-10-23T16:14:05Z)
- Leveraging Next-Active Objects for Context-Aware Anticipation in Egocentric Videos [31.620555223890626]
We study the problem of short-term object interaction anticipation (STA).
We propose NAOGAT, a multi-modal end-to-end transformer network that uses next-active objects to guide the prediction of context-aware future actions.
Our model outperforms existing methods on two separate datasets.
arXiv Detail & Related papers (2023-08-16T12:07:02Z)
- Object Discovery from Motion-Guided Tokens [50.988525184497334]
We augment the auto-encoder representation learning framework with motion-guidance and mid-level feature tokenization.
Our approach enables the emergence of interpretable object-specific mid-level features.
arXiv Detail & Related papers (2023-03-27T19:14:00Z)
- Fine-grained Affordance Annotation for Egocentric Hand-Object Interaction Videos [27.90643693526274]
Object affordance provides information on action possibilities based on human motor capacity and objects' physical properties.
This paper proposes an efficient annotation scheme to address these issues by combining goal-irrelevant motor actions and grasp types as affordance labels.
We provide new annotations by applying this scheme to the EPIC-KITCHENS dataset and test our annotation with tasks such as affordance recognition, hand-object interaction hotspots prediction, and cross-domain evaluation of affordance.
arXiv Detail & Related papers (2023-02-07T07:05:00Z)
- Learn to Predict How Humans Manipulate Large-sized Objects from Interactive Motions [82.90906153293585]
We propose a graph neural network, HO-GCN, to fuse motion data and dynamic descriptors for the prediction task.
We show that the proposed network, which consumes dynamic descriptors, achieves state-of-the-art prediction results and generalizes better to unseen objects.
arXiv Detail & Related papers (2022-06-25T09:55:39Z)
- Improving Object Permanence using Agent Actions and Reasoning [8.847502932609737]
Existing approaches learn object permanence from low-level perception.
We argue that object permanence can be improved when the robot uses knowledge about executed actions.
arXiv Detail & Related papers (2021-10-01T07:09:49Z)
- Property-Aware Robot Object Manipulation: a Generative Approach [57.70237375696411]
In this work, we focus on how to generate robot motion adapted to the hidden properties of the manipulated objects.
We explore the possibility of leveraging Generative Adversarial Networks to synthesize new actions coherent with the properties of the object.
Our results show that Generative Adversarial Nets can be a powerful tool for the generation of novel and meaningful transportation actions.
arXiv Detail & Related papers (2021-06-08T14:15:36Z)
- Careful with That! Observation of Human Movements to Estimate Objects Properties [106.925705883949]
We focus on the features of human motor actions that communicate insights on the weight of an object.
Our final goal is to enable a robot to autonomously infer the degree of care required in object handling.
arXiv Detail & Related papers (2021-03-02T08:14:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.