LOCATE: Localize and Transfer Object Parts for Weakly Supervised
Affordance Grounding
- URL: http://arxiv.org/abs/2303.09665v1
- Date: Thu, 16 Mar 2023 21:47:49 GMT
- Title: LOCATE: Localize and Transfer Object Parts for Weakly Supervised
Affordance Grounding
- Authors: Gen Li, Varun Jampani, Deqing Sun, Laura Sevilla-Lara
- Abstract summary: Humans excel at acquiring knowledge through observation.
A key step to acquire this skill is to identify what part of the object affords each action, which is called affordance grounding.
We propose a framework called LOCATE that can identify matching object parts across images.
- Score: 43.157518990171674
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans excel at acquiring knowledge through observation. For example, we can
learn to use new tools by watching demonstrations. This skill is fundamental
for intelligent systems to interact with the world. A key step to acquire this
skill is to identify what part of the object affords each action, which is
called affordance grounding. In this paper, we address this problem and propose
a framework called LOCATE that can identify matching object parts across
images, to transfer knowledge from images where an object is being used
(exocentric images, used for learning) to images where the object is inactive
(egocentric images, used for testing). To this end, we first find interaction areas
and extract their feature embeddings. Then we learn to aggregate the embeddings
into compact prototypes (human, object part, and background), and select the
one representing the object part. Finally, we use the selected prototype to
guide affordance grounding. We do this in a weakly supervised manner, learning
only from image-level affordance and object labels. Extensive experiments
demonstrate that our approach outperforms state-of-the-art methods by a large
margin on both seen and unseen objects.
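To make the pipeline concrete, here is a minimal, illustrative PyTorch-style sketch of the steps described above, assuming a frozen feature backbone, class activation maps from the image-level affordance classifier, and k-means for aggregating embeddings into prototypes. The function names, the activation threshold, and the similarity-based prototype selection are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch (not the authors' code): weakly supervised affordance grounding by
# (1) localizing interaction regions on exocentric images with class activation maps,
# (2) clustering pixel embeddings in those regions into a few prototypes
#     (intended: human / object part / background),
# (3) selecting the prototype that best matches the egocentric object, and
# (4) using it to produce a part-level affordance heatmap.
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def interaction_cam(feats, classifier_w, aff_label):
    """Class activation map for the ground-truth affordance label (an int index).
    feats: (B, C, H, W) backbone features; classifier_w: (num_affordances, C)."""
    cam = torch.einsum("bchw,c->bhw", feats, classifier_w[aff_label])
    cam = F.relu(cam)
    return cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-6)

def cluster_prototypes(feats, cam, k=3, thresh=0.5):
    """Gather embeddings inside high-activation interaction regions and cluster
    them into k compact prototypes."""
    B, C, H, W = feats.shape
    emb = feats.permute(0, 2, 3, 1).reshape(-1, C)           # (B*H*W, C)
    mask = cam.reshape(-1) > thresh                           # keep interaction pixels
    selected = emb[mask].detach().cpu().numpy()
    centers = KMeans(n_clusters=k, n_init=10).fit(selected).cluster_centers_
    return torch.tensor(centers, dtype=feats.dtype)           # (k, C)

def select_part_prototype(prototypes, ego_feats):
    """Pick the prototype most similar to the egocentric object features, on the
    assumption that the object-part prototype (not human/background) transfers."""
    ego_global = ego_feats.mean(dim=(2, 3))                   # (B, C)
    sim = F.cosine_similarity(prototypes[None], ego_global[:, None], dim=-1)
    return prototypes[sim.mean(0).argmax()]                   # (C,)

def ground_affordance(ego_feats, part_proto):
    """Cosine similarity between the part prototype and every egocentric location
    gives a soft affordance heatmap used as guidance."""
    heat = F.cosine_similarity(ego_feats, part_proto[None, :, None, None], dim=1)
    return heat.clamp(min=0)                                  # (B, H, W)
```

Note that only image-level affordance and object labels enter this sketch (through the affordance classifier behind the CAM), consistent with the weakly supervised setting described in the abstract.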
Related papers
- AffordanceLLM: Grounding Affordance from Vision Language Models [36.97072698640563]
Affordance grounding refers to the task of finding the area of an object with which one can interact.
Much of the required knowledge is hidden, lying beyond the image content and the supervised labels available from a limited training set.
We attempt to improve the generalization capability of current affordance grounding by taking advantage of the rich world, abstract, and human-object-interaction knowledge in pretrained vision-language models.
arXiv Detail & Related papers (2024-01-12T03:21:02Z)
- TAX-Pose: Task-Specific Cross-Pose Estimation for Robot Manipulation [14.011777717620282]
We propose a vision-based system that learns to estimate the cross-pose between two objects for a given manipulation task.
We demonstrate our method's capability to generalize to unseen objects, in some cases after training on only 10 demonstrations in the real world.
arXiv Detail & Related papers (2022-11-17T04:06:16Z)
- Self-Supervised Learning of Object Parts for Semantic Segmentation [7.99536002595393]
We argue that self-supervised learning of object parts is a solution to this issue.
Our method surpasses the state of the art on three semantic segmentation benchmarks by margins ranging from 17% to 3%.
arXiv Detail & Related papers (2022-04-27T17:55:17Z)
- Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
arXiv Detail & Related papers (2022-03-18T21:13:56Z)
- PartAfford: Part-level Affordance Discovery from 3D Objects [113.91774531972855]
We present a new task of part-level affordance discovery (PartAfford).
Given only the affordance labels per object, the machine is tasked to (i) decompose 3D shapes into parts and (ii) discover how each part corresponds to a certain affordance category.
We propose a novel learning framework for PartAfford, which discovers part-level representations by leveraging only the affordance set supervision and geometric primitive regularization.
arXiv Detail & Related papers (2022-02-28T02:58:36Z)
- Object-Aware Cropping for Self-Supervised Learning [21.79324121283122]
We show that self-supervised learning based on the usual random cropping performs poorly on such datasets.
We propose replacing one or both of the random crops with crops obtained from an object proposal algorithm.
This approach, which we call object-aware cropping, yields significant improvements over scene cropping on classification and object detection benchmarks; a minimal sketch of the idea appears after this list.
arXiv Detail & Related papers (2021-12-01T07:23:37Z)
- Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z)
- A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection [56.82077636126353]
We take advantage of object-centric images to improve object detection in scene-centric images.
We present a simple yet surprisingly effective framework to do so.
Our approach can improve the object detection (and instance segmentation) accuracy of rare objects by 50% (and 33%) in relative terms.
arXiv Detail & Related papers (2021-02-17T17:27:21Z)
- Improving Object Detection with Selective Self-supervised Self-training [62.792445237541145]
We study how to leverage Web images to augment human-curated object detection datasets.
We retrieve Web images by image-to-image search, which incurs less domain shift from the curated data than other search methods.
We propose a novel learning method motivated by two parallel lines of work that explore unlabeled data for image classification.
arXiv Detail & Related papers (2020-07-17T18:05:01Z)
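As a rough illustration of the object-aware cropping entry above, the sketch below swaps one of the two random crops of a standard two-view self-supervised pipeline for a crop taken around an object proposal. The proposal source, the jitter scheme, and all function names are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch of object-aware cropping for two-view self-supervised learning:
# one view is a standard scene-level random crop, the other is cropped around an
# object proposal box (e.g. from a proposal algorithm such as selective search).
import random
from PIL import Image
import torchvision.transforms as T

random_crop = T.RandomResizedCrop(224, scale=(0.2, 1.0))

def object_aware_crop(img: Image.Image, proposal, jitter=0.2, out_size=224):
    """Crop around a proposal box (x0, y0, x1, y1), randomly expanded so the
    object is not always tightly boxed."""
    x0, y0, x1, y1 = proposal
    w, h = x1 - x0, y1 - y0
    dx = random.uniform(0.0, jitter) * w
    dy = random.uniform(0.0, jitter) * h
    box = tuple(int(v) for v in (max(0, x0 - dx), max(0, y0 - dy),
                                 min(img.width, x1 + dx), min(img.height, y1 + dy)))
    return img.crop(box).resize((out_size, out_size))

def make_views(img, proposals):
    """Two augmented views: one scene-level random crop, one object-aware crop
    (falling back to a random crop when no proposals are available)."""
    view_a = random_crop(img)
    view_b = object_aware_crop(img, random.choice(proposals)) if proposals else random_crop(img)
    return view_a, view_b
```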