Act the Part: Learning Interaction Strategies for Articulated Object Part Discovery
- URL: http://arxiv.org/abs/2105.01047v1
- Date: Mon, 3 May 2021 17:48:29 GMT
- Authors: Samir Yitzhak Gadre, Kiana Ehsani, Shuran Song
- Abstract summary: We introduce Act the Part (AtP) to learn how to interact with articulated objects to discover and segment their pieces.
Our experiments show AtP learns efficient strategies for part discovery, can generalize to unseen categories, and is capable of conditional reasoning for the task.
- Score: 18.331607910407183
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: People often use physical intuition when manipulating articulated objects,
irrespective of object semantics. Motivated by this observation, we identify an
important embodied task where an agent must play with objects to recover their
parts. To this end, we introduce Act the Part (AtP) to learn how to interact
with articulated objects to discover and segment their pieces. By coupling
action selection and motion segmentation, AtP is able to isolate structures to
make perceptual part recovery possible without semantic labels. Our experiments
show AtP learns efficient strategies for part discovery, can generalize to
unseen categories, and is capable of conditional reasoning for the task.
Although trained in simulation, we show convincing transfer to real world data
with no fine-tuning.
Related papers
- Visual-Geometric Collaborative Guidance for Affordance Learning [63.038406948791454]
  We propose a visual-geometric collaborative guided affordance learning network that incorporates visual and geometric cues.
  Our method outperforms the representative models regarding objective metrics and visual quality.
  (2024-10-15T07:35:51Z)
- Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition [21.655278000690686]
  We propose an end-to-end object-centric action recognition framework.
  It simultaneously performs Detection And Interaction Reasoning in one stage.
  We conduct experiments on two datasets, Something-Else and Ikea-Assembly.
  (2024-04-18T05:06:12Z)
- SAGE: Bridging Semantic and Actionable Parts for GEneralizable Manipulation of Articulated Objects [9.500480417077272]
  We propose a novel framework that bridges semantic and actionable parts of articulated objects to achieve generalizable manipulation under natural language instructions.
  A part-grounding module maps the semantic parts into so-called Generalizable Actionable Parts (GAParts), which inherently carry information about part motion.
  An interactive feedback module is incorporated to respond to failures, which closes the loop and increases the robustness of the overall framework.
  (2023-12-03T07:22:42Z)
- Compositional Learning in Transformer-Based Human-Object Interaction Detection [6.630793383852106]
  Long-tailed distribution of labeled instances is a primary challenge in HOI detection.
  Inspired by the nature of HOI triplets, some existing approaches adopt the idea of compositional learning.
  We creatively propose a transformer-based framework for compositional HOI learning.
  (2023-08-11T06:41:20Z)
- PartManip: Learning Cross-Category Generalizable Part Manipulation Policy from Point Cloud Observations [12.552149411655355]
  We build the first large-scale, part-based cross-category object manipulation benchmark, PartManip.
  We train a state-based expert with our proposed part-based canonicalization and part-aware rewards, and then distill the knowledge to a vision-based student.
  For cross-category generalization, we introduce domain adversarial learning for domain-invariant feature extraction.
  (2023-03-29T18:29:30Z)
- Self-Supervised Visual Representation Learning with Semantic Grouping [50.14703605659837]
  We tackle the problem of learning visual representations from unlabeled scene-centric data.
  We propose contrastive learning from data-driven semantic slots, namely SlotCon, for joint semantic grouping and representation learning.
  (2022-05-30T17:50:59Z)
- PartAfford: Part-level Affordance Discovery from 3D Objects [113.91774531972855]
  We present a new task of part-level affordance discovery (PartAfford).
  Given only the affordance labels per object, the machine is tasked to (i) decompose 3D shapes into parts and (ii) discover how each part corresponds to a certain affordance category.
  We propose a novel learning framework for PartAfford, which discovers part-level representations by leveraging only the affordance set supervision and geometric primitive regularization.
  (2022-02-28T02:58:36Z)
- Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition [111.87412719773889]
  We propose a joint learning framework for "interacted object localization" and "human action recognition" based on skeleton data.
  Our method achieves the best or competitive performance with the state-of-the-art methods for human action recognition.
  (2021-10-28T10:09:34Z)
- Plug and Play, Model-Based Reinforcement Learning [60.813074750879615]
  We introduce an object-based representation that allows zero-shot integration of new objects from known object classes.
  This is achieved by representing the global transition dynamics as a union of local transition functions.
  Experiments show that our representation can achieve sample-efficiency in a variety of set-ups.
  (2021-08-20T01:20:15Z)
- Constellation: Learning relational abstractions over objects for compositional imagination [64.99658940906917]
  We introduce Constellation, a network that learns relational abstractions of static visual scenes.
  This work is a first step in the explicit representation of visual relationships and using them for complex cognitive procedures.
  (2021-07-23T11:59:40Z)
- Object-Centric Learning with Slot Attention [43.684193749891506]
  We present the Slot Attention module, an architectural component that interfaces with perceptual representations.
  Slot Attention produces task-dependent abstract representations which we call slots.
  We empirically demonstrate that Slot Attention can extract object-centric representations that enable generalization to unseen compositions.
  (2020-06-26T15:31:57Z)