Structural Concept Learning via Graph Attention for Multi-Level
Rearrangement Planning
- URL: http://arxiv.org/abs/2309.02547v1
- Date: Tue, 5 Sep 2023 19:35:44 GMT
- Title: Structural Concept Learning via Graph Attention for Multi-Level
Rearrangement Planning
- Authors: Manav Kulshrestha and Ahmed H. Qureshi
- Abstract summary: We propose a deep learning approach to perform multi-level object rearrangement planning for scenes with structural dependency hierarchies.
It is trained on a self-generated simulation data set with intuitive structures and works for unseen scenes with an arbitrary number of objects.
We compare our method with a range of classical and model-based baselines to show that our method leverages its scene understanding to achieve better performance, flexibility, and efficiency.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robotic manipulation tasks, such as object rearrangement, play a crucial role
in enabling robots to interact with complex and arbitrary environments.
Existing work focuses primarily on single-level rearrangement planning and,
even if multiple levels exist, dependency relations among substructures are
geometrically simpler, like tower stacking. We propose Structural Concept
Learning (SCL), a deep learning approach that leverages graph attention
networks to perform multi-level object rearrangement planning for scenes with
structural dependency hierarchies. It is trained on a self-generated simulation
data set with intuitive structures, works for unseen scenes with an arbitrary
number of objects and higher complexity of structures, infers independent
substructures to allow for task parallelization over multiple manipulators, and
generalizes to the real world. We compare our method with a range of classical
and model-based baselines to show that our method leverages its scene
understanding to achieve better performance, flexibility, and efficiency. The
dataset, supplementary details, videos, and code implementation are available
at: https://manavkulshrestha.github.io/scl
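The abstract mentions structural dependency hierarchies and independent substructures that can be handled by separate manipulators. The following is a minimal sketch of what such a dependency graph makes possible once it is known. It is not the authors' SCL implementation: SCL infers the dependencies with graph attention networks, whereas here the graph is given by hand. The function name `plan_rearrangement`, the `(lower, upper)` edge convention, and the object names are assumptions made for illustration only.

```python
from collections import defaultdict, deque


def plan_rearrangement(objects, supports):
    """Order objects for placement and split them into independent groups.

    `supports` is a list of (lower, upper) pairs meaning that `lower` must be
    placed before `upper` (hypothetical convention, not taken from the paper).
    """
    children = defaultdict(list)   # directed edges: lower -> upper
    neighbors = defaultdict(set)   # undirected view, used for grouping
    indegree = {obj: 0 for obj in objects}
    for lower, upper in supports:
        children[lower].append(upper)
        indegree[upper] += 1
        neighbors[lower].add(upper)
        neighbors[upper].add(lower)

    # Kahn's algorithm: an object is placeable once all its supports are placed.
    queue = deque(obj for obj in objects if indegree[obj] == 0)
    order = []
    while queue:
        obj = queue.popleft()
        order.append(obj)
        for child in children[obj]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(objects):
        raise ValueError("dependency cycle: no valid placement order exists")

    # Connected components of the undirected graph are independent
    # substructures, so each could be assigned to a different manipulator.
    position = {obj: i for i, obj in enumerate(order)}
    seen = set()
    substructures = []
    for start in objects:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for other in neighbors[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        # Keep each group in a valid placement order.
        substructures.append(sorted(component, key=position.__getitem__))
    return order, substructures


objects = ["base_a", "base_b", "bridge", "roof", "pillar", "cap"]
supports = [
    ("base_a", "bridge"),
    ("base_b", "bridge"),
    ("bridge", "roof"),
    ("pillar", "cap"),
]
order, substructures = plan_rearrangement(objects, supports)
print(order)
print(substructures)
```

In this toy scene the bridge structure and the pillar structure share no dependencies, so they come out as two separate groups. Reversing `order` gives a valid order for taking the structure apart.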
Related papers
- Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
- From Logits to Hierarchies: Hierarchical Clustering made Simple [16.132657141993548]
We show that a lightweight procedure implemented on top of pre-trained non-hierarchical clustering models outperforms models designed specifically for hierarchical clustering.
Our proposed approach is computationally efficient and applicable to any pre-trained clustering model that outputs logits, without requiring any fine-tuning.
arXiv Detail & Related papers (2024-10-10T12:27:45Z)
- Implant Global and Local Hierarchy Information to Sequence based Code
Representation Models [25.776540440893257]
We analyze how the complete hierarchical structure influences the tokens in code sequences and abstract this influence as a property of code tokens called hierarchical embedding.
We propose the Hierarchy Transformer (HiT), a simple but effective sequence model to incorporate the complete hierarchical embeddings of source code into a Transformer model.
arXiv Detail & Related papers (2023-03-14T12:01:39Z)
- StructDiffusion: Language-Guided Creation of Physically-Valid Structures
using Unseen Objects [35.855172217856726]
We propose StructDiffusion to build physically-valid structures without step-by-step instructions.
Our method can perform multiple challenging language-conditioned multi-step 3D planning tasks.
We show experiments on held-out objects in both simulation and on real-world tasks.
arXiv Detail & Related papers (2022-11-08T23:04:49Z)
- ViRel: Unsupervised Visual Relations Discovery with Graph-level Analogy [65.5580334698777]
ViRel is a method for unsupervised discovery and learning of Visual Relations with graph-level analogy.
We show that our method achieves above 95% accuracy in relation classification.
It further generalizes to unseen tasks with more complicated relational structures.
arXiv Detail & Related papers (2022-07-04T16:56:45Z)
- Fast Inference and Transfer of Compositional Task Structures for
Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- Policy Architectures for Compositional Generalization in Control [71.61675703776628]
We introduce a framework for modeling entity-based compositional structure in tasks.
Our policies are flexible and can be trained end-to-end without requiring any action primitives.
arXiv Detail & Related papers (2022-03-10T06:44:24Z)
- Differentiable Architecture Pruning for Transfer Learning [6.935731409563879]
We propose a gradient-based approach for extracting sub-architectures from a given large model.
Our architecture-pruning scheme produces transferable new structures that can be successfully retrained to solve different tasks.
We provide theoretical convergence guarantees and validate the proposed transfer-learning strategy on real data.
arXiv Detail & Related papers (2021-07-07T17:44:59Z)
- CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and
Transfer Learning [138.40338621974954]
CausalWorld is a benchmark for causal structure and transfer learning in a robotic manipulation environment.
Tasks consist of constructing 3D shapes from a given set of blocks - inspired by how children learn to build complex structures.
arXiv Detail & Related papers (2020-10-08T23:01:13Z)
- Adversarial Continual Learning [99.56738010842301]
We propose a hybrid continual learning framework that learns a disjoint representation for task-invariant and task-specific features.
Our model combines architecture growth to prevent forgetting of task-specific skills and an experience replay approach to preserve shared skills.
arXiv Detail & Related papers (2020-03-21T02:08:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.