Related papers: Structural Concept Learning via Graph Attention for Multi-Level Rearrangement Planning

Structural Concept Learning via Graph Attention for Multi-Level Rearrangement Planning

URL: http://arxiv.org/abs/2309.02547v1
Date: Tue, 5 Sep 2023 19:35:44 GMT
Title: Structural Concept Learning via Graph Attention for Multi-Level Rearrangement Planning
Authors: Manav Kulshrestha and Ahmed H. Qureshi
Abstract summary: We propose a deep learning approach to perform multi-level object rearrangement planning for scenes with structural dependency hierarchies. It is trained on a self-generated simulation data set with intuitive structures and works for unseen scenes with an arbitrary number of objects. We compare our method with a range of classical and model-based baselines to show that our method leverages its scene understanding to achieve better performance, flexibility, and efficiency.
Score: 2.7195102129095003
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Robotic manipulation tasks, such as object rearrangement, play a crucial role in enabling robots to interact with complex and arbitrary environments. Existing work focuses primarily on single-level rearrangement planning and, even if multiple levels exist, dependency relations among substructures are geometrically simpler, like tower stacking. We propose Structural Concept Learning (SCL), a deep learning approach that leverages graph attention networks to perform multi-level object rearrangement planning for scenes with structural dependency hierarchies. It is trained on a self-generated simulation data set with intuitive structures, works for unseen scenes with an arbitrary number of objects and higher complexity of structures, infers independent substructures to allow for task parallelization over multiple manipulators, and generalizes to the real world. We compare our method with a range of classical and model-based baselines to show that our method leverages its scene understanding to achieve better performance, flexibility, and efficiency. The dataset, supplementary details, videos, and code implementation are available at: https://manavkulshrestha.github.io/scl

Related papers

A representational framework for learning and encoding structurally enriched trajectories in complex agent environments [1.904851064759821]
The ability of artificial intelligence agents to make optimal decisions and generalise them to different domains and tasks is compromised in complex scenarios. One way to address this issue has focused on learning efficient representations of the world and on how the actions of agents affect them, such as disentangled representations that exploit symmetries. We propose to enrich the agent's ontology and extend the traditionalisation of trajectories to provide a more nuanced view of task execution.
arXiv Detail & Related papers (2025-03-17T14:04:27Z)
Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies. Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors. We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
From Logits to Hierarchies: Hierarchical Clustering made Simple [16.132657141993548]
We show that a lightweight procedure implemented on top of pre-trained non-hierarchical clustering models outperforms models designed specifically for hierarchical clustering. Our proposed approach is computationally efficient and applicable to any pre-trained clustering model that outputs logits, without requiring any fine-tuning.
arXiv Detail & Related papers (2024-10-10T12:27:45Z)
Implant Global and Local Hierarchy Information to Sequence based Code Representation Models [25.776540440893257]
We analyze how the complete hierarchical structure influences the tokens in code sequences and abstract this influence as a property of code tokens called hierarchical embedding. We propose the Hierarchy Transformer (HiT), a simple but effective sequence model to incorporate the complete hierarchical embeddings of source code into a Transformer model.
arXiv Detail & Related papers (2023-03-14T12:01:39Z)
StructDiffusion: Language-Guided Creation of Physically-Valid Structures using Unseen Objects [35.855172217856726]
We propose StructDiffusion to build physically-valid structures without step-by-step instructions. Our method can perform multiple challenging language-conditioned multi-step 3D planning tasks. We show experiments on held-out objects in both simulation and on real-world tasks.
arXiv Detail & Related papers (2022-11-08T23:04:49Z)
ViRel: Unsupervised Visual Relations Discovery with Graph-level Analogy [65.5580334698777]
ViRel is a method for unsupervised discovery and learning of Visual Relations with graph-level analogy. We show that our method achieves above 95% accuracy in relation classification. We further generalizes to unseen tasks with more complicated relational structures.
arXiv Detail & Related papers (2022-07-04T16:56:45Z)
Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph. Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks. Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
Policy Architectures for Compositional Generalization in Control [71.61675703776628]
We introduce a framework for modeling entity-based compositional structure in tasks. Our policies are flexible and can be trained end-to-end without requiring any action primitives.
arXiv Detail & Related papers (2022-03-10T06:44:24Z)
Differentiable Architecture Pruning for Transfer Learning [6.935731409563879]
We propose a gradient-based approach for extracting sub-architectures from a given large model. Our architecture-pruning scheme produces transferable new structures that can be successfully retrained to solve different tasks. We provide theoretical convergence guarantees and validate the proposed transfer-learning strategy on real data.
arXiv Detail & Related papers (2021-07-07T17:44:59Z)
CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning [138.40338621974954]
CausalWorld is a benchmark for causal structure and transfer learning in a robotic manipulation environment. Tasks consist of constructing 3D shapes from a given set of blocks - inspired by how children learn to build complex structures.
arXiv Detail & Related papers (2020-10-08T23:01:13Z)
Adversarial Continual Learning [99.56738010842301]
We propose a hybrid continual learning framework that learns a disjoint representation for task-invariant and task-specific features. Our model combines architecture growth to prevent forgetting of task-specific skills and an experience replay approach to preserve shared skills.
arXiv Detail & Related papers (2020-03-21T02:08:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.