Neural Constraint Satisfaction: Hierarchical Abstraction for
Combinatorial Generalization in Object Rearrangement
- URL: http://arxiv.org/abs/2303.11373v1
- Date: Mon, 20 Mar 2023 18:19:36 GMT
- Title: Neural Constraint Satisfaction: Hierarchical Abstraction for
Combinatorial Generalization in Object Rearrangement
- Authors: Michael Chang and Alyssa L. Dayan and Franziska Meier and Thomas L.
Griffiths and Sergey Levine and Amy Zhang
- Abstract summary: We present a hierarchical abstraction approach to uncover underlying entities.
We show how to learn a correspondence between intervening on states of entities in the agent's model and acting on objects in the environment.
We use this correspondence to develop a method for control that generalizes to different numbers and configurations of objects.
- Score: 75.9289887536165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object rearrangement is a challenge for embodied agents because solving these
tasks requires generalizing across a combinatorially large set of
configurations of entities and their locations. Worse, the representations of
these entities are unknown and must be inferred from sensory percepts. We
present a hierarchical abstraction approach to uncover these underlying
entities and achieve combinatorial generalization from unstructured visual
inputs. By constructing a factorized transition graph over clusters of entity
representations inferred from pixels, we show how to learn a correspondence
between intervening on states of entities in the agent's model and acting on
objects in the environment. We use this correspondence to develop a method for
control that generalizes to different numbers and configurations of objects,
which outperforms current offline deep RL methods when evaluated on simulated
rearrangement tasks.
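The recipe in the abstract — infer per-entity latents from pixels, cluster them into abstract states, and plan over one transition graph shared by all entities — can be pictured with a short sketch. Everything below is an illustrative reconstruction under our own assumptions (a k-means abstraction, BFS planning, and an upstream entity encoder that is not shown), not the authors' implementation.

    import numpy as np
    from collections import defaultdict, deque
    from sklearn.cluster import KMeans

    def build_factorized_graph(transitions, n_clusters=32, seed=0):
        # transitions: list of (entities, next_entities) pairs, each a (K, d)
        # array of per-entity latents produced by an upstream encoder (not
        # shown). A single graph is shared by every entity, which is what
        # buys generalization across numbers of objects.
        flat = np.concatenate([s for pair in transitions for s in pair])
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(flat)
        graph = defaultdict(set)
        for s, s2 in transitions:
            for c, c2 in zip(km.predict(s), km.predict(s2)):
                if c != c2:
                    graph[c].add(c2)  # abstract transition observed in the data
        return km, graph

    def plan_intervention(graph, start, goal):
        # BFS over abstract entity states: the returned cluster path is a
        # model-level intervention that a low-level controller then realizes
        # by acting on the corresponding object in the environment.
        parent, queue = {start: None}, deque([start])
        while queue:
            c = queue.popleft()
            if c == goal:
                path = []
                while c is not None:
                    path.append(c)
                    c = parent[c]
                return path[::-1]
            for c2 in graph[c]:
                if c2 not in parent:
                    parent[c2] = c
                    queue.append(c2)
        return None  # goal cluster unreachable in the learned graph

Because every entity is planned for in the same abstract graph, the controller's competence does not depend on how many objects appear, which is the sense in which the abstraction supports combinatorial generalization.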
Related papers
- Composable Part-Based Manipulation [61.48634521323737]
We propose composable part-based manipulation (CPM) to improve learning and generalization of robotic manipulation skills.
CPM comprises a collection of composable diffusion models, where each model captures a different inter-object correspondence.
We validate our approach in both simulated and real-world scenarios, demonstrating its effectiveness in achieving robust and generalized manipulation capabilities.
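Diffusion models compose additively in score space (the score of a product of densities is the sum of the individual scores), which is the mechanism a collection of correspondence-specific models can rely on. A minimal sketch, where the models and their (x, t) interface are our own assumptions rather than CPM's actual API:

    import torch

    def composed_score(models, x, t, weights=None):
        # Each model is assumed to predict a score (or epsilon) for one
        # inter-object correspondence; summing the weighted predictions
        # composes the constraints, since scores of a product density add.
        weights = weights if weights is not None else [1.0] * len(models)
        return sum(w * m(x, t) for w, m in zip(weights, models))

At sampling time this composite score simply replaces the single-model prediction inside the usual reverse-diffusion update.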
arXiv Detail & Related papers (2024-05-09T16:04:14Z)
- Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that it achieves competitive unsupervised object discovery performance to a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails - all while being 7-70 times faster to train.
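A toy version of one complex-valued layer, simplified from the paper's formulation, shows the mechanism: magnitudes carry features while phases carry the object assignment.

    import torch
    import torch.nn as nn

    class ToyComplexLayer(nn.Module):
        # Shared real weights act on the complex input (a real matrix applied
        # to real and imaginary parts), followed by a learned phase bias and a
        # magnitude-only nonlinearity. A simplification of the Complex
        # AutoEncoder's layer, kept only to illustrate phase-based grouping.
        def __init__(self, d_in, d_out):
            super().__init__()
            self.fc = nn.Linear(d_in, d_out, bias=False)  # real-linear on complex z
            self.mag_bias = nn.Parameter(torch.zeros(d_out))
            self.phase_bias = nn.Parameter(torch.zeros(d_out))

        def forward(self, z):  # z: complex tensor of shape (..., d_in)
            w = torch.complex(self.fc(z.real), self.fc(z.imag))
            mag = torch.relu(torch.abs(w) + self.mag_bias)  # bias lets ReLU gate units off
            return torch.polar(mag, torch.angle(w) + self.phase_bias)

Feeding in an image with zero initial phase and clustering the output phases (e.g. k-means on the angles) then yields object masks without supervision.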
arXiv Detail & Related papers (2022-04-05T09:25:28Z)
- Structure-Regularized Attention for Deformable Object Representation [17.120035855774344]
Capturing contextual dependencies has proven useful to improve the representational power of deep neural networks.
Recent approaches that focus on modeling global context, such as self-attention and non-local operation, achieve this goal by enabling unconstrained pairwise interactions between elements.
We consider learning representations for deformable objects, a setting that can benefit from context exploitation, by modeling the structural dependencies the data intrinsically possesses.
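A generic way to inject such structural dependencies, shown here as a sketch of the general pattern rather than this paper's exact formulation, is to bias the attention logits with the object's intrinsic connectivity graph:

    import torch
    import torch.nn.functional as F

    def structure_biased_attention(q, k, v, adj, lam=1.0):
        # q, k, v: (N, d) element features; adj: (N, N) soft structure graph
        # (e.g. mesh or skeleton connectivity of the deformable object).
        # lam interpolates between unconstrained pairwise interaction (lam=0)
        # and attention concentrated on structurally related elements.
        logits = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
        logits = logits + lam * torch.log(adj + 1e-9)  # structural prior as a bias
        return F.softmax(logits, dim=-1) @ v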
arXiv Detail & Related papers (2021-06-12T03:10:17Z)
- End-to-End Hierarchical Relation Extraction for Generic Form Understanding [0.6299766708197884]
We present a novel deep neural network to jointly perform both entity detection and link prediction.
Our model extends the Multi-stage Attentional U-Net architecture with the Part-Intensity Fields and Part-Association Fields for link prediction.
We demonstrate the effectiveness of the model on the Form Understanding in Noisy Scanned Documents dataset.
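Association fields score candidate links by how well a learned 2D vector field aligns with the segment joining two detections, in the spirit of OpenPose's part-affinity criterion. A rough sketch of that scoring step, with an assumed (H, W, 2) field layout:

    import numpy as np

    def link_score(assoc_field, p_a, p_b, n_samples=10):
        # assoc_field: (H, W, 2) predicted association field; p_a, p_b: (x, y)
        # centers of two detected entities (segment assumed inside the image).
        # The score averages the field's projection onto the unit direction
        # from a to b along the connecting segment; high means "linked".
        p_a, p_b = np.asarray(p_a, float), np.asarray(p_b, float)
        d = p_b - p_a
        u = d / (np.linalg.norm(d) + 1e-9)
        ts = np.linspace(0.0, 1.0, n_samples)
        pts = np.rint(p_a[None, :] + ts[:, None] * d[None, :]).astype(int)
        vecs = assoc_field[pts[:, 1], pts[:, 0]]  # index as [y, x]
        return float((vecs @ u).mean())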
arXiv Detail & Related papers (2021-06-02T06:51:35Z)
- Hierarchical Pyramid Representations for Semantic Segmentation [0.0]
We learn the structure of objects and the hierarchy among them, since context is grounded in these intrinsic properties.
In this study, we design novel hierarchical, contextual, and multiscale pyramidal representations to capture the properties from an input image.
Our proposed method achieves state-of-the-art performance on the PASCAL Context dataset.
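A standard building block for this kind of multiscale context, sketched here in the spirit of PSPNet-style pyramid pooling rather than as the paper's exact module:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PyramidPooling(nn.Module):
        # Pool the feature map at several scales, project, upsample, and
        # concatenate, so each position sees context at multiple levels of
        # the spatial hierarchy.
        def __init__(self, c, scales=(1, 2, 3, 6)):
            super().__init__()
            self.stages = nn.ModuleList(
                nn.Sequential(nn.AdaptiveAvgPool2d(s),
                              nn.Conv2d(c, c // len(scales), 1))
                for s in scales)

        def forward(self, x):
            h, w = x.shape[-2:]
            ctx = [F.interpolate(stage(x), size=(h, w), mode="bilinear",
                                 align_corners=False) for stage in self.stages]
            return torch.cat([x] + ctx, dim=1)  # original features + context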
arXiv Detail & Related papers (2021-04-05T06:39:12Z)
- Visual Concept Reasoning Networks [93.99840807973546]
A split-transform-merge strategy has been broadly used as an architectural constraint in convolutional neural networks for visual recognition tasks.
We exploit this strategy in our Visual Concept Reasoning Networks (VCRNet) to enable reasoning between high-level visual concepts.
Our proposed model, VCRNet, consistently improves performance while increasing the number of parameters by less than 1%.
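A bare-bones split-transform-merge block, with the cross-branch concept-reasoning exchange reduced to a single 1x1 mixing convolution; all sizes are illustrative and this is not VCRNet's actual module:

    import torch
    import torch.nn as nn

    class SplitTransformMerge(nn.Module):
        # Split channels into branches, transform each independently, merge
        # by concatenation, then mix cheaply across branches.
        def __init__(self, c, branches=4):
            super().__init__()
            assert c % branches == 0, "channels must divide evenly into branches"
            cb = c // branches
            self.branches = nn.ModuleList(
                nn.Sequential(nn.Conv2d(cb, cb, 3, padding=1), nn.ReLU())
                for _ in range(branches))
            self.mix = nn.Conv2d(c, c, 1)  # stand-in for cross-branch reasoning

        def forward(self, x):
            parts = torch.chunk(x, len(self.branches), dim=1)
            y = torch.cat([b(p) for b, p in zip(self.branches, parts)], dim=1)
            return self.mix(y) + x  # residual connection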
arXiv Detail & Related papers (2020-08-26T20:02:40Z)
- Efficient State Abstraction using Object-centered Predicates for Manipulation Planning [86.24148040040885]
We propose an object-centered representation that permits characterizing a much wider set of possible changes in configuration spaces.
Based on this representation, we define universal planning operators for picking and placing actions that permit generating plans with geometric and force consistency.
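A toy rendering of object-centered predicates with universal pick and place operators, in plain STRIPS style; the predicate names and operator interface are invented for illustration, and the paper's geometric and force consistency checks are elided:

    from dataclasses import dataclass

    State = frozenset  # set of ground predicates like ("on", "cup", "table")

    @dataclass(frozen=True)
    class Operator:
        name: str
        pre: frozenset
        add: frozenset
        delete: frozenset

        def applicable(self, s: State) -> bool:
            return self.pre <= s

        def apply(self, s: State) -> State:
            return State((s - self.delete) | self.add)

    def pick(o):  # operator schemas are universal; ground them per object
        return Operator(f"pick({o})",
                        pre=frozenset({("clear", o), ("handempty",)}),
                        add=frozenset({("holding", o)}),
                        delete=frozenset({("clear", o), ("handempty",)}))

    def place(o, r):
        return Operator(f"place({o},{r})",
                        pre=frozenset({("holding", o)}),
                        add=frozenset({("on", o, r), ("clear", o), ("handempty",)}),
                        delete=frozenset({("holding", o)}))

    # e.g. pick("cup").apply(frozenset({("clear", "cup"), ("handempty",)}))
    # yields {("holding", "cup")}; a planner searches operator sequences
    # whose preconditions hold, exactly as in classical STRIPS planning.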
arXiv Detail & Related papers (2020-07-16T10:52:53Z)
- Disassembling Object Representations without Labels [75.2215716328001]
We study a new representation-learning task, which we term disassembling object representations.
Disassembling enables category-specific modularity in the learned representations.
We propose an unsupervised approach to achieve disassembling, named Unsupervised Disassembling Object Representation (UDOR).
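The disassembling idea can be pictured as an autoencoder whose latent code is partitioned into category-specific chunks. The sketch below shows only the partitioning; the unsupervised objectives UDOR uses to enforce the category assignment are omitted, and all sizes are arbitrary:

    import torch
    import torch.nn as nn

    class PartitionedAutoencoder(nn.Module):
        # The latent code is split into n_parts chunks so that each chunk can
        # specialize to one object category; zeroing all but one chunk should
        # then reconstruct only that category's content.
        def __init__(self, d_in=784, d_latent=64, n_parts=4):
            super().__init__()
            self.n_parts = n_parts
            self.enc = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(),
                                     nn.Linear(256, d_latent))
            self.dec = nn.Sequential(nn.Linear(d_latent, 256), nn.ReLU(),
                                     nn.Linear(256, d_in))

        def forward(self, x, keep=None):
            z = self.enc(x)
            chunks = list(torch.chunk(z, self.n_parts, dim=-1))
            if keep is not None:  # render a single category's content
                chunks = [c if i == keep else torch.zeros_like(c)
                          for i, c in enumerate(chunks)]
            return self.dec(torch.cat(chunks, dim=-1))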
arXiv Detail & Related papers (2020-04-03T08:23:09Z)