Structured World Belief for Reinforcement Learning in POMDP
- URL: http://arxiv.org/abs/2107.08577v1
- Date: Mon, 19 Jul 2021 01:47:53 GMT
- Title: Structured World Belief for Reinforcement Learning in POMDP
- Authors: Gautam Singh, Skand Peri, Junghyun Kim, Hyunseok Kim, Sungjin Ahn
- Abstract summary: We propose Structured World Belief, a model for learning and inference of object-centric belief states.
In experiments, we show that object-centric belief provides a more accurate and robust performance for filtering and generation.
- Score: 16.5072728047837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object-centric world models provide structured representation of the scene
and can be an important backbone in reinforcement learning and planning.
However, existing approaches suffer in partially-observable environments due to
the lack of belief states. In this paper, we propose Structured World Belief, a
model for learning and inference of object-centric belief states. Inferred by
Sequential Monte Carlo (SMC), our belief states provide multiple object-centric
scene hypotheses. To synergize the benefits of SMC particles with object
representations, we also propose a new object-centric dynamics model that
considers the inductive bias of object permanence. This enables tracking of
object states even when they are invisible for a long time. To further
facilitate object tracking in this regime, we allow our model to attend
flexibly to any spatial location in the image which was restricted in previous
models. In experiments, we show that object-centric belief provides a more
accurate and robust performance for filtering and generation. Furthermore, we
show the efficacy of structured world belief in improving the performance of
reinforcement learning, planning and supervised reasoning.
Related papers
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z) - Unsupervised Dynamics Prediction with Object-Centric Kinematics [22.119612406160073]
We propose Object-Centric Kinematics (OCK), a framework for dynamics prediction leveraging object-centric representations.
OCK consists of low-level structured states of objects' position, velocity, and acceleration.
Our model demonstrates superior performance when handling objects and backgrounds in complex scenes characterized by a wide range of object attributes and dynamic movements.
arXiv Detail & Related papers (2024-04-29T04:47:23Z) - Provably Learning Object-Centric Representations [25.152680199034215]
We analyze when object-centric representations can provably be learned without supervision.
We prove that the ground-truth object representations can be identified by an invertible and compositional inference model.
We provide evidence that our theory holds predictive power for existing object-centric models.
arXiv Detail & Related papers (2023-05-23T16:44:49Z) - Robust and Controllable Object-Centric Learning through Energy-based
Models [95.68748828339059]
ours is a conceptually simple and general approach to learning object-centric representations through an energy-based model.
We show that ours can be easily integrated into existing architectures and can effectively extract high-quality object-centric representations.
arXiv Detail & Related papers (2022-10-11T15:11:15Z) - Bridging the Gap to Real-World Object-Centric Learning [66.55867830853803]
We show that reconstructing features from models trained in a self-supervised manner is a sufficient training signal for object-centric representations to arise in a fully unsupervised way.
Our approach, DINOSAUR, significantly out-performs existing object-centric learning models on simulated data.
arXiv Detail & Related papers (2022-09-29T15:24:47Z) - Suspected Object Matters: Rethinking Model's Prediction for One-stage
Visual Grounding [93.82542533426766]
We propose a Suspected Object Transformation mechanism (SOT) to encourage the target object selection among the suspected ones.
SOT can be seamlessly integrated into existing CNN and Transformer-based one-stage visual grounders.
Extensive experiments demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2022-03-10T06:41:07Z) - Look-into-Object: Self-supervised Structure Modeling for Object
Recognition [71.68524003173219]
We propose to "look into object" (explicitly yet intrinsically model the object structure) through incorporating self-supervisions.
We show the recognition backbone can be substantially enhanced for more robust representation learning.
Our approach achieves large performance gain on a number of benchmarks, including generic object recognition (ImageNet) and fine-grained object recognition tasks (CUB, Cars, Aircraft)
arXiv Detail & Related papers (2020-03-31T12:22:51Z) - Global-Local Bidirectional Reasoning for Unsupervised Representation
Learning of 3D Point Clouds [109.0016923028653]
We learn point cloud representation by bidirectional reasoning between the local structures and the global shape without human supervision.
We show that our unsupervised model surpasses the state-of-the-art supervised methods on both synthetic and real-world 3D object classification datasets.
arXiv Detail & Related papers (2020-03-29T08:26:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.