Hierarchical Relational Inference
- URL: http://arxiv.org/abs/2010.03635v2
- Date: Mon, 14 Dec 2020 22:14:23 GMT
- Title: Hierarchical Relational Inference
- Authors: Aleksandar Stanić, Sjoerd van Steenkiste, Jürgen Schmidhuber
- Abstract summary: We propose a novel approach to physical reasoning that models objects as hierarchies of parts that may locally behave separately, but also act more globally as a single whole.
Unlike prior approaches, our method learns in an unsupervised fashion directly from raw visual images.
It explicitly distinguishes multiple levels of abstraction and improves over a strong baseline at modeling synthetic and real-world videos.
- Score: 80.00374471991246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Common-sense physical reasoning in the real world requires learning about the
interactions of objects and their dynamics. The notion of an abstract object,
however, encompasses a wide variety of physical objects that differ greatly in
terms of the complex behaviors they support. To address this, we propose a
novel approach to physical reasoning that models objects as hierarchies of
parts that may locally behave separately, but also act more globally as a
single whole. Unlike prior approaches, our method learns in an unsupervised
fashion directly from raw visual images to discover objects, parts, and their
relations. It explicitly distinguishes multiple levels of abstraction and
improves over a strong baseline at modeling synthetic and real-world videos.
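As a rough illustration of the two-level idea (not the paper's actual learned architecture), the following NumPy sketch passes messages between parts of the same object, pools parts into object states that interact globally, and broadcasts object context back down to the parts. The part-to-object assignment, random weights, and mean pooling are placeholders assumed for the example; in the paper, the decomposition and message functions are learned end-to-end from raw video.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8                                   # latent size per part
parts = rng.normal(size=(6, D))         # states of 6 parts
obj_of = np.array([0, 0, 0, 1, 1, 1])   # parts 0-2 form object 0, parts 3-5 object 1

relu = lambda x: np.maximum(x, 0.0)
# Learned message/update networks in the real model; random projections here.
W_msg = rng.normal(size=(2 * D, D))
W_upd = rng.normal(size=(2 * D, D))

def message_pass(states, edges):
    """One round of relational message passing over directed edges (i -> j)."""
    agg = np.zeros_like(states)
    for i, j in edges:
        agg[j] += relu(np.concatenate([states[i], states[j]]) @ W_msg)
    return relu(np.concatenate([states, agg], axis=1) @ W_upd)

# Level 1 (local): messages only between parts of the same object.
local = [(i, j) for i in range(6) for j in range(6)
         if i != j and obj_of[i] == obj_of[j]]
parts = message_pass(parts, local)

# Level 2 (global): objects are pooled part states that interact with each other.
objects = np.stack([parts[obj_of == k].mean(axis=0) for k in (0, 1)])
objects = message_pass(objects, [(0, 1), (1, 0)])

# Broadcast object-level context back down, so parts behave locally
# while still moving with the whole.
parts = parts + objects[obj_of]
```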
Related papers
- Systematic Visual Reasoning through Object-Centric Relational Abstraction [5.914610036560008]
We introduce OCRA, a model that extracts explicit representations of both objects and abstract relations.
It achieves strong systematic generalization in tasks involving complex visual displays.
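As a loose sketch of what "explicit representations of both objects and abstract relations" could mean computationally (a generic relation-network pattern, not OCRA's architecture), assuming hypothetical slot vectors and a stand-in relation function:

```python
import numpy as np

rng = np.random.default_rng(1)
N, D = 4, 8
objects = rng.normal(size=(N, D))      # hypothetical object slot vectors
W_rel = rng.normal(size=(2 * D, D))    # stands in for a learned relation net

# One embedding per ordered object pair via a shared function, pooled into
# an explicit relational code kept separate from the object codes.
pairs = [(i, j) for i in range(N) for j in range(N) if i != j]
relations = np.stack([np.tanh(np.concatenate([objects[i], objects[j]]) @ W_rel)
                      for i, j in pairs])
relational_code = relations.mean(axis=0)
```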
arXiv Detail & Related papers (2023-06-04T22:47:17Z)
- Provably Learning Object-Centric Representations [25.152680199034215]
We analyze when object-centric representations can provably be learned without supervision.
We prove that the ground-truth object representations can be identified by an invertible and compositional inference model.
We provide evidence that our theory holds predictive power for existing object-centric models.
arXiv Detail & Related papers (2023-05-23T16:44:49Z)
- Robust and Controllable Object-Centric Learning through Energy-based Models [95.68748828339059]
Ours is a conceptually simple and general approach to learning object-centric representations through an energy-based model.
We show that it can be easily integrated into existing architectures and can effectively extract high-quality object-centric representations.
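A minimal sketch of the energy-based idea, assuming a fixed linear decoder and plain gradient descent for slot inference; the actual model is learned and more structured:

```python
import numpy as np

rng = np.random.default_rng(2)
K, D, P = 3, 8, 16                      # slots, slot dim, pixels
image = rng.normal(size=P)
decode = rng.normal(size=(D, P)) * 0.1  # fixed linear decoder (learned in practice)

def energy(slots):
    # Low energy when the slot-wise decodings sum to the image.
    recon = (slots @ decode).sum(axis=0)
    return 0.5 * np.sum((recon - image) ** 2)

slots = rng.normal(size=(K, D))
for _ in range(100):                    # gradient-based inference of the slots
    recon = (slots @ decode).sum(axis=0)
    grad = np.tile((recon - image) @ decode.T, (K, 1))  # dE/dslots
    slots -= 0.1 * grad
# Note: with this symmetric toy energy all slots converge to the same value;
# the real architecture breaks that symmetry so slots bind to distinct objects.
print(energy(slots))
```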
arXiv Detail & Related papers (2022-10-11T15:11:15Z)
- Self-supervised Neural Articulated Shape and Appearance Models [18.99030452836038]
We propose a novel approach for learning a representation of the geometry, appearance, and motion of a class of articulated objects.
Our representation learns shape, appearance, and articulation codes that enable independent control of these semantic dimensions.
arXiv Detail & Related papers (2022-05-17T17:50:47Z)
- Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
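The motion cue can be illustrated with classical flow-based grouping. The sketch below (hand-built flow field, fixed threshold, flood-fill components; none of this is the paper's learned method) keeps pixels that move differently from the dominant background motion and splits them into candidate objects:

```python
import numpy as np

rng = np.random.default_rng(3)
H, W = 32, 32
flow = rng.normal(scale=0.05, size=(H, W, 2))   # stand-in optical flow field
flow[8:16, 8:16] += np.array([2.0, 0.0])        # a region moving right
flow[20:28, 4:12] += np.array([0.0, -1.5])      # a region moving up

# Subtract the dominant (camera/background) motion, then keep pixels that
# still move: appearance is ignored entirely, only independent motion counts.
background = np.median(flow.reshape(-1, 2), axis=0)
residual = np.linalg.norm(flow - background, axis=-1)
moving = residual > 0.5                         # binary moving-object mask

# Split the mask into connected components = candidate objects (4-connectivity).
labels, cur = np.zeros((H, W), dtype=int), 0
for y in range(H):
    for x in range(W):
        if moving[y, x] and labels[y, x] == 0:
            cur += 1
            stack = [(y, x)]
            while stack:
                i, j = stack.pop()
                if 0 <= i < H and 0 <= j < W and moving[i, j] and labels[i, j] == 0:
                    labels[i, j] = cur
                    stack += [(i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)]
print(cur, "moving objects found")
```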
arXiv Detail & Related papers (2022-03-18T21:13:56Z)
- Watch It Move: Unsupervised Discovery of 3D Joints for Re-Posing of Articulated Objects [73.23249640099516]
We learn both the appearance and the structure of previously unseen articulated objects by observing them move from multiple views.
Our insight is that adjacent parts that move relative to each other must be connected by a joint.
We show that our method works for different structures, from quadrupeds to single-arm robots to humans.
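The joint insight has a simple geometric reading: a shared joint is a point whose world position agrees under both parts' rigid motions. A NumPy sketch, assuming per-part poses have already been recovered (the paper estimates everything from multi-view video, without given poses):

```python
import numpy as np

def find_joint(poses_a, poses_b, tol=1e-3):
    """If two rigid parts share a joint, some point j has the same world
    position under both motions: R_a j + t_a = R_b j + t_b for all t.
    Solve the stacked linear system for j; small residual means 'connected'.
    poses_*: list of (R, t) world-from-part transforms over time."""
    A = np.concatenate([Ra - Rb for (Ra, _), (Rb, _) in zip(poses_a, poses_b)])
    b = np.concatenate([tb - ta for (_, ta), (_, tb) in zip(poses_a, poses_b)])
    j = np.linalg.lstsq(A, b, rcond=None)[0]
    residual = np.linalg.norm(A @ j - b) / len(poses_a)
    return j, residual < tol

# Toy example: part B rotates about a hinge at (1, 0, 0) in part A's frame.
def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

hinge = np.array([1.0, 0.0, 0.0])
poses_a = [(np.eye(3), np.zeros(3))] * 5
poses_b = [(rot_z(a), hinge - rot_z(a) @ hinge) for a in np.linspace(0, 1, 5)]
joint, connected = find_joint(poses_a, poses_b)
print(joint, connected)   # ~[1, 0, 0], True
```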
arXiv Detail & Related papers (2021-12-21T16:37:48Z)
- Constellation: Learning relational abstractions over objects for compositional imagination [64.99658940906917]
We introduce Constellation, a network that learns relational abstractions of static visual scenes.
This work is a first step toward explicitly representing visual relationships and using them for complex cognitive procedures.
arXiv Detail & Related papers (2021-07-23T11:59:40Z)
- RELATE: Physically Plausible Multi-Object Scene Synthesis Using Structured Latent Spaces [77.07767833443256]
We present RELATE, a model that learns to generate physically plausible scenes and videos of multiple interacting objects.
In contrast to state-of-the-art methods in object-centric generative modeling, RELATE also extends naturally to dynamic scenes and generates videos of high visual fidelity.
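As a toy illustration of why structured latents extend naturally to dynamics (a hand-rolled blob renderer, not RELATE's learned generative model): if each object's latent factors into appearance and position, animating the scene is just a walk in the position variables.

```python
import numpy as np

rng = np.random.default_rng(4)
H, W, K = 32, 32, 3
yy, xx = np.mgrid[0:H, 0:W]

# Structured latent space: each object = (appearance scalar, 2D position).
appearance = rng.uniform(0.5, 1.0, size=K)
position = rng.uniform(8, 24, size=(K, 2))
velocity = rng.normal(scale=1.0, size=(K, 2))

def render(position):
    """Compose a scene by adding one Gaussian blob per object."""
    canvas = np.zeros((H, W))
    for a, (cy, cx) in zip(appearance, position):
        canvas += a * np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / 8.0)
    return np.clip(canvas, 0, 1)

# A dynamic scene falls out by editing only the position latents over time.
video = [render(position + t * velocity) for t in range(5)]
```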
arXiv Detail & Related papers (2020-07-02T17:27:27Z)
- Towards causal generative scene models via competition of experts [26.181132737834826]
We present an alternative approach that uses an inductive bias encouraging modularity by training an ensemble of generative models (experts).
During training, experts compete to explain parts of a scene and thus specialise in different object classes, with objects identified as parts that re-occur across multiple scenes.
Our model allows for controllable sampling of individual objects and recombination of experts in physically plausible ways.
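A minimal sketch of the competition mechanism, with constant per-pixel templates standing in for the paper's generative experts: each pixel is assigned to the expert that currently explains it best, and each expert updates only on its assigned region, which is what drives specialisation.

```python
import numpy as np

rng = np.random.default_rng(5)
P, K = 64, 3                             # pixels, experts
scene = rng.normal(size=P)

# Constant templates stand in for full generative models.
templates = rng.normal(size=(K, P))

for step in range(50):
    errors = (templates - scene) ** 2    # per-expert, per-pixel explanation error
    winner = errors.argmin(axis=0)       # pixel-wise competition
    for k in range(K):
        mask = winner == k               # the scene parts expert k wins...
        if mask.any():                   # ...and therefore specialises on
            templates[k, mask] += 0.5 * (scene[mask] - templates[k, mask])
```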
arXiv Detail & Related papers (2020-04-27T16:10:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.