Reconstruction Bottlenecks in Object-Centric Generative Models
- URL: http://arxiv.org/abs/2007.06245v2
- Date: Tue, 24 Nov 2020 13:52:23 GMT
- Title: Reconstruction Bottlenecks in Object-Centric Generative Models
- Authors: Martin Engelcke, Oiwi Parker Jones, Ingmar Posner
- Abstract summary: We investigate the role of "reconstruction bottlenecks" for scene decomposition in GENESIS, a recent VAE-based model.
We show such bottlenecks determine reconstruction and segmentation quality and critically influence model behaviour.
- Score: 24.430685026986524
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A range of methods with suitable inductive biases exist to learn
interpretable object-centric representations of images without supervision.
However, these are largely restricted to visually simple images; robust object
discovery in real-world sensory datasets remains elusive. To increase the
understanding of such inductive biases, we empirically investigate the role of
"reconstruction bottlenecks" for scene decomposition in GENESIS, a recent
VAE-based model. We show such bottlenecks determine reconstruction and
segmentation quality and critically influence model behaviour.
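To make the notion of a "reconstruction bottleneck" concrete, the following is a minimal sketch (not the GENESIS architecture) of a VAE in which the bottleneck is exposed as a single latent-dimensionality parameter; shrinking it limits how much information reaches the decoder, which is the kind of capacity constraint the paper probes empirically. The class and parameter names (SmallVAE, z_dim) are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of a VAE with an explicit "reconstruction bottleneck":
# z_dim controls how much information the decoder receives per image.
# Illustrative only; GENESIS uses an object-wise, autoregressive architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallVAE(nn.Module):
    def __init__(self, z_dim: int = 16):
        super().__init__()
        # Encoder: 64x64 RGB image -> flattened feature vector
        self.enc = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 32x32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 16x16
            nn.Conv2d(64, 64, 4, stride=2, padding=1), nn.ReLU(),  # 8x8
            nn.Flatten(),
        )
        self.to_mu = nn.Linear(64 * 8 * 8, z_dim)
        self.to_logvar = nn.Linear(64 * 8 * 8, z_dim)
        # Decoder: latent code -> reconstructed image
        self.from_z = nn.Linear(z_dim, 64 * 8 * 8)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation
        recon = self.dec(self.from_z(z).view(-1, 64, 8, 8))
        # Standard VAE objective: reconstruction error + KL regulariser
        rec_loss = F.mse_loss(recon, x, reduction="mean")
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return recon, rec_loss + kl

# Probing the bottleneck: the same data, with progressively narrower latents.
if __name__ == "__main__":
    x = torch.rand(4, 3, 64, 64)  # stand-in batch of images
    for z_dim in (64, 16, 4):
        recon, loss = SmallVAE(z_dim)(x)
        print(f"z_dim={z_dim:3d}  untrained loss={loss.item():.3f}")
```

In a sketch like this, sweeping z_dim and tracking reconstruction error is one simple way to see how a capacity constraint trades off reconstruction detail, loosely mirroring the bottleneck analysis described in the abstract.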
Related papers
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Unveiling the Ambiguity in Neural Inverse Rendering: A Parameter Compensation Analysis [36.353019226575576]
Inverse rendering aims to reconstruct the scene properties of objects solely from multiview images.
In this paper, we utilize Neural Microfacet Fields (NMF), a state-of-the-art neural inverse rendering method to illustrate the inherent ambiguity.
arXiv Detail & Related papers (2024-04-19T11:56:29Z)
- Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement [58.9768112704998]
Disentangled representation learning strives to extract the intrinsic factors within observed data.
We introduce a new perspective and framework, demonstrating that diffusion models with cross-attention can serve as a powerful inductive bias.
This is the first work to reveal the potent disentanglement capability of diffusion models with cross-attention, requiring no complex designs.
arXiv Detail & Related papers (2024-02-15T05:07:54Z)
- Benchmarking and Analysis of Unsupervised Object Segmentation from Real-world Single Images [6.848868644753519]
We investigate the effectiveness of existing unsupervised models on challenging real-world images.
We find that existing unsupervised models fail to segment generic objects in real-world images.
Our research results suggest that future work should exploit more explicit objectness biases in the network design.
arXiv Detail & Related papers (2023-12-08T10:25:59Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- Provably Learning Object-Centric Representations [25.152680199034215]
We analyze when object-centric representations can provably be learned without supervision.
We prove that the ground-truth object representations can be identified by an invertible and compositional inference model.
We provide evidence that our theory holds predictive power for existing object-centric models.
arXiv Detail & Related papers (2023-05-23T16:44:49Z)
- Robust and Controllable Object-Centric Learning through Energy-based Models [95.68748828339059]
The proposed method is a conceptually simple and general approach to learning object-centric representations through an energy-based model.
We show that it can be easily integrated into existing architectures and can effectively extract high-quality object-centric representations.
arXiv Detail & Related papers (2022-10-11T15:11:15Z)
- Promising or Elusive? Unsupervised Object Segmentation from Real-world Single Images [4.709764624933227]
We investigate the effectiveness of existing unsupervised models on challenging real-world images.
We find that, not surprisingly, existing unsupervised models fail to segment generic objects in real-world images.
Our research results suggest that future work should exploit more explicit objectness biases in the network design.
arXiv Detail & Related papers (2022-10-05T15:22:54Z)
- Bridging the Gap to Real-World Object-Centric Learning [66.55867830853803]
We show that reconstructing features from models trained in a self-supervised manner is a sufficient training signal for object-centric representations to arise in a fully unsupervised way.
Our approach, DINOSAUR, significantly outperforms existing object-centric learning models on simulated data.
arXiv Detail & Related papers (2022-09-29T15:24:47Z)
- Inductive Biases for Object-Centric Representations of Complex Textures [13.045904773946367]
We use neural style transfer to generate datasets where objects have complex textures while still retaining ground-truth annotations.
We find that, when a model effectively balances the importance of shape and appearance in the training objective, it can achieve better separation of the objects and learn more useful object representations.
arXiv Detail & Related papers (2022-04-18T17:34:37Z)
- Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information-theoretic framework for learning-motivated methods aimed at odometry estimation.
The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.