Unsupervised Object-Centric Learning from Multiple Unspecified
Viewpoints
- URL: http://arxiv.org/abs/2401.01922v1
- Date: Wed, 3 Jan 2024 15:09:25 GMT
- Title: Unsupervised Object-Centric Learning from Multiple Unspecified
Viewpoints
- Authors: Jinyang Yuan, Tonglin Chen, Zhimeng Shen, Bin Li, Xiangyang Xue
- Abstract summary: We consider a novel problem of learning compositional scene representations from multiple unspecified viewpoints without using any supervision.
We propose a deep generative model which separates latent representations into a viewpoint-independent part and a viewpoint-dependent part to solve this problem.
Experiments on several specifically designed synthetic datasets have shown that the proposed method can effectively learn from multiple unspecified viewpoints.
- Score: 45.88397367354284
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual scenes are extremely diverse, not only because there are infinite
possible combinations of objects and backgrounds but also because the
observations of the same scene may vary greatly with the change of viewpoints.
When observing a multi-object visual scene from multiple viewpoints, humans can
perceive the scene compositionally from each viewpoint while achieving the
so-called ``object constancy'' across different viewpoints, even though the
exact viewpoints are untold. This ability is essential for humans to identify
the same object while moving and to learn from vision efficiently. It is
intriguing to design models that have a similar ability. In this paper, we
consider a novel problem of learning compositional scene representations from
multiple unspecified (i.e., unknown and unrelated) viewpoints without using any
supervision and propose a deep generative model which separates latent
representations into a viewpoint-independent part and a viewpoint-dependent
part to solve this problem. During the inference, latent representations are
randomly initialized and iteratively updated by integrating the information in
different viewpoints with neural networks. Experiments on several specifically
designed synthetic datasets have shown that the proposed method can effectively
learn from multiple unspecified viewpoints.
Related papers
- AnyDoor: Zero-shot Object-level Image Customization [63.44307304097742]
This work presents AnyDoor, a diffusion-based image generator with the power to teleport target objects to new scenes at user-specified locations.
Our model is trained only once and effortlessly generalizes to diverse object-scene combinations at the inference stage.
arXiv Detail & Related papers (2023-07-18T17:59:02Z) - A Computational Account Of Self-Supervised Visual Learning From
Egocentric Object Play [3.486683381782259]
We study how learning signals that equate different viewpoints can support robust visual learning.
We find that representations learned by equating different physical viewpoints of an object benefit downstream image classification accuracy.
arXiv Detail & Related papers (2023-05-30T22:42:03Z) - Time-Conditioned Generative Modeling of Object-Centric Representations
for Video Decomposition and Prediction [4.79974591281424]
We introduce a time-conditioned generative model for videos.
We show that the model can make object-centric video decomposition, reconstruct the complete shapes of occluded objects, and make novel-view predictions.
arXiv Detail & Related papers (2023-01-21T13:39:39Z) - Matching Multiple Perspectives for Efficient Representation Learning [0.0]
We present an approach that combines self-supervised learning with a multi-perspective matching technique.
We show that the availability of multiple views of the same object combined with a variety of self-supervised pretraining algorithms can lead to improved object classification performance.
arXiv Detail & Related papers (2022-08-16T10:33:13Z) - Neural Groundplans: Persistent Neural Scene Representations from a
Single Image [90.04272671464238]
We present a method to map 2D image observations of a scene to a persistent 3D scene representation.
We propose conditional neural groundplans as persistent and memory-efficient scene representations.
arXiv Detail & Related papers (2022-07-22T17:41:24Z) - Unsupervised Learning of Compositional Scene Representations from
Multiple Unspecified Viewpoints [41.07379505694274]
We consider a novel problem of learning compositional scene representations from multiple unspecified viewpoints without using any supervision.
We propose a deep generative model which separates latent representations into a viewpoint-independent part and a viewpoint-dependent part to solve this problem.
Experiments on several specifically designed synthetic datasets have shown that the proposed method is able to effectively learn from multiple unspecified viewpoints.
arXiv Detail & Related papers (2021-12-07T08:45:21Z) - Learning Object-Centric Representations of Multi-Object Scenes from
Multiple Views [9.556376932449187]
Multi-View and Multi-Object Network (MulMON) is a method for learning accurate, object-centric representations of multi-object scenes by leveraging multiple views.
We show that MulMON better-resolves spatial ambiguities than single-view methods.
arXiv Detail & Related papers (2021-11-13T13:54:28Z) - Recognizing Actions in Videos from Unseen Viewpoints [80.6338404141284]
We show that current convolutional neural network models are unable to recognize actions from camera viewpoints not present in training data.
We introduce a new dataset for unseen view recognition and show the approaches ability to learn viewpoint invariant representations.
arXiv Detail & Related papers (2021-03-30T17:17:54Z) - Exploit Clues from Views: Self-Supervised and Regularized Learning for
Multiview Object Recognition [66.87417785210772]
This work investigates the problem of multiview self-supervised learning (MV-SSL)
A novel surrogate task for self-supervised learning is proposed by pursuing "object invariant" representation.
Experiments shows that the recognition and retrieval results using view invariant prototype embedding (VISPE) outperform other self-supervised learning methods.
arXiv Detail & Related papers (2020-03-28T07:06:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.