Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery
- URL: http://arxiv.org/abs/2405.17283v3
- Date: Mon, 28 Oct 2024 13:58:44 GMT
- Title: Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery
- Authors: Anand Gopalakrishnan, Aleksandar Stanić, Jürgen Schmidhuber, Michael Curtis Mozer
- Abstract summary: We argue for the computational advantages of a recurrent architecture with complex-valued weights.
We propose a fully convolutional autoencoder, SynCx, that performs iterative constraint satisfaction.
- Score: 62.43562856605473
- Abstract: Current state-of-the-art synchrony-based models encode object bindings with complex-valued activations and compute with real-valued weights in feedforward architectures. We argue for the computational advantages of a recurrent architecture with complex-valued weights. We propose a fully convolutional autoencoder, SynCx, that performs iterative constraint satisfaction: at each iteration, a hidden layer bottleneck encodes statistically regular configurations of features in particular phase relationships; over iterations, local constraints propagate and the model converges to a globally consistent configuration of phase assignments. Binding is achieved simply by the matrix-vector product operation between complex-valued weights and activations, without the need for additional mechanisms that have been incorporated into current synchrony-based models. SynCx outperforms or is strongly competitive with current models for unsupervised object discovery. SynCx also avoids certain systematic grouping errors of current models, such as the inability to separate similarly colored objects without additional supervision.
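The abstract's central mechanism, a complex-valued matrix-vector product applied repeatedly so that phase assignments can settle over iterations, can be illustrated with a toy NumPy sketch. This is not the authors' SynCx implementation: the weights are random and untrained, the layers are dense rather than convolutional, there is no nonlinearity, and the update rule (clamp magnitudes to the input, feed the output phases back as the next input phases) is an assumption made for illustration. The names `complex_linear` and `iterate_phases` are hypothetical.

```python
import numpy as np


def complex_linear(W, z):
    """Complex matrix-vector product: mixes magnitudes and rotates phases."""
    return W @ z


def iterate_phases(W_enc, W_dec, magnitudes, n_iters=3, seed=0):
    """Toy recurrent loop (assumed update rule, n_iters >= 1).

    Magnitudes stay clamped to the input; only the phases are carried
    from one iteration to the next.
    """
    rng = np.random.default_rng(seed)
    # Start from random phases in [-pi, pi).
    phases = rng.uniform(-np.pi, np.pi, size=magnitudes.shape)
    out = None
    for _ in range(n_iters):
        z = magnitudes * np.exp(1j * phases)   # complex-valued input
        hidden = complex_linear(W_enc, z)      # bottleneck
        out = complex_linear(W_dec, hidden)    # reconstruction
        phases = np.angle(out)                 # feed output phases back
    return np.abs(out), phases


# Hypothetical sizes: a 6-element "image" and a 3-unit bottleneck.
rng = np.random.default_rng(1)
W_enc = rng.normal(size=(3, 6)) + 1j * rng.normal(size=(3, 6))
W_dec = rng.normal(size=(6, 3)) + 1j * rng.normal(size=(6, 3))
image = rng.uniform(size=6)

recon_mag, phase_map = iterate_phases(W_enc, W_dec, image)
print(phase_map)
```

In a trained synchrony-based model, elements with similar final phases would be read out as belonging to the same object; with the random weights used here the phase values carry no such meaning.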
Related papers
- COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement [80.18490952057125]
Iterative refinement has emerged as an effective paradigm for enhancing the capabilities of large language models (LLMs) on complex tasks.
We propose Context-Wise Order-Agnostic Language Modeling (COrAL) to overcome these challenges.
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally.
arXiv Detail & Related papers (2024-10-12T23:56:19Z)
- CTSyn: A Foundational Model for Cross Tabular Data Generation [9.568990880984813]
Cross-Table Synthesizer (CTSyn) is a diffusion-based foundational model tailored for tabular data generation.
CTSyn significantly outperforms existing table synthesizers in utility and diversity.
It also uniquely enhances performances of downstream machine learning beyond what is achievable with real data.
arXiv Detail & Related papers (2024-06-07T04:04:21Z)
- Representation Alignment Contrastive Regularization for Multi-Object Tracking [29.837560662395713]
Mainstream performance in multi-object tracking algorithms relies heavily on modeling temporal relationships during the data association stage.
This work aims to simplify deep learning-based temporal relationship models and introduce interpretability into data association designs.
arXiv Detail & Related papers (2024-04-03T08:33:08Z)
- Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
- Contrastive Training of Complex-Valued Autoencoders for Object Discovery [55.280789409319716]
We introduce architectural modifications and a novel contrastive learning method that greatly improve the state-of-the-art synchrony-based model.
For the first time, we obtain a class of synchrony-based models capable of discovering objects in an unsupervised manner in multi-object color datasets.
arXiv Detail & Related papers (2023-05-24T10:37:43Z)
- DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion [66.21290235237808]
We introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states.
We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs.
Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
arXiv Detail & Related papers (2023-01-23T15:18:54Z)
- Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that its unsupervised object discovery performance is competitive with a SlotAttention model on two datasets, and that it manages to disentangle objects in a third dataset where SlotAttention fails, all while being 7-70 times faster to train.
arXiv Detail & Related papers (2022-04-05T09:25:28Z)
- Model-Based Reinforcement Learning via Stochastic Hybrid Models [39.83837705993256]
This paper adopts a hybrid-system view of nonlinear modeling and control.
We consider a sequence modeling paradigm that captures the temporal structure of the data.
We show that these time-series models naturally admit a closed-loop extension that we use to extract local feedback controllers.
arXiv Detail & Related papers (2021-11-11T14:05:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.