Related papers: Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery

URL: http://arxiv.org/abs/2405.17283v3
Date: Mon, 28 Oct 2024 13:58:44 GMT
Title: Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery
Authors: Anand Gopalakrishnan, Aleksandar Stanić, Jürgen Schmidhuber, Michael Curtis Mozer
Abstract summary: We argue for the computational advantages of a recurrent architecture with complex-valued weights. We propose a fully convolutional autoencoder, SynCx, that performs iterative constraint satisfaction.
Score: 62.43562856605473
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Current state-of-the-art synchrony-based models encode object bindings with complex-valued activations and compute with real-valued weights in feedforward architectures. We argue for the computational advantages of a recurrent architecture with complex-valued weights. We propose a fully convolutional autoencoder, SynCx, that performs iterative constraint satisfaction: at each iteration, a hidden layer bottleneck encodes statistically regular configurations of features in particular phase relationships; over iterations, local constraints propagate and the model converges to a globally consistent configuration of phase assignments. Binding is achieved simply by the matrix-vector product operation between complex-valued weights and activations, without the need for additional mechanisms that have been incorporated into current synchrony-based models. SynCx outperforms or is strongly competitive with current models for unsupervised object discovery. SynCx also avoids certain systematic grouping errors of current models, such as the inability to separate similarly colored objects without additional supervision.
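Illustration (not from the paper): the abstract states that binding arises from the matrix-vector product between complex-valued weights and activations. The sketch below is a minimal NumPy example of that single operation, not SynCx itself. The function name `complex_layer`, the weight values, and the two-feature inputs are all hypothetical choices made for illustration. It shows that features sharing a phase add constructively, features in opposite phase cancel, and a complex weight rotates the phase of its input.

```python
import numpy as np

def complex_layer(weights, activations):
    """Complex matrix-vector product; returns output magnitude and phase."""
    z = weights @ activations
    return np.abs(z), np.angle(z)

# Two unit-magnitude input features (hypothetical values).
in_phase = np.array([np.exp(1j * 0.3), np.exp(1j * 0.3)])
out_of_phase = np.array([np.exp(1j * 0.3), np.exp(1j * (0.3 + np.pi))])

# One output unit with zero-phase unit weights (an assumption, not learned).
W = np.array([[1.0 + 0.0j, 1.0 + 0.0j]])

# Same phase: contributions add up (magnitude 2, phase preserved).
mag_sync, phase_sync = complex_layer(W, in_phase)

# Opposite phase: contributions cancel (magnitude near 0).
mag_desync, _ = complex_layer(W, out_of_phase)

# A weight with phase pi/2 rotates the phase of its input by pi/2.
W_rot = np.array([[np.exp(1j * np.pi / 2)]])
mag_rot, phase_rot = complex_layer(W_rot, in_phase[:1])

print(mag_sync, phase_sync, mag_desync, phase_rot)
```

In this toy setting the output magnitude depends on the relative phases of the inputs, which is the kind of interaction a synchrony-based model can use to group features. The recurrent, convolutional, and training aspects described in the abstract are not represented here.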

Related papers

Enhancing deep neural networks through complex-valued representations and Kuramoto synchronization dynamics [13.611995923070426]
We investigate whether synchrony-based mechanisms can enhance object encoding in artificial models trained for visual categorization. We combine complex-valued representations with Kuramoto dynamics to promote phase alignment, facilitating the grouping of features belonging to the same object. Our findings highlight the potential of synchrony-driven mechanisms to enhance deep learning models, improving their performance, robustness, and generalization.
arXiv Detail & Related papers (2025-02-28T14:10:42Z)
COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement [80.18490952057125]
Iterative refinement has emerged as an effective paradigm for enhancing the capabilities of large language models (LLMs) on complex tasks. We propose Context-Wise Order-Agnostic Language Modeling (COrAL) to overcome these challenges. Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally.
arXiv Detail & Related papers (2024-10-12T23:56:19Z)
CTSyn: A Foundational Model for Cross Tabular Data Generation [9.568990880984813]
Cross-Table Synthesizer (CTSyn) is a diffusion-based foundational model tailored for tabular data generation. CTSyn significantly outperforms existing table synthesizers in utility and diversity. It also uniquely enhances performances of downstream machine learning beyond what is achievable with real data.
arXiv Detail & Related papers (2024-06-07T04:04:21Z)
Representation Alignment Contrastive Regularization for Multi-Object Tracking [29.837560662395713]
Mainstream performance in multi-object tracking algorithms relies heavily on modeling temporal relationships during the data association stage. This work aims to simplify deep learning-based temporal relationship models and introduce interpretability into data association designs.
arXiv Detail & Related papers (2024-04-03T08:33:08Z)
Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space. We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
Contrastive Training of Complex-Valued Autoencoders for Object Discovery [55.280789409319716]
We introduce architectural modifications and a novel contrastive learning method that greatly improve the state-of-the-art synchrony-based model. For the first time, we obtain a class of synchrony-based models capable of discovering objects in an unsupervised manner in multi-object color datasets.
arXiv Detail & Related papers (2023-05-24T10:37:43Z)
DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion [66.21290235237808]
We introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states. We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs. Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
arXiv Detail & Related papers (2023-01-23T15:18:54Z)
Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder. We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets. We also show that it achieves competitive unsupervised object discovery performance to a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails - all while being 7-70 times faster to train.
arXiv Detail & Related papers (2022-04-05T09:25:28Z)
Model-Based Reinforcement Learning via Stochastic Hybrid Models [39.83837705993256]
This paper adopts a hybrid-system view of nonlinear modeling and control. We consider a sequence modeling paradigm that captures the temporal structure of the data. We show that these time-series models naturally admit a closed-loop extension that we use to extract local feedback controllers.
arXiv Detail & Related papers (2021-11-11T14:05:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.