Enhancing deep neural networks through complex-valued representations and Kuramoto synchronization dynamics
- URL: http://arxiv.org/abs/2502.21077v1
- Date: Fri, 28 Feb 2025 14:10:42 GMT
- Title: Enhancing deep neural networks through complex-valued representations and Kuramoto synchronization dynamics
- Authors: Sabine Muzellec, Andrea Alamia, Thomas Serre, Rufin VanRullen,
- Abstract summary: We investigate whether synchrony-based mechanisms can enhance object encoding in artificial models trained for visual categorization.<n>We combine complex-valued representations with Kuramoto dynamics to promote phase alignment, facilitating the grouping of features belonging to the same object.<n>Our findings highlight the potential of synchrony-driven mechanisms to enhance deep learning models, improving their performance, robustness, and generalization.
- Score: 13.611995923070426
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural synchrony is hypothesized to play a crucial role in how the brain organizes visual scenes into structured representations, enabling the robust encoding of multiple objects within a scene. However, current deep learning models often struggle with object binding, limiting their ability to represent multiple objects effectively. Inspired by neuroscience, we investigate whether synchrony-based mechanisms can enhance object encoding in artificial models trained for visual categorization. Specifically, we combine complex-valued representations with Kuramoto dynamics to promote phase alignment, facilitating the grouping of features belonging to the same object. We evaluate two architectures employing synchrony: a feedforward model and a recurrent model with feedback connections to refine phase synchronization using top-down information. Both models outperform their real-valued counterparts and complex-valued models without Kuramoto synchronization on tasks involving multi-object images, such as overlapping handwritten digits, noisy inputs, and out-of-distribution transformations. Our findings highlight the potential of synchrony-driven mechanisms to enhance deep learning models, improving their performance, robustness, and generalization in complex visual categorization tasks.
Related papers
- Hierarchical Banzhaf Interaction for General Video-Language Representation Learning [60.44337740854767]
Multimodal representation learning plays an important role in the artificial intelligence domain.<n>We introduce a new approach that models video-text as game players using multivariate cooperative game theory.<n>We extend our original structure into a flexible encoder-decoder framework, enabling the model to adapt to various downstream tasks.
arXiv Detail & Related papers (2024-12-30T14:09:15Z) - MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes [35.16430027877207]
MOVIS aims to enhance the structural awareness of the view-conditioned diffusion model for multi-object NVS.<n>We introduce an auxiliary task requiring the model to simultaneously predict novel view object masks.<n>To evaluate the plausibility of synthesized images, we propose to assess cross-view consistency and novel view object placement.
arXiv Detail & Related papers (2024-12-16T05:23:45Z) - Towards Evaluating the Robustness of Visual State Space Models [63.14954591606638]
Vision State Space Models (VSSMs) have demonstrated remarkable performance in visual perception tasks.
However, their robustness under natural and adversarial perturbations remains a critical concern.
We present a comprehensive evaluation of VSSMs' robustness under various perturbation scenarios.
arXiv Detail & Related papers (2024-06-13T17:59:44Z) - Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery [62.43562856605473]
We argue for the computational advantages of a recurrent architecture with complex-valued weights.
We propose a fully convolutional autoencoder, SynCx, that performs iterative constraint satisfaction.
arXiv Detail & Related papers (2024-05-27T15:47:03Z) - Discovering group dynamics in coordinated time series via hierarchical recurrent switching-state models [5.250223406627639]
We present a new switching-state model that can be trained in an unsupervised fashion to learn system-level and individual-level dynamics.<n>We employ a latent system-level discrete state Markov chain that provides topdown influence on latent entity-level chains which in turn govern the emission of each observed time series.
arXiv Detail & Related papers (2024-01-26T16:06:01Z) - Data driven modeling for self-similar dynamics [1.0790314700764785]
We introduce a multiscale neural network framework that incorporates self-similarity as prior knowledge.
For deterministic dynamics, our framework can discern whether the dynamics are self-similar.
Our method can identify the power law exponents in self-similar systems.
arXiv Detail & Related papers (2023-10-12T12:39:08Z) - Robust and Controllable Object-Centric Learning through Energy-based
Models [95.68748828339059]
ours is a conceptually simple and general approach to learning object-centric representations through an energy-based model.
We show that ours can be easily integrated into existing architectures and can effectively extract high-quality object-centric representations.
arXiv Detail & Related papers (2022-10-11T15:11:15Z) - Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that it achieves competitive unsupervised object discovery performance to a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails - all while being 7-70 times faster to train.
arXiv Detail & Related papers (2022-04-05T09:25:28Z) - Structure-Regularized Attention for Deformable Object Representation [17.120035855774344]
Capturing contextual dependencies has proven useful to improve the representational power of deep neural networks.
Recent approaches that focus on modeling global context, such as self-attention and non-local operation, achieve this goal by enabling unconstrained pairwise interactions between elements.
We consider learning representations for deformable objects which can benefit from context exploitation by modeling the structural dependencies that the data intrinsically possesses.
arXiv Detail & Related papers (2021-06-12T03:10:17Z) - S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards exploiting dynamic structure that are capable of simultaneously exploiting both modular andtemporal structures.
We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z) - Relational State-Space Model for Stochastic Multi-Object Systems [24.234120525358456]
This paper introduces the relational state-space model (R-SSM), a sequential hierarchical latent variable model.
R-SSM makes use of graph neural networks (GNNs) to simulate the joint state transitions of multiple correlated objects.
The utility of R-SSM is empirically evaluated on synthetic and real time-series datasets.
arXiv Detail & Related papers (2020-01-13T03:45:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.