Related papers: A Relational Inductive Bias for Dimensional Abstraction in Neural Networks

A Relational Inductive Bias for Dimensional Abstraction in Neural Networks

URL: http://arxiv.org/abs/2402.18426v1
Date: Wed, 28 Feb 2024 15:51:05 GMT
Title: A Relational Inductive Bias for Dimensional Abstraction in Neural Networks
Authors: Declan Campbell, Jonathan D. Cohen
Abstract summary: This paper investigates the impact of the relational bottleneck on the learning of factorized representations conducive to compositional coding. We demonstrate that such a bottleneck not only improves generalization and learning efficiency, but also aligns network performance with human-like behavioral biases.
Score: 3.5063551678446494
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The human cognitive system exhibits remarkable flexibility and generalization capabilities, partly due to its ability to form low-dimensional, compositional representations of the environment. In contrast, standard neural network architectures often struggle with abstract reasoning tasks, overfitting, and requiring extensive data for training. This paper investigates the impact of the relational bottleneck -- a mechanism that focuses processing on relations among inputs -- on the learning of factorized representations conducive to compositional coding and the attendant flexibility of processing. We demonstrate that such a bottleneck not only improves generalization and learning efficiency, but also aligns network performance with human-like behavioral biases. Networks trained with the relational bottleneck developed orthogonal representations of feature dimensions latent in the dataset, reflecting the factorized structure thought to underlie human cognitive flexibility. Moreover, the relational network mimics human biases towards regularity without pre-specified symbolic primitives, suggesting that the bottleneck fosters the emergence of abstract representations that confer flexibility akin to symbols.

Related papers

Adjustment for Confounding using Pre-Trained Representations [2.916285040262091]
We investigate how latent features from pre-trained neural networks can be leveraged to adjust for sources of confounding.<n>We show that neural networks can achieve fast convergence rates by adapting to intrinsic notions of sparsity and dimension of the learning problem.
arXiv Detail & Related papers (2025-06-17T09:11:17Z)
Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks. We show that the networks acquire strong, data-dependent features. Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
Discrete, compositional, and symbolic representations through attractor dynamics [51.20712945239422]
We introduce a novel neural systems model that integrates attractor dynamics with symbolic representations to model cognitive processes akin to the probabilistic language of thought (PLoT) Our model segments the continuous representational space into discrete basins, with attractor states corresponding to symbolic sequences, that reflect the semanticity and compositionality characteristic of symbolic systems through unsupervised learning, rather than relying on pre-defined primitives. This approach establishes a unified framework that integrates both symbolic and sub-symbolic processing through neural dynamics, a neuroplausible substrate with proven expressivity in AI, offering a more comprehensive model that mirrors the complex duality of cognitive operations
arXiv Detail & Related papers (2023-10-03T05:40:56Z)
The Relational Bottleneck as an Inductive Bias for Efficient Abstraction [3.19883356005403]
We show that neural networks are constrained via their architecture to focus on relations between perceptual inputs, rather than the attributes of individual inputs. We review a family of models that employ this approach to induce abstractions in a data-efficient manner.
arXiv Detail & Related papers (2023-09-12T22:44:14Z)
On Neural Architecture Inductive Biases for Relational Tasks [76.18938462270503]
We introduce a simple architecture based on similarity-distribution scores which we name Compositional Network generalization (CoRelNet) We find that simple architectural choices can outperform existing models in out-of-distribution generalizations.
arXiv Detail & Related papers (2022-06-09T16:24:01Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation. Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
On the Binding Problem in Artificial Neural Networks [12.04468744445707]
We argue that the underlying cause for this shortcoming is their inability to dynamically and flexibly bind information. We propose a unifying framework that revolves around forming meaningful entities from unstructured sensory inputs. We believe that a compositional approach to AI, in terms of grounded symbol-like representations, is of fundamental importance for realizing human-level generalization.
arXiv Detail & Related papers (2020-12-09T18:02:49Z)
Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis. By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner. This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.