Related papers: Break It Down: Evidence for Structural Compositionality in Neural Networks

URL: http://arxiv.org/abs/2301.10884v2
Date: Mon, 6 Nov 2023 19:25:02 GMT
Title: Break It Down: Evidence for Structural Compositionality in Neural Networks
Authors: Michael A. Lepori, Thomas Serre, Ellie Pavlick
Abstract summary: Using model pruning, we show that models often implement solutions to subroutines via modular subnetworks. This suggests that neural networks may be able to learn compositionality, obviating the need for specialized symbolic mechanisms.
Score: 32.382094867951224
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Though modern neural networks have achieved impressive performance in both vision and language tasks, we know little about the functions that they implement. One possibility is that neural networks implicitly break down complex tasks into subroutines, implement modular solutions to these subroutines, and compose them into an overall solution to a task - a property we term structural compositionality. Another possibility is that they may simply learn to match new inputs to learned templates, eliding task decomposition entirely. Here, we leverage model pruning techniques to investigate this question in both vision and language across a variety of architectures, tasks, and pretraining regimens. Our results demonstrate that models often implement solutions to subroutines via modular subnetworks, which can be ablated while maintaining the functionality of other subnetworks. This suggests that neural networks may be able to learn compositionality, obviating the need for specialized symbolic mechanisms.
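
The ablation idea in the abstract can be illustrated with a minimal sketch. This is a toy under stated assumptions, not the authors' method: the weights and the binary masks are set by hand (the paper discovers subnetworks with pruning techniques), and the names `W1`, `forward`, `FULL_MASK`, and `ABLATE_A` are hypothetical. The composed task is |x0| + |x1|, with subroutine A computing |x0| and subroutine B computing |x1|.

```python
# Toy illustration only: hand-set weights and masks, not a trained or pruned model.

def relu(v):
    return max(0.0, v)

# Hidden layer weights. Units 0-1 read only input 0 (subroutine A),
# units 2-3 read only input 1 (subroutine B).
W1 = [
    [1.0, 0.0],
    [-1.0, 0.0],
    [0.0, 1.0],
    [0.0, -1.0],
]

# A binary mask with the same shape as W1; 0 removes a weight.
FULL_MASK = [[1, 1], [1, 1], [1, 1], [1, 1]]
# Assumed subnetwork for subroutine A: the weights of units 0-1.
ABLATE_A = [[0, 0], [0, 0], [1, 1], [1, 1]]

def forward(x, mask):
    """Return (subroutine A output, subroutine B output, composed output)."""
    hidden = [
        relu(sum(m * w * xi for m, w, xi in zip(mask_row, w_row, x)))
        for mask_row, w_row in zip(mask, W1)
    ]
    abs_a = hidden[0] + hidden[1]  # relu(x0) + relu(-x0) = |x0|
    abs_b = hidden[2] + hidden[3]  # relu(x1) + relu(-x1) = |x1|
    return abs_a, abs_b, abs_a + abs_b

intact = forward([-2.0, 3.0], FULL_MASK)
ablated = forward([-2.0, 3.0], ABLATE_A)
print(intact)   # subroutine A and B both functional
print(ablated)  # subroutine A removed, subroutine B unchanged
```

In this sketch, zeroing the mask entries for subroutine A drives its output to zero while the output of subroutine B is unchanged, which is the pattern the abstract describes as ablating one modular subnetwork while maintaining the functionality of the others.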

Related papers

Spatial embedding promotes a specific form of modularity with low entropy and heterogeneous spectral dynamics [0.0]
Spatially embedded recurrent neural networks provide a promising avenue to study how modelled constraints shape the combined structural and functional organisation of networks over learning. We show that it is possible to study these restrictions through entropic measures of the neural weights and eigenspectrum, across both rate and spiking neural networks. This work deepens our understanding of constrained learning in neural networks, across coding schemes and tasks, where solutions to simultaneous structural and functional objectives must be accomplished in tandem.
arXiv Detail & Related papers (2024-09-26T10:00:05Z)
Breaking Neural Network Scaling Laws with Modularity [8.482423139660153]
We show how the amount of training data required to generalize varies with the intrinsic dimensionality of a task's input. We then develop a novel learning rule for modular networks to exploit this advantage.
arXiv Detail & Related papers (2024-09-09T16:43:09Z)
Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks. We show that the networks acquire strong, data-dependent features. Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
Semantic Loss Functions for Neuro-Symbolic Structured Prediction [74.18322585177832]
We discuss the semantic loss, which injects knowledge about such structure, defined symbolically, into training. It is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby. It can be combined with both discriminative and generative neural models.
arXiv Detail & Related papers (2024-05-12T22:18:25Z)
The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex. We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
Can Transformers Learn to Solve Problems Recursively? [9.5623664764386]
This paper examines the behavior of neural networks learning algorithms relevant to programs and formal verification. By reconstructing these algorithms, we are able to correctly predict 91 percent of failure cases for one of the approximated functions.
arXiv Detail & Related papers (2023-05-24T04:08:37Z)
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks [12.130628846129973]
We introduce the Gated Deep Linear Network framework that schematizes how pathways of information flow impact learning dynamics. We derive an exact reduction and, for certain cases, exact solutions to the dynamics of learning. Our work gives rise to general hypotheses relating neural architecture to learning and provides a mathematical approach towards understanding the design of more complex architectures.
arXiv Detail & Related papers (2022-07-21T12:01:03Z)
Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules. Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned. We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks [10.0444013205203]
Understanding if and how NNs are modular could provide insights into how to improve them. Current inspection methods, however, fail to link modules to their functionality.
arXiv Detail & Related papers (2020-10-05T15:04:11Z)
Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks. First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts. Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z)
Automated Search for Resource-Efficient Branched Multi-Task Networks [81.48051635183916]
We propose a principled approach, rooted in differentiable neural architecture search, to automatically define branching structures in a multi-task neural network. We show that our approach consistently finds high-performing branching structures within limited resource budgets.
arXiv Detail & Related papers (2020-08-24T09:49:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.