Compositional Processing Emerges in Neural Networks Solving Math
Problems
- URL: http://arxiv.org/abs/2105.08961v1
- Date: Wed, 19 May 2021 07:24:42 GMT
- Title: Compositional Processing Emerges in Neural Networks Solving Math
Problems
- Authors: Jacob Russin, Roland Fernandez, Hamid Palangi, Eric Rosen, Nebojsa
Jojic, Paul Smolensky, Jianfeng Gao
- Abstract summary: Recent progress in artificial neural networks has shown that when large models are trained on enough linguistic data, grammatical structure emerges in their representations.
We extend this work to the domain of mathematical reasoning, where it is possible to formulate precise hypotheses about how meanings should be composed.
Our work shows that neural networks are not only able to infer something about the structured relationships implicit in their training data, but can also deploy this knowledge to guide the composition of individual meanings into composite wholes.
- Score: 100.80518350845668
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A longstanding question in cognitive science concerns the learning mechanisms
underlying compositionality in human cognition. Humans can infer the structured
relationships (e.g., grammatical rules) implicit in their sensory observations
(e.g., auditory speech), and use this knowledge to guide the composition of
simpler meanings into complex wholes. Recent progress in artificial neural
networks has shown that when large models are trained on enough linguistic
data, grammatical structure emerges in their representations. We extend this
work to the domain of mathematical reasoning, where it is possible to formulate
precise hypotheses about how meanings (e.g., the quantities corresponding to
numerals) should be composed according to structured rules (e.g., order of
operations). Our work shows that neural networks are not only able to infer
something about the structured relationships implicit in their training data,
but can also deploy this knowledge to guide the composition of individual
meanings into composite wholes.
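To make the abstract's example concrete, here is a minimal sketch (not the paper's model or code) of what it means to compose numeral meanings according to a structured rule such as order of operations: the same token sequence receives different values depending on whether composition respects operator precedence or proceeds flatly from left to right.

```python
# Toy illustration (not the paper's method): composing numeral meanings
# according to a structured rule (operator precedence) versus a flat
# left-to-right reading of the same token sequence.

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def eval_structured(tokens):
    """Compose values respecting order of operations: * binds tighter than +."""
    # First pass: collapse multiplications into the preceding term.
    stack = [float(tokens[0])]
    i = 1
    while i < len(tokens):
        op, val = tokens[i], float(tokens[i + 1])
        if op == "*":
            stack[-1] = stack[-1] * val
        else:
            stack.append(val)
        i += 2
    # Second pass: sum the remaining terms.
    return sum(stack)

def eval_flat(tokens):
    """Compose values strictly left to right, ignoring precedence."""
    result = float(tokens[0])
    for i in range(1, len(tokens), 2):
        result = OPS[tokens[i]](result, float(tokens[i + 1]))
    return result

expr = "3 + 4 * 2".split()
print(eval_structured(expr))  # 11.0 -- precedence-respecting composition
print(eval_flat(expr))        # 14.0 -- flat left-to-right composition
```

The point of the contrast is that knowing the individual meanings (3, 4, 2, +, *) is not enough; the structured rule determines how those meanings are combined into the composite whole.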
Related papers
- A Complexity-Based Theory of Compositionality [53.025566128892066]
In AI, compositional representations can enable a powerful form of out-of-distribution generalization.
Here, we propose a formal definition of compositionality that accounts for and extends our intuitions about compositionality.
The definition is conceptually simple, quantitative, grounded in algorithmic information theory, and applicable to any representation.
arXiv Detail & Related papers (2024-10-18T18:37:27Z) - From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks [0.0]
We review recent empirical work from machine learning for a broad audience in philosophy, cognitive science, and neuroscience.
In particular, our review emphasizes two approaches to endowing neural networks with compositional generalization capabilities.
We conclude by discussing the implications that these findings may have for the study of compositionality in human cognition.
arXiv Detail & Related papers (2024-05-24T02:36:07Z) - Improving Compositional Generalization Using Iterated Learning and
Simplicial Embeddings [19.667133565610087]
Compositional generalization is easy for humans but hard for deep neural networks.
We propose to improve this ability by using iterated learning on models with simplicial embeddings.
We show that this combination of changes improves compositional generalization over other approaches.
arXiv Detail & Related papers (2023-10-28T18:30:30Z) - How Do Transformers Learn Topic Structure: Towards a Mechanistic
Understanding [56.222097640468306]
We provide a mechanistic understanding of how transformers learn "semantic structure".
We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
arXiv Detail & Related papers (2023-03-07T21:42:17Z) - Benchmarking Compositionality with Formal Languages [64.09083307778951]
We investigate whether large neural models in NLP can acquire the ability to combine primitive concepts into larger novel combinations while learning from data.
By randomly sampling over many transducers, we explore which of their properties contribute to learnability of a compositional relation by a neural network.
We find that the models either learn the relations completely or not at all. The key factor is transition coverage, which sets a soft learnability limit at 400 examples per transition (a toy sketch of transition coverage appears after this list).
arXiv Detail & Related papers (2022-08-17T10:03:18Z) - Modelling Compositionality and Structure Dependence in Natural Language [0.12183405753834563]
Drawing on linguistics and set theory, the first half of this thesis presents a formalisation of these ideas.
We see how cognitive systems that process language must satisfy certain functional constraints.
Using advances in word-embedding techniques, a model of relational learning is simulated.
arXiv Detail & Related papers (2020-11-22T17:28:50Z) - Understanding understanding: a renormalization group inspired model of
(artificial) intelligence [0.0]
This paper is about the meaning of understanding in scientific and in artificial intelligent systems.
We give a mathematical definition of understanding in which, contrary to common wisdom, the probability space is defined on the input set.
We show how scientific understanding fits into this framework and demonstrate how a scientific task differs from pattern recognition.
arXiv Detail & Related papers (2020-10-26T11:11:46Z) - Compositional Networks Enable Systematic Generalization for Grounded
Language Understanding [21.481360281719006]
Humans are remarkably flexible when understanding new sentences that include combinations of concepts they have never encountered before.
Recent work has shown that while deep networks can mimic some human language abilities when presented with novel sentences, systematic variation uncovers the limitations in the language-understanding abilities of networks.
We demonstrate that these limitations can be overcome by addressing the generalization challenges in the gSCAN dataset.
arXiv Detail & Related papers (2020-08-06T16:17:35Z) - Compositional Explanations of Neurons [52.71742655312625]
We describe a procedure for explaining neurons in deep representations by identifying compositional logical concepts.
We use this procedure to answer several questions on interpretability in models for vision and natural language processing.
arXiv Detail & Related papers (2020-06-24T20:37:05Z) - Compositional Generalization by Learning Analytical Expressions [87.15737632096378]
A memory-augmented neural model is connected with analytical expressions to achieve compositional generalization.
Experiments on the well-known SCAN benchmark demonstrate that our model achieves strong compositional generalization.
arXiv Detail & Related papers (2020-06-18T15:50:57Z)
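For the "Benchmarking Compositionality with Formal Languages" entry above, here is a rough, hypothetical sketch of how "transition coverage" could be measured for a randomly sampled finite-state transducer; the transducer construction, alphabet, and sampling sizes are illustrative assumptions, not the paper's setup.

```python
# Hypothetical sketch (not the paper's code) of "transition coverage":
# how many (state, symbol) transitions of a randomly sampled deterministic
# transducer are exercised by a training sample, and how often.
import random
from collections import Counter

random.seed(0)
ALPHABET = list("ab")
N_STATES = 4

# Sample a random deterministic transducer: (state, symbol) -> (next_state, output).
transitions = {
    (s, a): (random.randrange(N_STATES), random.choice(ALPHABET))
    for s in range(N_STATES)
    for a in ALPHABET
}

def transduce(word):
    """Run the transducer on a word, recording which transitions fire."""
    state, output, used = 0, [], []
    for symbol in word:
        nxt, out = transitions[(state, symbol)]
        used.append((state, symbol))
        output.append(out)
        state = nxt
    return "".join(output), used

# Generate a small training set of random words and count transition usage.
usage = Counter()
for _ in range(400):
    word = "".join(random.choices(ALPHABET, k=random.randint(1, 8)))
    _, used = transduce(word)
    usage.update(used)

coverage = len(usage) / len(transitions)       # fraction of transitions seen at all
print(f"transition coverage: {coverage:.2f}")
print(f"fewest uses of any covered transition: {min(usage.values())}")
```

Under this reading, the abstract's finding would correspond to a relation becoming learnable only once every transition is covered by enough training examples (on the order of 400 per transition).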
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.