Abstraction, Reasoning and Deep Learning: A Study of the "Look and Say"
Sequence
- URL: http://arxiv.org/abs/2109.12755v1
- Date: Mon, 27 Sep 2021 01:41:37 GMT
- Title: Abstraction, Reasoning and Deep Learning: A Study of the "Look and Say"
Sequence
- Authors: Wlodek W. Zadrozny
- Abstract summary: Deep neural networks can exhibit high `competence' (as measured by accuracy) when trained on large data sets.
We report on two sets of experiments on the ``Look and Say" puzzle data.
Despite the amazing accuracy (on both training and test data), the performance of the trained programs on the actual L&S sequence is bad.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability to abstract, count, and use System 2 reasoning are well-known
manifestations of intelligence and understanding. In this paper, we argue,
using the example of the ``Look and Say" puzzle, that although deep neural
networks can exhibit high `competence' (as measured by accuracy) when trained
on large data sets (2M examples in our case), they do not show any sign of a
deeper understanding of the problem, or what D. Dennett calls `comprehension'.
We report on two sets of experiments on the ``Look and Say" puzzle data. We view
the problem as building a translator from one set of tokens to another. We
apply both standard LSTMs and Transformer/Attention-based neural networks,
using publicly available machine translation software. We observe that despite
the amazing accuracy (on both training and test data), the performance of the
trained programs on the actual L\&S sequence is bad. We then discuss a few
possible ramifications of this finding and connections to other work,
experimental and theoretical. First, from the cognitive science perspective, we
argue that we need better mathematical models of abstraction. Second, the
classical and more recent results on the universality of neural networks should
be re-examined for functions acting on discrete data sets. Mappings on discrete
sets usually have no natural continuous extensions. This connects the results
on a simple puzzle to more sophisticated results on the modeling of mathematical
functions, where algebraic functions are more difficult to model than, e.g.,
differential equations. Third, we hypothesize that for problems such as ``Look
and Say", computing the parity of bitstrings, or learning integer addition, it
might be worthwhile to introduce concepts from topology, where continuity is
defined without reference to the concept of distance.
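To make the setup concrete, here is a minimal sketch (not the authors' code) of the ``Look and Say" rewriting step and of the translation-style (source, target) pairs the abstract describes; the helper names `next_term` and `make_pair` are illustrative, not taken from the paper.

```python
# Minimal sketch of the "Look and Say" step and of the translation-style
# training pairs described in the abstract. Helper names are illustrative.
from itertools import groupby

def next_term(s: str) -> str:
    """Read off runs of identical digits: '1211' reads as
    'one 1, one 2, two 1s', i.e. '111221'."""
    return "".join(f"{len(list(run))}{digit}" for digit, run in groupby(s))

def look_and_say(seed: str = "1", n: int = 8):
    """Yield the first n terms of the sequence starting from seed."""
    term = seed
    for _ in range(n):
        yield term
        term = next_term(term)

def make_pair(term: str):
    """Tokenize a term and its successor digit by digit,
    as a machine-translation (source, target) sentence pair."""
    return " ".join(term), " ".join(next_term(term))

print(list(look_and_say("1", 6)))  # ['1', '11', '21', '1211', '111221', '312211']
print(make_pair("1211"))           # ('1 2 1 1', '1 1 1 2 2 1')
```

In this framing, a model that had truly `comprehended' the rule should extrapolate to the longer terms of the actual sequence; the paper's observation is that high training and test accuracy does not deliver this.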
Related papers
- Utility-Probability Duality of Neural Networks [4.871730595406078]
We propose an alternative, utility-based explanation of the standard supervised learning procedure in deep learning.
The basic idea is to interpret the learned neural network not as a probability model but as an ordinal utility function.
We show that for all neural networks with softmax outputs, the SGD learning dynamic of maximum likelihood estimation can be seen as an iteration process.
arXiv Detail & Related papers (2023-05-24T08:09:07Z)
- Are Deep Neural Networks SMARTer than Second Graders? [85.60342335636341]
We evaluate the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed for children in the 6--8 age group.
Our dataset consists of 101 unique puzzles; each puzzle comprises a picture and a question, and solving it requires a mix of several elementary skills, including arithmetic, algebra, and spatial reasoning.
Experiments reveal that while powerful deep models offer reasonable performance on these puzzles in a supervised setting, they are no better than random accuracy when analyzed for generalization.
arXiv Detail & Related papers (2022-12-20T04:33:32Z)
- Stop ordering machine learning algorithms by their explainability! A user-centered investigation of performance and explainability [0.0]
We study the tradeoff between the performance and the explainability of machine learning algorithms.
We find that the tradeoff is much less gradual in the end user's perception.
Results of our second experiment show that while explainable artificial intelligence augmentations can be used to increase explainability, the type of explanation plays an essential role in end user perception.
arXiv Detail & Related papers (2022-06-20T08:32:38Z)
- Emergence of Machine Language: Towards Symbolic Intelligence with Neural Networks [73.94290462239061]
We propose to combine symbolism and connectionism principles by using neural networks to derive a discrete representation.
By designing an interactive environment and task, we demonstrated that machines could generate a spontaneous, flexible, and semantic language.
arXiv Detail & Related papers (2022-01-14T14:54:58Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can make two well-separated classes of data linearly separable with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
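A hypothetical illustration of this kind of construction (assumed details, not code from that paper): lift the data through a random ReLU layer with standard Gaussian weights and uniform biases, then test linear separability with a perceptron; the data set, layer width, and bias range below are assumptions chosen for demonstration.

```python
# Hypothetical sketch of a random two-layer ReLU lift; the data, width,
# and bias range are illustrative assumptions, not the paper's setup.
import numpy as np
from sklearn.linear_model import Perceptron

rng = np.random.default_rng(0)

def random_relu_features(X, width=2000, bias_range=1.0):
    """Apply one random ReLU layer: Gaussian weights, uniform biases."""
    W = rng.standard_normal((X.shape[1], width))
    b = rng.uniform(-bias_range, bias_range, width)
    return np.maximum(X @ W + b, 0.0)

# Two concentric circles: well separated, but not linearly separable.
theta = rng.uniform(0.0, 2.0 * np.pi, 200)
X = np.vstack([np.c_[0.5 * np.cos(theta), 0.5 * np.sin(theta)],
               np.c_[1.5 * np.cos(theta), 1.5 * np.sin(theta)]])
y = np.r_[np.zeros(200), np.ones(200)]

print(Perceptron().fit(X, y).score(X, y))      # well below 1.0 in raw space
Phi = random_relu_features(X)
print(Perceptron().fit(Phi, y).score(Phi, y))  # 1.0 with high probability
```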
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
- Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units [86.9207811656179]
We show that memory-augmented neural networks (NNs) can achieve higher-order, memory-augmented extrapolation, stable performance, and faster convergence.
Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
arXiv Detail & Related papers (2021-04-07T03:50:11Z)
- Perspective: A Phase Diagram for Deep Learning unifying Jamming, Feature Learning and Lazy Training [4.318555434063275]
Deep learning algorithms are responsible for a technological revolution in a variety of tasks, including image recognition and Go playing.
Yet, why they work is not understood. Ultimately, they manage to classify data lying in high dimension -- a feat generically impossible.
We argue that different learning regimes can be organized into a phase diagram.
arXiv Detail & Related papers (2020-12-30T11:00:36Z)
- Logic Tensor Networks [9.004005678155023]
We present Logic Tensor Networks (LTN), a neurosymbolic formalism and computational model that supports learning and reasoning.
We show that LTN provides a uniform language for the specification and the computation of several AI tasks.
arXiv Detail & Related papers (2020-12-25T22:30:18Z)
- Machine Number Sense: A Dataset of Visual Arithmetic Problems for Abstract and Relational Reasoning [95.18337034090648]
We propose a dataset, Machine Number Sense (MNS), consisting of visual arithmetic problems automatically generated using a grammar model, the And-Or Graph (AOG).
These visual arithmetic problems are in the form of geometric figures.
We benchmark the MNS dataset using four predominant neural network models as baselines in this visual reasoning task.
arXiv Detail & Related papers (2020-04-25T17:14:58Z)