Abstraction, Reasoning and Deep Learning: A Study of the "Look and Say"
Sequence
- URL: http://arxiv.org/abs/2109.12755v1
- Date: Mon, 27 Sep 2021 01:41:37 GMT
- Title: Abstraction, Reasoning and Deep Learning: A Study of the "Look and Say"
Sequence
- Authors: Wlodek W. Zadrozny
- Abstract summary: Deep neural networks can exhibit high `competence' (as measured by accuracy) when trained on large data sets.
We report on two sets of experiments on the ``Look and Say" puzzle data.
Despite the amazing accuracy (on both training and test data), the performance of the trained programs on the actual L&S sequence is bad.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability to abstract, count, and use System 2 reasoning are well-known
manifestations of intelligence and understanding. In this paper, we argue,
using the example of the ``Look and Say" puzzle, that although deep neural
networks can exhibit high `competence' (as measured by accuracy) when trained
on large data sets (2M examples in our case), they do not show any sign of a
deeper understanding of the problem, or what D. Dennett calls `comprehension'.
We report on two sets of experiments on the ``Look and Say" puzzle data. We view
the problem as building a translator from one set of tokens to another. We
apply both standard LSTMs and Transformer/Attention-based neural networks,
using publicly available machine translation software. We observe that despite
the amazing accuracy (on both training and test data), the performance of the
trained programs on the actual L\&S sequence is bad. We then discuss a few
possible ramifications of this finding and connections to other work,
experimental and theoretical. First, from the cognitive science perspective, we
argue that we need better mathematical models of abstraction. Second, the
classical and more recent results on the universality of neural networks should
be re-examined for functions acting on discrete data sets. Mappings on discrete
sets usually have no natural continuous extensions. This connects the results
on a simple puzzle to more sophisticated results on the modeling of mathematical
functions, where algebraic functions are more difficult to model than, e.g.,
differential equations. Third, we hypothesize that for problems such as ``Look
and Say", computing the parity of bitstrings, or learning integer addition, it
might be worthwhile to introduce concepts from topology, where continuity is
defined without reference to the concept of distance.
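To make the setup concrete, here is a minimal sketch (not the authors' code) of the ``Look and Say" rewriting step and of the translation-style (source, target) pairs the abstract describes; the helper names `next_term` and `make_pair` are illustrative, not taken from the paper.

```python
# Minimal sketch of the "Look and Say" step and of the translation-style
# training pairs described in the abstract. Helper names are illustrative.
from itertools import groupby

def next_term(s: str) -> str:
    """Read off runs of identical digits: '1211' reads as
    'one 1, one 2, two 1s', i.e. '111221'."""
    return "".join(f"{len(list(run))}{digit}" for digit, run in groupby(s))

def look_and_say(seed: str = "1", n: int = 8):
    """Yield the first n terms of the sequence starting from seed."""
    term = seed
    for _ in range(n):
        yield term
        term = next_term(term)

def make_pair(term: str):
    """Tokenize a term and its successor digit by digit,
    as a machine-translation (source, target) sentence pair."""
    return " ".join(term), " ".join(next_term(term))

print(list(look_and_say("1", 6)))  # ['1', '11', '21', '1211', '111221', '312211']
print(make_pair("1211"))           # ('1 2 1 1', '1 1 1 2 2 1')
```

In this framing, a model that had truly `comprehended' the rule should extrapolate to the longer terms of the actual sequence; the paper's observation is that high training and test accuracy does not deliver this.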
Related papers
- Utility-Probability Duality of Neural Networks [4.871730595406078]
We propose an alternative, utility-based explanation of the standard supervised learning procedure in deep learning.
The basic idea is to interpret the learned neural network not as a probability model but as an ordinal utility function.
We show that for all neural networks with softmax outputs, the SGD learning dynamic of maximum likelihood estimation can be seen as an iteration process.
arXiv Detail & Related papers (2023-05-24T08:09:07Z)
- Are Deep Neural Networks SMARTer than Second Graders? [85.60342335636341]
We evaluate the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed for children in the 6--8 age group.
Our dataset consists of 101 unique puzzles; each puzzle comprises a picture and a question, and solving it requires a mix of several elementary skills, including arithmetic, algebra, and spatial reasoning.
Experiments reveal that while powerful deep models offer reasonable performance on these puzzles in a supervised setting, they are no better than random accuracy when analyzed for generalization.
arXiv Detail & Related papers (2022-12-20T04:33:32Z)
- Stop ordering machine learning algorithms by their explainability! A user-centered investigation of performance and explainability [0.0]
We study the tradeoff between the performance and the explainability of machine learning algorithms.
We find that the tradeoff is much less gradual in the end user's perception.
Results of our second experiment show that while explainable artificial intelligence augmentations can be used to increase explainability, the type of explanation plays an essential role in end user perception.
arXiv Detail & Related papers (2022-06-20T08:32:38Z)
- Emergence of Machine Language: Towards Symbolic Intelligence with Neural Networks [73.94290462239061]
We propose to combine symbolism and connectionism principles by using neural networks to derive a discrete representation.
By designing an interactive environment and task, we demonstrated that machines could generate a spontaneous, flexible, and semantic language.
arXiv Detail & Related papers (2022-01-14T14:54:58Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can make two well-separated classes of data linearly separable with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
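A hypothetical illustration of this kind of construction (assumed details, not code from that paper): lift the data through a random ReLU layer with standard Gaussian weights and uniform biases, then test linear separability with a perceptron; the data set, layer width, and bias range below are assumptions chosen for demonstration.

```python
# Hypothetical sketch of a random two-layer ReLU lift; the data, width,
# and bias range are illustrative assumptions, not the paper's setup.
import numpy as np
from sklearn.linear_model import Perceptron

rng = np.random.default_rng(0)

def random_relu_features(X, width=2000, bias_range=1.0):
    """Apply one random ReLU layer: Gaussian weights, uniform biases."""
    W = rng.standard_normal((X.shape[1], width))
    b = rng.uniform(-bias_range, bias_range, width)
    return np.maximum(X @ W + b, 0.0)

# Two concentric circles: well separated, but not linearly separable.
theta = rng.uniform(0.0, 2.0 * np.pi, 200)
X = np.vstack([np.c_[0.5 * np.cos(theta), 0.5 * np.sin(theta)],
               np.c_[1.5 * np.cos(theta), 1.5 * np.sin(theta)]])
y = np.r_[np.zeros(200), np.ones(200)]

print(Perceptron().fit(X, y).score(X, y))      # well below 1.0 in raw space
Phi = random_relu_features(X)
print(Perceptron().fit(Phi, y).score(Phi, y))  # 1.0 with high probability
```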
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
- Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units [86.9207811656179]
We show that memory-augmented neural networks (NNs) can achieve higher-order, memory-augmented extrapolation, stable performance, and faster convergence.
Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
arXiv Detail & Related papers (2021-04-07T03:50:11Z)
- Perspective: A Phase Diagram for Deep Learning unifying Jamming, Feature Learning and Lazy Training [4.318555434063275]
Deep learning algorithms are responsible for a technological revolution in a variety of tasks, including image recognition and Go playing.
Yet, why they work is not understood. Ultimately, they manage to classify data lying in high dimension -- a feat generically impossible.
We argue that different learning regimes can be organized into a phase diagram.
arXiv Detail & Related papers (2020-12-30T11:00:36Z)
- Logic Tensor Networks [9.004005678155023]
We present Logic Tensor Networks (LTN), a neurosymbolic formalism and computational model that supports learning and reasoning.
We show that LTN provides a uniform language for the specification and the computation of several AI tasks.
arXiv Detail & Related papers (2020-12-25T22:30:18Z)
- Machine Number Sense: A Dataset of Visual Arithmetic Problems for Abstract and Relational Reasoning [95.18337034090648]
We propose a dataset, Machine Number Sense (MNS), consisting of visual arithmetic problems automatically generated using a grammar model, the And-Or Graph (AOG).
These visual arithmetic problems are in the form of geometric figures.
We benchmark the MNS dataset using four predominant neural network models as baselines in this visual reasoning task.
arXiv Detail & Related papers (2020-04-25T17:14:58Z)