Can neural networks do arithmetic? A survey on the elementary numerical
skills of state-of-the-art deep learning models
- URL: http://arxiv.org/abs/2303.07735v1
- Date: Tue, 14 Mar 2023 09:30:52 GMT
- Title: Can neural networks do arithmetic? A survey on the elementary numerical
skills of state-of-the-art deep learning models
- Authors: Alberto Testolin
- Abstract summary: It is unclear whether deep learning models possess an elementary understanding of quantities and symbolic numbers.
We critically examine the recent literature, concluding that even state-of-the-art architectures often fall short when probed with relatively simple tasks designed to test basic numerical and arithmetic knowledge.
- Score: 0.424243593213882
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Creating learning models that can exhibit sophisticated reasoning skills is
one of the greatest challenges in deep learning research, and mathematics is
rapidly becoming one of the target domains for assessing scientific progress in
this direction. In the past few years there has been an explosion of neural
network architectures, data sets, and benchmarks specifically designed to
tackle mathematical problems, reporting notable success in disparate fields
such as automated theorem proving, numerical integration, and discovery of new
conjectures or matrix multiplication algorithms. However, despite these
impressive achievements it is still unclear whether deep learning models
possess an elementary understanding of quantities and symbolic numbers. In this
survey we critically examine the recent literature, concluding that even
state-of-the-art architectures often fall short when probed with relatively
simple tasks designed to test basic numerical and arithmetic knowledge.
Related papers
- Learning to solve arithmetic problems with a virtual abacus [0.35911228556176483]
We introduce a deep reinforcement learning framework that allows to simulate how cognitive agents could learn to solve arithmetic problems.
The proposed model successfully learns to perform multi-digit additions and subtractions, achieving an error rate below 1%.
We analyze the most common error patterns to better understand the limitations and biases resulting from our design choices.
arXiv Detail & Related papers (2023-01-17T13:25:52Z) - A Survey of Deep Learning for Mathematical Reasoning [71.88150173381153]
We review the key tasks, datasets, and methods at the intersection of mathematical reasoning and deep learning over the past decade.
Recent advances in large-scale neural language models have opened up new benchmarks and opportunities to use deep learning for mathematical reasoning.
arXiv Detail & Related papers (2022-12-20T18:46:16Z) - Are Deep Neural Networks SMARTer than Second Graders? [85.60342335636341]
We evaluate the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed for children in the 6--8 age group.
Our dataset consists of 101 unique puzzles; each puzzle comprises a picture question, and their solution needs a mix of several elementary skills, including arithmetic, algebra, and spatial reasoning.
Experiments reveal that while powerful deep models offer reasonable performances on puzzles in a supervised setting, they are not better than random accuracy when analyzed for generalization.
arXiv Detail & Related papers (2022-12-20T04:33:32Z) - Neural Architecture Search for Dense Prediction Tasks in Computer Vision [74.9839082859151]
Deep learning has led to a rising demand for neural network architecture engineering.
neural architecture search (NAS) aims at automatically designing neural network architectures in a data-driven manner rather than manually.
NAS has become applicable to a much wider range of problems in computer vision.
arXiv Detail & Related papers (2022-02-15T08:06:50Z) - Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges [50.22269760171131]
The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods.
This text is concerned with exposing pre-defined regularities through unified geometric principles.
It provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers.
arXiv Detail & Related papers (2021-04-27T21:09:51Z) - Towards a Mathematical Understanding of Neural Network-Based Machine
Learning: what we know and what we don't [11.447492237545788]
This article reviews the achievements made in the last few years towards the understanding of the reasons behind the success and subtleties of neural network-based machine learning.
In the tradition of good old applied mathematics, we will not only give attention to rigorous mathematical results, but also the insight we have gained from careful numerical experiments.
arXiv Detail & Related papers (2020-09-22T17:55:47Z) - Knowledge Distillation: A Survey [87.51063304509067]
Deep neural networks have been successful in both industry and academia, especially for computer vision tasks.
It is a challenge to deploy these cumbersome deep models on devices with limited resources.
Knowledge distillation effectively learns a small student model from a large teacher model.
arXiv Detail & Related papers (2020-06-09T21:47:17Z) - Machine Number Sense: A Dataset of Visual Arithmetic Problems for
Abstract and Relational Reasoning [95.18337034090648]
We propose a dataset, Machine Number Sense (MNS), consisting of visual arithmetic problems automatically generated using a grammar model--And-Or Graph (AOG)
These visual arithmetic problems are in the form of geometric figures.
We benchmark the MNS dataset using four predominant neural network models as baselines in this visual reasoning task.
arXiv Detail & Related papers (2020-04-25T17:14:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.