Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms
- URL: http://arxiv.org/abs/2209.00735v1
- Date: Thu, 1 Sep 2022 21:55:22 GMT
- Title: Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms
- Authors: Surbhi Goel, Sham Kakade, Adam Tauman Kalai, Cyril Zhang
- Abstract summary: We show a NN architecture that learns as well as any efficient learning algorithm describable by a constant-sized learning algorithm.
Our architecture combines both recurrent weight-sharing between layers and convolutional weight-sharing to reduce the number of parameters down to a constant.
While in practice the constants in our analysis are too large to be directly meaningful, our work suggests that the synergy of Recurrent and Convolutional NNs may be more powerful than either alone.
- Score: 25.1675203905385
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Networks (NNs) struggle to efficiently learn certain problems, such as
parity problems, even when there are simple learning algorithms for those
problems. Can NNs discover learning algorithms on their own? We exhibit a NN
architecture that, in polynomial time, learns as well as any efficient learning
algorithm describable by a constant-sized learning algorithm. For example, on
parity problems, the NN learns as well as row reduction, an efficient algorithm
that can be succinctly described. Our architecture combines both recurrent
weight-sharing between layers and convolutional weight-sharing to reduce the
number of parameters down to a constant, even though the network itself may
have trillions of nodes. While in practice the constants in our analysis are
too large to be directly meaningful, our work suggests that the synergy of
Recurrent and Convolutional NNs (RCNNs) may be more powerful than either alone.
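The parity example above is concrete enough to sketch. Below is a minimal illustration of the succinctly describable algorithm the paper says the NN can match on parities: row reduction (Gaussian elimination) over GF(2). This is our own sketch of the classical algorithm, not the paper's RCNN construction; the function name `learn_parity` and the toy sizes are ours.

```python
import numpy as np

def learn_parity(X, y):
    """Recover a hidden parity set by row reduction over GF(2).

    X: (m, n) array of 0/1 examples; y: (m,) array of 0/1 labels with
    y[i] = XOR of a fixed hidden subset of coordinates of X[i].
    Returns a 0/1 vector s consistent with X @ s = y (mod 2).
    """
    A = np.concatenate([X, y[:, None]], axis=1).astype(np.uint8)
    m, n = X.shape
    pivot_cols, row = [], 0
    for col in range(n):
        hits = np.nonzero(A[row:, col])[0]      # rows with a 1 in this column
        if hits.size == 0:
            continue                            # free column, no pivot here
        A[[row, row + hits[0]]] = A[[row + hits[0], row]]  # swap pivot up
        for r in range(m):                      # eliminate everywhere else
            if r != row and A[r, col]:
                A[r] ^= A[row]                  # row addition mod 2
        pivot_cols.append(col)
        row += 1
        if row == m:
            break
    s = np.zeros(n, dtype=np.uint8)             # free variables set to 0
    for r, col in enumerate(pivot_cols):
        s[col] = A[r, n]
    return s

# toy run: the hidden parity is over coordinates {1, 3}
rng = np.random.default_rng(0)
n, m = 8, 40
secret = np.zeros(n, dtype=np.uint8)
secret[[1, 3]] = 1
X = rng.integers(0, 2, size=(m, n), dtype=np.uint8)
y = (X @ secret) % 2
print(secret, learn_parity(X, y))  # matches w.h.p. (X has full column rank)
```

With enough random examples the system has full column rank over GF(2) with high probability, so the recovered indicator vector equals the hidden parity set; the paper's point is that a constant-parameter RCNN can learn to emulate such a succinct procedure.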
Related papers
- The Deep Equilibrium Algorithmic Reasoner [20.375241527453447]
We show that graph neural networks (GNNs) can learn to execute classical algorithms.
We conjecture and empirically validate that one can train a network to solve algorithmic problems by directly finding the equilibrium.
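As a hedged illustration of "directly finding the equilibrium": a deep-equilibrium layer defines its output implicitly as a fixed point z* = f(z*, x) and computes it by iteration (or root-finding) rather than by stacking a fixed number of layers. The sketch below substitutes a made-up contractive affine-plus-tanh update for the paper's GNN message-passing step; all names and constants are ours.

```python
import numpy as np

def deq_forward(f, x, z0, tol=1e-6, max_iter=200):
    """Deep-equilibrium forward pass: iterate z <- f(z, x) to a fixed point."""
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z  # may not have converged; real DEQs use root-finding (e.g. Broyden)

rng = np.random.default_rng(0)
# contractive map (spectral norm < 1), so a unique fixed point exists
# by the Banach fixed-point theorem
W = 0.4 * rng.standard_normal((16, 16)) / np.sqrt(16)
U = rng.standard_normal((16, 16)) / np.sqrt(16)

def layer(z, x):
    return np.tanh(W @ z + U @ x)

x = rng.standard_normal(16)
z_star = deq_forward(layer, x, z0=np.zeros(16))
print(np.linalg.norm(layer(z_star, x) - z_star))  # ~0: z_star is an equilibrium
```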
arXiv Detail & Related papers (2024-02-09T14:46:50Z)
- NN-Steiner: A Mixed Neural-algorithmic Approach for the Rectilinear Steiner Minimum Tree Problem [5.107107601277712]
We focus on the rectilinear Steiner minimum tree (RSMT) problem, which is of critical importance in IC layout design.
We propose NN-Steiner, which is a novel mixed neural-algorithmic framework for computing RSMTs.
In particular, NN-Steiner only needs four neural network (NN) components that are called repeatedly within an algorithmic framework.
arXiv Detail & Related papers (2023-12-17T02:42:11Z)
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex: even simple learning problems can admit a surprising diversity of solutions.
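One of the paper's two "stories" is a rotation-based ("clock") algorithm for the modular-addition task it studies. As an illustration of that algorithm itself (not of a trained network), addition mod p can be computed by embedding residues as angles on the unit circle and composing rotations; the modulus below is our own choice.

```python
import numpy as np

p = 59  # modulus for the modular-addition task (our choice)

def clock_add(a, b, p=p):
    """'Clock' algorithm: embed residues on the unit circle, add angles."""
    za = np.exp(2j * np.pi * a / p)   # a -> rotation by angle 2*pi*a/p
    zb = np.exp(2j * np.pi * b / p)
    angle = np.angle(za * zb)         # multiplying rotations adds angles
    return int(np.round(angle * p / (2 * np.pi))) % p

assert all(clock_add(a, b) == (a + b) % p
           for a in range(p) for b in range(p))
```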
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- Neural Algorithmic Reasoning for Combinatorial Optimisation [20.36694807847833]
We propose leveraging recent advancements in neural algorithmic reasoning to improve the learning of combinatorial optimisation (CO) problems.
We suggest pre-training our neural model on relevant algorithms before training it on CO instances.
Our results demonstrate that by using this learning setup, we achieve superior performance compared to non-algorithmically informed deep learning models.
arXiv Detail & Related papers (2023-05-18T13:59:02Z)
- Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
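For a one-hidden-layer ReLU network f(x) = Σ_j a_j ReLU(w_j·x + b_j), the empirical NTK is the Gram matrix of parameter gradients, K(x, x') = ⟨∇_θ f(x), ∇_θ f(x')⟩, and a large negative bias at initialization makes the ReLU activations sparse. The sketch below computes this kernel from closed-form gradients; it is a generic illustration of the objects involved, not the paper's bias-generalized limit.

```python
import numpy as np

def empirical_ntk(X, W, a, b):
    """Empirical NTK of f(x) = sum_j a_j * relu(W[j] @ x + b[j]).

    Closed-form gradients: df/da_j = relu(pre_j),
    df/dW_j = a_j * 1[pre_j > 0] * x, df/db_j = a_j * 1[pre_j > 0].
    Returns K with K[i, k] = <grad_theta f(x_i), grad_theta f(x_k)>.
    """
    pre = X @ W.T + b                 # (m, h) pre-activations
    act = np.maximum(pre, 0.0)        # relu(pre)
    ind = (pre > 0).astype(float)     # activation pattern 1[pre > 0]
    K_a = act @ act.T                                  # d/da contribution
    K_W = ((ind * a) @ (ind * a).T) * (X @ X.T)        # d/dW contribution
    K_b = (ind * a) @ (ind * a).T                      # d/db contribution
    return K_a + K_W + K_b

rng = np.random.default_rng(0)
d, h, m = 10, 512, 5
W = rng.standard_normal((h, d)) / np.sqrt(d)
a = rng.standard_normal(h) / np.sqrt(h)
X = rng.standard_normal((m, d))

for bias in (0.0, -2.0):              # large negative bias -> sparse activation
    b = np.full(h, bias)
    frac_active = ((X @ W.T + b) > 0).mean()
    print(f"bias={bias:+.1f}  active fraction={frac_active:.2f}")
    print(empirical_ntk(X, W, a, b).round(2))
```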
arXiv Detail & Related papers (2023-01-01T02:11:39Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks [47.54459795966417]
We show that recurrent networks trained to solve simple problems can indeed solve much more complex problems simply by performing additional recurrences during inference.
In all three domains, networks trained on simple problem instances extend their reasoning abilities at test time simply by "thinking for longer".
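The mechanism can be illustrated without any training: a weight-tied recurrent block propagates information a bounded distance per application, so harder (longer) instances need more iterations of the same block. The toy below substitutes a hand-set local update (neighborhood max) for the learned block; it illustrates "thinking for longer", not the paper's trained networks.

```python
import numpy as np

def recurrent_block(state):
    """One weight-tied 'layer': each cell takes the max of its neighborhood.
    After k applications a cell sees a radius-k window, so the problem size
    dictates how many recurrences are needed, not the parameter count."""
    left = np.concatenate([state[:1], state[:-1]])
    right = np.concatenate([state[1:], state[-1:]])
    return np.maximum(state, np.maximum(left, right))

def solve(x, n_iters):
    state = x.copy()
    for _ in range(n_iters):          # "thinking for longer" = more iterations
        state = recurrent_block(state)
    return state

rng = np.random.default_rng(0)
short = rng.standard_normal(8)        # "training-size" instance
long = rng.standard_normal(64)        # harder test instance
print(np.allclose(solve(short, 7), short.max()))  # True: 7 steps suffice
print(np.allclose(solve(long, 7), long.max()))    # False: needs more thought
print(np.allclose(solve(long, 63), long.max()))   # True with extra recurrences
```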
arXiv Detail & Related papers (2021-06-08T17:19:48Z)
- Reinforcement Learning with External Knowledge by using Logical Neural Networks [67.46162586940905]
A recent neuro-symbolic framework called Logical Neural Networks (LNNs) can simultaneously provide key properties of both neural networks and symbolic logic.
We propose an integrated method that enables model-free reinforcement learning from external knowledge sources.
arXiv Detail & Related papers (2021-03-03T12:34:59Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) in latency and computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
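A common idea behind ANN-to-SNN conversion can be sketched in a few lines: a rate-coded integrate-and-fire (IF) neuron driven by a constant input current fires at a rate that approximates a ReLU of that current, so trained ANN activations map onto SNN firing rates. This is generic rate coding, not this paper's specific layer-wise scheme; names and constants are ours.

```python
def if_rate(current, T=1000, v_thresh=1.0):
    """Integrate-and-fire neuron with soft reset; returns spikes per step."""
    v, spikes = 0.0, 0
    for _ in range(T):
        v += current              # integrate the (constant) input current
        if v >= v_thresh:
            v -= v_thresh         # soft reset keeps the residual charge
            spikes += 1
    return spikes / T

# firing rate * v_thresh approximates relu(current) for current <= v_thresh
for c in (-0.5, 0.0, 0.2, 0.7, 1.0):
    print(f"current={c:+.1f}  relu={max(c, 0):.3f}  snn rate={if_rate(c):.3f}")
```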
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Artificial Neural Network Pruning to Extract Knowledge [0.0]
This paper lists the basic NN simplification problems and controlled pruning procedures that solve them.
The developed procedures find the optimal NN structure for each task, measure the influence of each input signal and each NN parameter, and provide a detailed verbal description of the algorithms and skills learned by the NN.
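The "measure the influence of each parameter" step can be illustrated with the simplest criterion: score each weight by the loss increase when it is ablated, then zero out the lowest-scoring ones. The sketch below applies this to a linear model standing in for an NN layer; the paper's controlled pruning procedures are more elaborate, and the threshold here is our own.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
w_true = np.array([3.0, -2.0, 0.0, 0.0, 1.5, 0.0, 0.0, 0.0, 0.5, 0.0])
y = X @ w_true + 0.1 * rng.standard_normal(200)

# least-squares fit stands in for a trained NN layer
w = np.linalg.lstsq(X, y, rcond=None)[0]

def loss(w):
    return np.mean((X @ w - y) ** 2)

# influence of weight i = loss increase when that weight is ablated
influence = np.array([loss(np.where(np.arange(10) == i, 0.0, w)) - loss(w)
                      for i in range(10)])

# prune weights whose ablation barely changes the loss
pruned = np.where(influence < 0.01, 0.0, w)
print("kept coordinates:", np.nonzero(pruned)[0])   # ~ {0, 1, 4, 8}
print(f"loss before={loss(w):.4f}  after pruning={loss(pruned):.4f}")
```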
arXiv Detail & Related papers (2020-05-13T12:24:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.