Learning with Differentiable Algorithms
- URL: http://arxiv.org/abs/2209.00616v1
- Date: Thu, 1 Sep 2022 17:30:00 GMT
- Title: Learning with Differentiable Algorithms
- Authors: Felix Petersen
- Abstract summary: This thesis explores combining classic algorithms and machine learning systems like neural networks.
The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm.
In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable renderers, and differentiable logic gate networks.
- Score: 6.47243430672461
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classic algorithms and machine learning systems like neural networks are both
abundant in everyday life. While classic computer science algorithms are
suitable for precise execution of exactly defined tasks such as finding the
shortest path in a large graph, neural networks allow learning from data to
predict the most likely answer in more complex tasks such as image
classification, which cannot be reduced to an exact algorithm. To get the best
of both worlds, this thesis explores combining both concepts leading to more
robust, better performing, more interpretable, more computationally efficient,
and more data efficient architectures. The thesis formalizes the idea of
algorithmic supervision, which allows a neural network to learn from or in
conjunction with an algorithm. When integrating an algorithm into a neural
architecture, it is important that the algorithm is differentiable such that
the architecture can be trained end-to-end and gradients can be propagated back
through the algorithm in a meaningful way. To make algorithms differentiable,
this thesis proposes a general method for continuously relaxing algorithms by
perturbing variables and approximating the expectation value in closed form,
i.e., without sampling. In addition, this thesis proposes differentiable
algorithms, such as differentiable sorting networks, differentiable renderers,
and differentiable logic gate networks. Finally, this thesis presents
alternative training strategies for learning with algorithms.
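The relaxation idea in the abstract (perturb a variable, then take the expectation in closed form instead of sampling) can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the choice of logistic noise, the scale parameter `beta`, the odd-even transposition layout, and the hypothetical helper names `soft_less_than`, `soft_compare_swap`, and `soft_sort`. Under logistic noise, the expected value of the hard test `a + eps < b` is exactly a sigmoid, so a compare-and-swap step becomes a smooth interpolation.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function, written via tanh to avoid overflow.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def soft_less_than(a, b, beta=1.0):
    # Assumption: eps follows a logistic distribution with scale beta.
    # Then E[1(a + eps < b)] = P(eps < b - a) = sigmoid((b - a) / beta),
    # which is a closed form, so no sampling is needed.
    return sigmoid((b - a) / beta)

def soft_compare_swap(a, b, beta=1.0):
    # Relaxed version of (min(a, b), max(a, b)): interpolate between the
    # "keep order" and "swap" outcomes, weighted by the soft comparison.
    p = soft_less_than(a, b, beta)
    lo = p * a + (1.0 - p) * b
    hi = p * b + (1.0 - p) * a
    return lo, hi

def soft_sort(x, beta=1.0):
    # Illustrative odd-even transposition network: n layers of relaxed
    # compare-and-swap operations on neighbouring positions.
    x = np.asarray(x, dtype=float).copy()
    n = len(x)
    for layer in range(n):
        start = layer % 2  # alternate between even and odd neighbour pairs
        for i in range(start, n - 1, 2):
            x[i], x[i + 1] = soft_compare_swap(x[i], x[i + 1], beta)
    return x

if __name__ == "__main__":
    print(soft_sort([3.0, 1.0, 2.0], beta=1e-3))  # close to a hard sort
    print(soft_sort([3.0, 1.0, 2.0], beta=1.0))   # smoother, blended values
```

As `beta` shrinks, the sketch approaches a hard sort; larger values give smoother outputs and therefore non-zero gradients with respect to every input, which is the property that end-to-end training relies on.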
Related papers
- Discrete Neural Algorithmic Reasoning [18.497863598167257]
We propose to force neural reasoners to maintain the execution trajectory as a combination of finite predefined states.
Trained with supervision on the algorithm's state transitions, such models are able to perfectly align with the original algorithm.
arXiv Detail & Related papers (2024-02-18T16:03:04Z)
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, the Cascaded Forward (CaFo) algorithm, which, like FF, does not rely on BP optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
- Dual Algorithmic Reasoning [9.701208207491879]
We propose to learn algorithms by exploiting duality of the underlying algorithmic problem.
We demonstrate that simultaneously learning the dual definition of these optimisation problems in algorithmic learning allows for better learning.
We then validate the real-world utility of our dual algorithmic reasoner by deploying it on a challenging brain vessel classification task.
arXiv Detail & Related papers (2023-02-09T08:46:23Z)
- A Generalist Neural Algorithmic Learner [18.425083543441776]
We build a single graph neural network processor capable of learning to execute a wide range of algorithms.
We show that it is possible to effectively learn algorithms in a multi-task manner, so long as we can learn to execute them well in a single-task regime.
arXiv Detail & Related papers (2022-09-22T16:41:33Z)
- On the Convergence of Distributed Stochastic Bilevel Optimization Algorithms over a Network [55.56019538079826]
Bilevel optimization has been applied to a wide variety of machine learning models.
Most existing algorithms are restricted to the single-machine setting, so they cannot handle distributed data.
We develop novel decentralized bilevel optimization algorithms based on a gradient tracking communication mechanism and two different gradients.
arXiv Detail & Related papers (2022-06-30T05:29:52Z)
- The CLRS Algorithmic Reasoning Benchmark [28.789225199559834]
Learning representations of algorithms is an emerging area of machine learning, seeking to bridge concepts from neural networks with classical algorithms.
We propose the CLRS Algorithmic Reasoning Benchmark, covering classical algorithms from the Introduction to Algorithms textbook.
Our benchmark spans a variety of algorithmic reasoning procedures, including sorting, searching, dynamic programming, graph algorithms, string algorithms and geometric algorithms.
arXiv Detail & Related papers (2022-05-31T09:56:44Z)
- Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- Learning to Stop While Learning to Predict [85.7136203122784]
Many algorithm-inspired deep models are restricted to a "fixed-depth" for all inputs.
Similar to algorithms, the optimal depth of a deep architecture may be different for different input instances.
In this paper, we tackle this varying depth problem using a steerable architecture.
We show that the learned deep model along with the stopping policy improves the performances on a diverse set of tasks.
arXiv Detail & Related papers (2020-06-09T07:22:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.