Discrete Neural Algorithmic Reasoning
- URL: http://arxiv.org/abs/2402.11628v2
- Date: Mon, 16 Sep 2024 16:22:40 GMT
- Title: Discrete Neural Algorithmic Reasoning
- Authors: Gleb Rodionov, Liudmila Prokhorenkova
- Abstract summary: We propose to force neural reasoners to maintain the execution trajectory as a combination of finite predefined states.
Trained with supervision on the algorithm's state transitions, such models are able to perfectly align with the original algorithm.
- Score: 18.497863598167257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural algorithmic reasoning aims to capture computations with neural networks via learning the models to imitate the execution of classic algorithms. While common architectures are expressive enough to contain the correct model in the weights space, current neural reasoners are struggling to generalize well on out-of-distribution data. On the other hand, classic computations are not affected by distributional shifts as they can be described as transitions between discrete computational states. In this work, we propose to force neural reasoners to maintain the execution trajectory as a combination of finite predefined states. To achieve that, we separate discrete and continuous data flows and describe the interaction between them. Trained with supervision on the algorithm's state transitions, such models are able to perfectly align with the original algorithm. To show this, we evaluate our approach on multiple algorithmic problems and get perfect test scores both in single-task and multitask setups. Moreover, the proposed architectural choice allows us to prove the correctness of the learned algorithms for any test data.
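The idea of keeping an execution trajectory inside a finite set of predefined states can be illustrated with a small sketch. This is not the authors' architecture: the state names, the `discretize` projection, and the hand-written scores standing in for a learned processor are all assumptions made for illustration. It only shows why snapping continuous scores back to a discrete state at every step keeps the trajectory identical to that of the underlying algorithm (here, breadth-first reachability), whatever the graph size.

```python
# Hypothetical illustration only; none of these names come from the paper.
STATES = ("unvisited", "visited")


def discretize(logits):
    """Project continuous scores onto one of the finite predefined states."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return STATES[best]


def step(states, edges):
    """One transition: a node becomes visited if an in-neighbour is visited."""
    new_states = []
    for v, s in enumerate(states):
        has_visited_neighbour = any(
            states[u] == "visited" for u, w in edges if w == v
        )
        # Stand-in for a learned processor: it emits continuous scores.
        # Small perturbations of these scores do not change the argmax.
        score_visited = 1.0 if (s == "visited" or has_visited_neighbour) else 0.0
        logits = [1.0 - score_visited + 0.01, score_visited]
        new_states.append(discretize(logits))
    return new_states


def run_bfs(n, edges, source):
    """Iterate discrete transitions until the node states stop changing."""
    states = ["visited" if v == source else "unvisited" for v in range(n)]
    trajectory = [states]
    while True:
        nxt = step(states, edges)
        if nxt == states:
            return trajectory
        trajectory.append(nxt)
        states = nxt
```

For example, `run_bfs(4, [(0, 1), (1, 2)], 0)` walks the directed path 0 -> 1 -> 2 and leaves the isolated node 3 unvisited. Because every intermediate state is one of the two predefined values, each step can be checked against the reference algorithm exactly.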
Related papers
- Deep Equilibrium Algorithmic Reasoning [18.651333116786084]
We study neurally solving algorithms from a different perspective.
Since the algorithm's solution is often an equilibrium, it is possible to find the solution directly by solving an equilibrium equation.
Our approach requires no information on the ground-truth number of steps of the algorithm, at either training or test time.
arXiv Detail & Related papers (2024-10-19T10:40:55Z)
- Neural Algorithmic Reasoning Without Intermediate Supervision [21.852775399735005]
We focus on learning neural algorithmic reasoning only from the input-output pairs without appealing to the intermediate supervision.
We build a self-supervised objective that can regularise intermediate computations of the model without access to the algorithm trajectory.
We demonstrate that our approach is competitive with its trajectory-supervised counterpart on tasks from the CLRS Algorithmic Reasoning Benchmark.
arXiv Detail & Related papers (2023-06-23T09:57:44Z)
- Neural Algorithmic Reasoning with Causal Regularisation [18.299363749150093]
We make an important observation: there are many different inputs for which an algorithm will perform certain intermediate computations identically.
This insight allows us to develop data augmentation procedures that, given an algorithm's intermediate trajectory, produce inputs for which the target algorithm would have exactly the same next trajectory step.
We prove that the resulting method, which we call Hint-ReLIC, improves the OOD generalisation capabilities of the reasoner.
arXiv Detail & Related papers (2023-02-20T19:41:15Z)
- Learning with Differentiable Algorithms [6.47243430672461]
This thesis explores combining classic algorithms and machine learning systems like neural networks.
The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm.
In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable sorting gates, and differentiable logic gate networks.
arXiv Detail & Related papers (2022-09-01T17:30:00Z)
- On the Convergence of Distributed Stochastic Bilevel Optimization Algorithms over a Network [55.56019538079826]
Bilevel optimization has been applied to a wide variety of machine learning models.
Most existing algorithms are restricted to the single-machine setting and therefore cannot handle distributed data.
We develop novel decentralized bilevel optimization algorithms based on a gradient tracking communication mechanism and two different gradients.
arXiv Detail & Related papers (2022-06-30T05:29:52Z)
- Scalable computation of prediction intervals for neural networks via matrix sketching [79.44177623781043]
Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure.
This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
arXiv Detail & Related papers (2022-05-06T13:18:31Z)
- Benchmarking Simulation-Based Inference [5.3898004059026325]
Recent advances in probabilistic modelling have led to a large number of simulation-based inference algorithms which do not require numerical evaluation of likelihoods.
We provide a benchmark with inference tasks and suitable performance metrics, with an initial selection of algorithms.
We found that the choice of performance metric is critical, that even state-of-the-art algorithms have substantial room for improvement, and that sequential estimation improves sample efficiency.
arXiv Detail & Related papers (2021-01-12T18:31:22Z)
- A Constraint-Based Algorithm for the Structural Learning of Continuous-Time Bayesian Networks [70.88503833248159]
We propose the first constraint-based algorithm for learning the structure of continuous-time Bayesian networks.
We discuss the different statistical tests and the underlying hypotheses used by our proposal to establish conditional independence.
arXiv Detail & Related papers (2020-07-07T07:34:09Z)
- FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires far fewer communication rounds in theory.
Our experiments on several datasets show the effectiveness of our method and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
- Parallelization Techniques for Verifying Neural Networks [52.917845265248744]
We introduce an algorithm based on partitioning the verification problem in an iterative manner and explore two partitioning strategies.
We also introduce a highly parallelizable pre-processing algorithm that uses the neuron activation phases to simplify the neural network verification problems.
arXiv Detail & Related papers (2020-04-17T20:21:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.