Discrete Neural Algorithmic Reasoning
- URL: http://arxiv.org/abs/2402.11628v1
- Date: Sun, 18 Feb 2024 16:03:04 GMT
- Title: Discrete Neural Algorithmic Reasoning
- Authors: Gleb Rodionov, Liudmila Prokhorenkova
- Abstract summary: We propose to force neural reasoners to maintain the execution trajectory as a combination of finite predefined states.
We achieve perfect test scores on the SALSA-CLRS benchmark for all tasks.
- Score: 21.852775399735005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural algorithmic reasoning aims to capture computations with neural networks by learning models that imitate the execution of classical algorithms. While common architectures are expressive enough to contain the correct model in their weight space, current neural reasoners struggle to generalize well on out-of-distribution data. Classical computations, on the other hand, are not affected by distribution shifts, as they can be described as transitions between discrete computational states. In this work, we propose to force neural reasoners to maintain the execution trajectory as a combination of finite predefined states. Trained with supervision on the algorithm's state transitions, such models are able to perfectly align with the original algorithm. To show this, we evaluate our approach on the SALSA-CLRS benchmark, where we achieve perfect test scores for all tasks. Moreover, the proposed architectural choice allows us to prove the correctness of the learned algorithms for any test data.
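To make the mechanism concrete, here is a minimal toy sketch of the discretisation idea (my own construction, not the authors' code; `NUM_STATES`, `state_embeddings`, and `discretize` are hypothetical names): continuous node states are snapped onto a small set of predefined discrete states with a hard argmax, and a straight-through estimator keeps the model trainable end-to-end.

```python
import torch
import torch.nn.functional as F

NUM_STATES = 4   # assumed: e.g. {unvisited, in-frontier, visited, done}
HIDDEN = 32

state_embeddings = torch.nn.Parameter(torch.randn(NUM_STATES, HIDDEN))

def discretize(h: torch.Tensor) -> torch.Tensor:
    """Snap continuous node states onto the nearest predefined discrete state.

    Forward pass makes a hard one-hot choice; backward pass uses the soft
    distribution (straight-through estimator), so training stays end-to-end.
    """
    logits = h @ state_embeddings.t()            # similarity to each state
    soft = F.softmax(logits, dim=-1)
    hard = F.one_hot(soft.argmax(dim=-1), NUM_STATES).float()
    onehot = hard + soft - soft.detach()         # straight-through trick
    return onehot @ state_embeddings             # back to embedding space

# One reasoning step: (stubbed) message passing followed by discretisation,
# so the trajectory is always a combination of the finite predefined states.
h = torch.randn(10, HIDDEN)                      # 10 nodes
h = discretize(h)
```

Because every intermediate state then lies in a finite set, the learned transition function can be enumerated and checked exhaustively, which is what makes correctness guarantees over arbitrary test data possible.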
Related papers
- Deep Equilibrium Algorithmic Reasoning [18.651333116786084]
We study neurally solving algorithms from a different perspective.
Since the algorithm's solution is often an equilibrium, it is possible to find the solution directly by solving an equilibrium equation.
Our approach requires no information on the ground-truth number of steps of the algorithm, both during train and test time.
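A minimal sketch of the equilibrium view, under my own simplifying assumptions (a dense `step` function stands in for a GNN processor, and plain fixed-point iteration stands in for a proper equilibrium solver):

```python
import torch

def step(h: torch.Tensor, x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    # Stand-in for one learned processor step; `w` is a hypothetical weight.
    return torch.tanh(x + h @ w)

def solve_equilibrium(x, w, tol=1e-5, max_iter=100):
    """Solve h = step(h, x, w) by iterating until the state stops changing."""
    h = torch.zeros_like(x)
    for _ in range(max_iter):
        h_next = step(h, x, w)
        if (h_next - h).abs().max() < tol:   # equilibrium reached
            return h_next
        h = h_next
    return h

x = torch.randn(8, 16)
w = 0.1 * torch.randn(16, 16)   # small weights so the iteration contracts
h_star = solve_equilibrium(x, w)
# Note: no ground-truth number of algorithm steps is needed anywhere above.
```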
arXiv Detail & Related papers (2024-10-19T10:40:55Z) - Neural Algorithmic Reasoning Without Intermediate Supervision [21.852775399735005]
We focus on learning neural algorithmic reasoning only from the input-output pairs without appealing to the intermediate supervision.
We build a self-supervised objective that can regularise intermediate computations of the model without access to the algorithm trajectory.
We demonstrate that our approach is competitive to its trajectory-supervised counterpart on tasks from the CLRS Algorithmic Reasoning Benchmark.
arXiv Detail & Related papers (2023-06-23T09:57:44Z) - Neural Algorithmic Reasoning with Causal Regularisation [18.299363749150093]
We make an important observation: there are many different inputs for which an algorithm will perform certain intermediate computations identically.
This insight allows us to develop data augmentation procedures that, given an algorithm's intermediate trajectory, produce inputs for which the target algorithm would have exactly the same next trajectory step.
We prove that the resulting method, which we call Hint-ReLIC, improves the OOD generalisation capabilities of the reasoner.
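The observation is easy to make concrete. Below is a toy BFS example of my own (not from the paper): adding a node away from the current frontier changes the input graph but not the algorithm's next step, which is exactly the kind of augmentation pair the method exploits.

```python
def bfs_next_frontier(adj, visited, frontier):
    """One BFS step: return the set of nodes newly reached from the frontier."""
    new = set()
    for u in frontier:
        for v in adj.get(u, []):
            if v not in visited:
                new.add(v)
    return new

adj = {0: [1], 1: [0, 2], 2: [1]}
adj_aug = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # extra node 3, far from frontier
visited, frontier = {0}, {0}

# Both inputs produce exactly the same next trajectory step.
assert bfs_next_frontier(adj, visited, frontier) == \
       bfs_next_frontier(adj_aug, visited, frontier) == {1}
```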
arXiv Detail & Related papers (2023-02-20T19:41:15Z) - Learning with Differentiable Algorithms [6.47243430672461]
This thesis explores combining classic algorithms and machine learning systems like neural networks.
The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm.
In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable sorting gates, and differentiable logic gate networks.
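As a flavour of what a differentiable sorting network looks like, here is a minimal sketch assuming a logistic relaxation of compare-and-swap inside an odd-even transposition network (my own simplification; the thesis develops more refined relaxations):

```python
import torch

def soft_swap(a, b, tau=0.1):
    """Differentiable compare-and-swap: returns (soft min, soft max)."""
    p = torch.sigmoid((a - b) / tau)          # ~1 when a > b
    return p * b + (1 - p) * a, p * a + (1 - p) * b

def soft_sort(x, tau=0.1):
    x = list(x)
    n = len(x)
    for rnd in range(n):                      # odd-even transposition rounds
        for i in range(rnd % 2, n - 1, 2):
            x[i], x[i + 1] = soft_swap(x[i], x[i + 1], tau)
    return torch.stack(x)

x = torch.tensor([3.0, 1.0, 2.0], requires_grad=True)
y = soft_sort(x)                              # approximately [1., 2., 3.]
y.sum().backward()                            # gradients flow through the sort
```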
arXiv Detail & Related papers (2022-09-01T17:30:00Z) - On the Convergence of Distributed Stochastic Bilevel Optimization
Algorithms over a Network [55.56019538079826]
Bilevel optimization has been applied to a wide variety of machine learning models.
Most existing algorithms are restricted to the single-machine setting, so they are incapable of handling distributed data.
We develop novel decentralized bilevel optimization algorithms based on a gradient tracking communication mechanism and two different gradients.
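For intuition, the following toy of my own shows the gradient-tracking mechanism on a simple single-level decentralized problem (the paper's setting is bilevel and stochastic; `targets`, the mixing matrix `W`, and the quadratic objectives are all hypothetical):

```python
import numpy as np

def local_grad(i, x):
    # Hypothetical local objective f_i(x) = 0.5 * ||x - targets[i]||^2.
    return x - targets[i]

n, d, alpha = 4, 3, 0.2
rng = np.random.default_rng(0)
targets = rng.normal(size=(n, d))             # each node's private data
W = np.full((n, n), 1.0 / n)                  # doubly stochastic mixing matrix

x = np.zeros((n, d))                          # per-node iterates
y = np.stack([local_grad(i, x[i]) for i in range(n)])   # gradient trackers

for _ in range(100):
    x_new = W @ x - alpha * y                 # consensus step + descent
    y = W @ y + np.stack([local_grad(i, x_new[i]) - local_grad(i, x[i])
                          for i in range(n)])
    x = x_new

# Every node converges to the minimiser of the *average* objective.
assert np.allclose(x, targets.mean(axis=0), atol=1e-3)
```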
arXiv Detail & Related papers (2022-06-30T05:29:52Z) - Scalable computation of prediction intervals for neural networks via
matrix sketching [79.44177623781043]
Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure.
This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
arXiv Detail & Related papers (2022-05-06T13:18:31Z) - Benchmarking Simulation-Based Inference [5.3898004059026325]
Recent advances in probabilistic modelling have led to a large number of simulation-based inference algorithms which do not require numerical evaluation of likelihoods.
We provide a benchmark with inference tasks and suitable performance metrics, with an initial selection of algorithms.
We found that the choice of performance metric is critical, that even state-of-the-art algorithms have substantial room for improvement, and that sequential estimation improves sample efficiency.
arXiv Detail & Related papers (2021-01-12T18:31:22Z) - A Constraint-Based Algorithm for the Structural Learning of
Continuous-Time Bayesian Networks [70.88503833248159]
We propose the first constraint-based algorithm for learning the structure of continuous-time Bayesian networks.
We discuss the different statistical tests and the underlying hypotheses used by our proposal to establish conditional independence.
arXiv Detail & Related papers (2020-07-07T07:34:09Z) - FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity
to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
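A minimal sketch of the CTA pattern with standard FedAvg (this illustrates the baseline the summary mentions, not the paper's FedPD algorithm; the data and objective are my own toy choices):

```python
import numpy as np

def local_sgd(w, data, lr=0.1, steps=5):
    """Local computation: a few SGD steps on f(w) = 0.5 * ||w - mean(data)||^2."""
    for _ in range(steps):
        w = w - lr * (w - data.mean(axis=0))
    return w

rng = np.random.default_rng(0)
device_data = [rng.normal(loc=i, size=(20, 2)) for i in range(3)]  # non-IID
w = np.zeros(2)

for _ in range(10):                                   # communication rounds
    local_ws = [local_sgd(w.copy(), d) for d in device_data]  # computation
    w = np.mean(local_ws, axis=0)                     # then aggregation
```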
arXiv Detail & Related papers (2020-05-22T23:07:42Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires many fewer communication rounds than naive parallel approaches, both empirically and in theory.
Experiments on several datasets demonstrate the effectiveness of our method and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Parallelization Techniques for Verifying Neural Networks [52.917845265248744]
We introduce an algorithm that iteratively partitions the verification problem and explore two partitioning strategies.
We also introduce a highly parallelizable pre-processing algorithm that uses the neuron activation phases to simplify the neural network verification problems.
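A toy sketch of iterative partitioning, under heavily simplified assumptions (interval bound propagation on a hypothetical 2-2-1 ReLU network rather than a full verifier; the split heuristic is my own choice). Each sub-box is an independent verification problem, which is what makes the scheme parallelizable.

```python
import numpy as np

W1 = np.array([[1.0, -1.0], [0.5, 1.0]]); b1 = np.zeros(2)
w2 = np.array([1.0, 1.0])                 # hypothetical 2-2-1 ReLU network

def output_bounds(lo, hi):
    """Interval bound propagation through W1 -> ReLU -> w2."""
    pos, neg = np.maximum(W1, 0), np.minimum(W1, 0)
    l1 = np.maximum(pos @ lo + neg @ hi + b1, 0)
    u1 = np.maximum(pos @ hi + neg @ lo + b1, 0)
    pos2, neg2 = np.maximum(w2, 0), np.minimum(w2, 0)
    return pos2 @ l1 + neg2 @ u1, pos2 @ u1 + neg2 @ l1

def verify(lo, hi, threshold, depth=0):
    """Check output <= threshold on the box [lo, hi], splitting if undecided."""
    _, u = output_bounds(lo, hi)
    if u <= threshold:
        return True                       # property certified on this sub-box
    if depth == 8:
        return False                      # give up: possibly violated
    i = int(np.argmax(hi - lo))           # split the widest input dimension
    mid = (lo[i] + hi[i]) / 2
    lo2, hi2 = lo.copy(), hi.copy()
    hi2[i], lo2[i] = mid, mid
    return verify(lo, hi2, threshold, depth + 1) and \
           verify(lo2, hi, threshold, depth + 1)

print(verify(np.array([-1.0, -1.0]), np.array([1.0, 1.0]), threshold=3.0))  # True
```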
arXiv Detail & Related papers (2020-04-17T20:21:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.