Neuro-algorithmic Policies enable Fast Combinatorial Generalization
- URL: http://arxiv.org/abs/2102.07456v1
- Date: Mon, 15 Feb 2021 11:07:59 GMT
- Title: Neuro-algorithmic Policies enable Fast Combinatorial Generalization
- Authors: Marin Vlastelica, Michal Rolínek and Georg Martius
- Abstract summary: Recent results suggest that generalization for standard architectures improves only after obtaining exhaustive amounts of data.
We show that for a certain subclass of the MDP framework, this can be alleviated by neuro-algorithmic architectures.
We introduce a neuro-algorithmic policy architecture consisting of a neural network and an embedded time-dependent shortest path solver.
- Score: 16.74322664734553
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although model-based and model-free approaches to learning the control of
systems have achieved impressive results on standard benchmarks, generalization
to task variations is still lacking. Recent results suggest that generalization
for standard architectures improves only after obtaining exhaustive amounts of
data. We give evidence that generalization capabilities are in many cases
bottlenecked by the inability to generalize on the combinatorial aspects of the
problem. Furthermore, we show that for a certain subclass of the MDP framework,
this can be alleviated by neuro-algorithmic architectures.
Many control problems require long-term planning that is hard to solve
generically with neural networks alone. We introduce a neuro-algorithmic policy
architecture consisting of a neural network and an embedded time-dependent
shortest path solver. These policies can be trained end-to-end by blackbox
differentiation. We show that this type of architecture generalizes well to
unseen variations in the environment after seeing only a few examples.
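As a hedged illustration of the architecture, the sketch below pairs a toy time-dependent shortest-path solver (seam-carving-style dynamic programming over a T x N cost grid) with the finite-difference "blackbox differentiation" rule of Vlastelica et al. (ICLR 2020). The function names, the grid transition structure, and the interpolation parameter `lam` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def shortest_path(costs):
    """Toy time-dependent shortest-path solver: cheapest top-to-bottom
    path through a T x N cost grid, moving to column j-1, j, or j+1 at
    each time step. Returns a binary T x N indicator of the chosen path."""
    T, N = costs.shape
    dp = costs.copy()
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        for j in range(N):
            lo, hi = max(j - 1, 0), min(j + 2, N)
            k = lo + int(np.argmin(dp[t - 1, lo:hi]))
            back[t, j] = k
            dp[t, j] = costs[t, j] + dp[t - 1, k]
    path = np.zeros_like(costs)
    j = int(np.argmin(dp[-1]))
    for t in range(T - 1, -1, -1):
        path[t, j] = 1.0
        j = back[t, j]
    return path

def blackbox_grad(costs, grad_output, lam=10.0):
    """Blackbox-differentiation backward pass: the solver is piecewise
    constant in its costs, so an informative gradient is obtained by
    re-solving on costs perturbed in the direction of the incoming
    gradient and taking a finite difference."""
    y = shortest_path(costs)
    y_perturbed = shortest_path(costs + lam * grad_output)
    return -(y - y_perturbed) / lam
```

In the full architecture, a neural network would predict `costs` from raw observations, and `blackbox_grad` would propagate the imitation loss through the embedded solver back to the network's parameters.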
Related papers
- A Neural Rewriting System to Solve Algorithmic Problems [47.129504708849446]
We propose a modular architecture designed to learn a general procedure for solving nested mathematical formulas.
Inspired by rewriting systems, a classic framework in symbolic artificial intelligence, we include in the architecture three specialized and interacting modules.
We benchmark our system against the Neural Data Router, a recent model specialized for systematic generalization, and a state-of-the-art large language model (GPT-4) probed with advanced prompting strategies.
arXiv Detail & Related papers (2024-02-27T10:57:07Z) - Generalization and Estimation Error Bounds for Model-based Neural Networks [78.88759757988761]
We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks.
We derive practical design rules that allow constructing model-based networks with guaranteed high generalization.
arXiv Detail & Related papers (2023-04-19T16:39:44Z) - Multilevel-in-Layer Training for Deep Neural Network Regression [1.6185544531149159]
We present a multilevel regularization strategy that constructs and trains a hierarchy of neural networks.
We experimentally show with PDE regression problems that our multilevel training approach is an effective regularizer.
arXiv Detail & Related papers (2022-11-11T23:53:46Z) - Neural Networks and the Chomsky Hierarchy [27.470857324448136]
We study whether insights from the Chomsky hierarchy can predict the limits of neural network generalization in practice.
We show negative results where even extensive amounts of data and training time never led to any non-trivial generalization.
Our results show that, for our subset of tasks, RNNs and Transformers fail to generalize on non-regular tasks, and only networks augmented with structured memory can successfully generalize on context-free and context-sensitive tasks.
arXiv Detail & Related papers (2022-07-05T15:06:11Z) - Polynomial-Spline Neural Networks with Exact Integrals [0.0]
We develop a novel neural network architecture that combines a mixture-of-experts model with free knot B1-spline basis functions.
Our architecture exhibits both $h$- and $p$-refinement for regression problems at the convergence rates expected from approximation theory.
We demonstrate the success of our network on a range of regression and variational problems that illustrate the consistency and exact integrability of our network architecture.
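As a rough illustration of the spline side of this idea (fixed knots rather than the paper's free knots, and no mixture-of-experts gating; both are simplifying assumptions), linear B1 "hat" basis functions can be evaluated and fit by least squares:

```python
import numpy as np

def b1_basis(x, knots):
    """Evaluate linear (B1) hat basis functions at points x.
    Column k is the hat function centered at knots[k]; the columns
    form a partition of unity on [knots[0], knots[-1]]."""
    B = np.zeros((len(x), len(knots)))
    for k in range(len(knots)):
        indicator = np.zeros(len(knots))
        indicator[k] = 1.0
        # Piecewise-linear interpolation of a knot indicator is exactly
        # the hat function centered at that knot.
        B[:, k] = np.interp(x, knots, indicator)
    return B

def fit_b1_spline(x, y, knots):
    """Least-squares regression in the B1 basis; refining the knot
    vector gives h-refinement in the approximation-theory sense."""
    coef, *_ = np.linalg.lstsq(b1_basis(x, knots), y, rcond=None)
    return lambda xq: b1_basis(xq, knots) @ coef
```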
arXiv Detail & Related papers (2021-10-26T22:12:37Z) - Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z) - Wide Network Learning with Differential Privacy [7.453881927237143]
The current generation of neural networks suffers a significant loss of accuracy under most practically relevant privacy-preserving training regimes.
We develop a general approach to training these models that takes advantage of the sparsity of the gradients of private Empirical Risk Minimization (ERM).
For the same number of parameters, we propose a novel algorithm for privately training neural networks.
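The paper's sparsity-aware algorithm is not reproduced here; as a generic point of reference, the sketch below shows the standard DP-SGD step (per-example clipping plus calibrated Gaussian noise, Abadi et al. 2016) that such private training builds on. All parameter values are illustrative.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_multiplier=1.1, lr=0.1, rng=None):
    """One standard DP-SGD update: clip each per-example gradient to
    clip_norm, average, add Gaussian noise scaled to the clipping
    bound, and take a gradient step."""
    rng = rng or np.random.default_rng(0)
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    mean_grad = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(clipped),
                       size=mean_grad.shape)
    return params - lr * (mean_grad + noise)
```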
arXiv Detail & Related papers (2021-03-01T20:31:50Z) - Automated Search for Resource-Efficient Branched Multi-Task Networks [81.48051635183916]
We propose a principled approach, rooted in differentiable neural architecture search, to automatically define branching structures in a multi-task neural network.
We show that our approach consistently finds high-performing branching structures within limited resource budgets.
arXiv Detail & Related papers (2020-08-24T09:49:19Z) - Neural Complexity Measures [96.06344259626127]
We propose Neural Complexity (NC), a meta-learning framework for predicting generalization.
Our model learns a scalar complexity measure through interactions with many heterogeneous tasks in a data-driven way.
arXiv Detail & Related papers (2020-08-07T02:12:10Z) - Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.