Learning with Algorithmic Supervision via Continuous Relaxations
- URL: http://arxiv.org/abs/2110.05651v1
- Date: Mon, 11 Oct 2021 23:52:42 GMT
- Title: Learning with Algorithmic Supervision via Continuous Relaxations
- Authors: Felix Petersen, Christian Borgelt, Hilde Kuehne, Oliver Deussen
- Abstract summary: We propose an approach that allows the integration of algorithms into end-to-end trainable neural network architectures.
To obtain meaningful gradients, each relevant variable is perturbed via logistic distributions.
We evaluate the proposed continuous relaxation model on four challenging tasks and show that it can keep up with relaxations specifically designed for each individual task.
- Score: 19.437400671428737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The integration of algorithmic components into neural architectures has
gained increased attention recently, as it allows training neural networks with
new forms of supervision such as ordering constraints or silhouettes instead of
using ground truth labels. Many approaches in the field focus on the continuous
relaxation of a specific task and show promising results in this context. But
the focus on single tasks also limits the applicability of the proposed
concepts to a narrow range of applications. In this work, we build on those
ideas to propose an approach that allows the integration of algorithms into
end-to-end trainable neural network architectures based on a general
approximation of discrete conditions. To this end, we relax these conditions in
control structures such as conditional statements, loops, and indexing, so that
the resulting algorithms are smoothly differentiable. To obtain meaningful
gradients, each relevant variable is perturbed via logistic distributions and
the expectation value under this perturbation is approximated. We evaluate the
proposed continuous relaxation model on four challenging tasks and show that it
can keep up with relaxations specifically designed for each individual task.
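The core mechanism can be made concrete with a minimal sketch (not the authors' implementation; `beta` is an assumed temperature parameter controlling the sharpness of the relaxation): perturbing a comparison such as `a < b` with a logistic distribution turns the hard condition into the probability `sigmoid((b - a) / beta)` that it holds, and the two branches of the conditional are blended by this probability.

```python
import torch

def relaxed_if(a, b, then_val, else_val, beta=1.0):
    """Continuous relaxation of `then_val if a < b else else_val`.

    Perturbing the comparison with a logistic distribution makes the
    condition hold with probability sigmoid((b - a) / beta); the two
    branch values are blended by this probability, so the result is
    smoothly differentiable with respect to all inputs.
    """
    p = torch.sigmoid((b - a) / beta)            # probability that a < b holds
    return p * then_val + (1.0 - p) * else_val   # expected branch value

# Example: a relaxed "keep the smaller value" step, the building block of
# comparison-based algorithms such as sorting.
a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)
soft_min = relaxed_if(a, b, a, b)   # roughly 1.27 instead of the hard min 1.0
soft_min.backward()                 # gradients flow through the relaxed condition
print(soft_min.item(), a.grad.item(), b.grad.item())
```

In the paper, the same principle is extended to loops and indexing, and the expectation under the perturbation is propagated through entire algorithms; the sketch above shows only a single relaxed comparison.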
Related papers
- Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE [68.6018458996143]
We propose QuEE, a more general dynamic network that can combine both quantization and early exiting.
Our algorithm can be seen as a form of soft early exiting or input-dependent compression.
The crucial factor of our approach is accurate prediction of the potential accuracy improvement achievable through further computation.
arXiv Detail & Related papers (2024-06-20T15:25:13Z)
- Discrete Neural Algorithmic Reasoning [18.497863598167257]
We propose to force neural reasoners to maintain the execution trajectory as a combination of finite predefined states.
When trained with supervision on the algorithm's state transitions, such models are able to perfectly align with the original algorithm.
arXiv Detail & Related papers (2024-02-18T16:03:04Z)
- Continual Learning via Sequential Function-Space Variational Inference [65.96686740015902]
We propose an objective derived by formulating continual learning as sequential function-space variational inference.
Compared to objectives that directly regularize neural network predictions, the proposed objective allows for more flexible variational distributions.
We demonstrate that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods.
arXiv Detail & Related papers (2023-12-28T18:44:32Z)
- Robust Stochastically-Descending Unrolled Networks [85.6993263983062]
Deep unrolling is an emerging learning-to-optimize method that unrolls a truncated iterative algorithm in the layers of a trainable neural network (a generic sketch of this idea appears after this list).
The convergence guarantees and generalizability of the unrolled networks, however, are still open theoretical problems.
We numerically assess unrolled architectures trained under the proposed constraints in two different applications.
arXiv Detail & Related papers (2023-12-25T18:51:23Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- Semantic Strengthening of Neuro-Symbolic Learning [85.6195120593625]
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective.
We show how to compute this efficiently for tractable circuits.
We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
arXiv Detail & Related papers (2023-02-28T00:04:22Z)
- Batch Active Learning from the Perspective of Sparse Approximation [12.51958241746014]
Active learning enables efficient model training by leveraging interactions between machine learning agents and human annotators.
We study and propose a novel framework that formulates batch active learning from the perspective of sparse approximation.
Our active learning method aims to find an informative subset from the unlabeled data pool such that the corresponding training loss function approximates its full data pool counterpart.
arXiv Detail & Related papers (2022-11-01T03:20:28Z)
- Learning for Integer-Constrained Optimization through Neural Networks with Limited Training [28.588195947764188]
We introduce a symmetric and decomposed neural network structure, which is fully interpretable in terms of the functionality of its constituent components.
By taking advantage of the underlying pattern of the integer constraint, the introduced neural network offers superior generalization performance with limited training.
We show that the introduced decomposed approach can be further extended to semi-decomposed frameworks.
arXiv Detail & Related papers (2020-11-10T21:17:07Z)
- Differentiable Causal Discovery from Interventional Data [141.41931444927184]
We propose a theoretically-grounded method based on neural networks that can leverage interventional data.
We show that our approach compares favorably to the state of the art in a variety of settings.
arXiv Detail & Related papers (2020-07-03T15:19:17Z)
- A Shooting Formulation of Deep Learning [19.51427735087011]
We introduce a shooting formulation which shifts the perspective from parameterizing a network layer-by-layer to parameterizing over optimal networks.
For scalability, we propose a novel particle-ensemble parametrization which fully specifies the optimal weight trajectory of the continuous-depth neural network.
arXiv Detail & Related papers (2020-06-18T07:36:04Z)
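For the deep-unrolling entries above, the following generic sketch illustrates the basic idea (it is not the method of either cited paper; `UnrolledGD`, its layer count, and the least-squares objective are illustrative assumptions): a truncated iterative solver, here plain gradient descent on 0.5 * ||A x - y||^2, is unrolled into a fixed number of layers, each with its own learnable step size, so the solver can be trained end to end.

```python
import torch
import torch.nn as nn

class UnrolledGD(nn.Module):
    """Unrolls gradient-descent steps on 0.5 * ||A x - y||^2 into network layers.

    Each layer applies one descent step  x <- x - step_k * A^T (A x - y)
    with its own learnable step size, turning the truncated solver into an
    end-to-end trainable network (a generic sketch of deep unrolling).
    """
    def __init__(self, num_layers: int = 10):
        super().__init__()
        self.steps = nn.Parameter(torch.full((num_layers,), 0.1))

    def forward(self, A: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        x = torch.zeros(A.shape[1], dtype=A.dtype)
        for step in self.steps:
            x = x - step * A.t() @ (A @ x - y)   # one unrolled iteration
        return x

# Usage: train the step sizes so the unrolled solver minimizes a downstream loss.
A, y = torch.randn(8, 4), torch.randn(8)
model = UnrolledGD(num_layers=10)
x_hat = model(A, y)
loss = 0.5 * ((A @ x_hat - y) ** 2).sum()
loss.backward()   # gradients reach every learnable step size
```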
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.