Generalization of Neural Combinatorial Solvers Through the Lens of
Adversarial Robustness
- URL: http://arxiv.org/abs/2110.10942v1
- Date: Thu, 21 Oct 2021 07:28:11 GMT
- Title: Generalization of Neural Combinatorial Solvers Through the Lens of
Adversarial Robustness
- Authors: Simon Geisler, Johanna Sommer, Jan Schuchardt, Aleksandar Bojchevski,
Stephan Günnemann
- Abstract summary: Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
- Score: 68.97830259849086
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: End-to-end (geometric) deep learning has seen first successes in
approximating the solution of combinatorial optimization problems. However,
generating data in the realm of NP-hard/-complete tasks brings practical and
theoretical challenges, resulting in evaluation protocols that are too
optimistic. Specifically, most datasets only capture a simpler subproblem and
likely suffer from spurious features. We investigate these effects by studying
adversarial robustness - a local generalization property - to reveal hard,
model-specific instances and spurious features. For this purpose, we derive
perturbation models for SAT and TSP. Unlike in other applications, where
perturbation models are designed around subjective notions of imperceptibility,
our perturbation models are efficient and sound, allowing us to determine the
true label of perturbed samples without a solver. Surprisingly, with such
perturbations, a sufficiently expressive neural solver does not suffer from the
limitations of the accuracy-robustness trade-off common in supervised learning.
Although such robust solvers exist, we show empirically that the assessed
neural solvers do not generalize well w.r.t. small perturbations of the problem
instance.
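The soundness claim above admits a simple illustration: certain edits to a CNF formula provably preserve its satisfiability label, so the true label of the perturbed instance is known without a solver call. The sketch below is only a generic example of such label-preserving edits, not the paper's actual perturbation model; all function names are illustrative.

```python
import itertools
import random

def brute_force_sat(clauses, n_vars):
    """Exhaustively check satisfiability of a small CNF given as
    DIMACS-style integer literals (positive = variable, negative = negation)."""
    for bits in itertools.product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def perturb(clauses, n_vars, label_sat, rng):
    """Apply one label-preserving perturbation:
    - SAT instances: weaken a random clause by adding a literal
      (any satisfying assignment still satisfies the weakened clause);
    - UNSAT instances: conjoin a new clause
      (extra constraints cannot make an unsatisfiable formula satisfiable)."""
    clauses = [list(c) for c in clauses]
    v = rng.randrange(1, n_vars + 1)
    lit = v if rng.random() < 0.5 else -v
    if label_sat:
        c = rng.choice(clauses)
        if lit not in c and -lit not in c:
            c.append(lit)
    else:
        clauses.append([lit])
    return clauses
```

Because the label is preserved by construction, robustness evaluation reduces to checking whether the neural solver's prediction changes under such perturbations.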
Related papers
- Feature Attenuation of Defective Representation Can Resolve Incomplete Masking on Anomaly Detection [1.0358639819750703]
In unsupervised anomaly detection (UAD) research, it is necessary to develop a computationally efficient and scalable solution.
We revisit the reconstruction-by-inpainting approach and rethink how to improve it by analyzing its strengths and weaknesses.
We propose Feature Attenuation of Defective Representation (FADeR), which employs only two layers to attenuate the feature information of anomalous reconstructions.
arXiv Detail & Related papers (2024-07-05T15:44:53Z)
- On the ISS Property of the Gradient Flow for Single Hidden-Layer Neural Networks with Linear Activations [0.0]
We investigate the effects of overfitting on the robustness of gradient-descent training when subject to uncertainty on the gradient estimation.
We show that the general overparametrized formulation introduces a set of spurious equilibria which lie outside the set where the loss function is minimized.
arXiv Detail & Related papers (2023-05-17T02:26:34Z)
- Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks [89.28881869440433]
This paper provides the first theoretical characterization of joint edge-model sparse learning for graph neural networks (GNNs).
It proves analytically that both sampling important nodes and pruning the lowest-magnitude neurons can reduce the sample complexity and improve convergence without compromising test accuracy.
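The magnitude-pruning half of that recipe can be sketched in a toy form: score each neuron (row of a weight matrix) by its L2 norm and zero out all but the top fraction. This is only an illustrative sketch of magnitude pruning in general, not the paper's implementation; the function name and keep ratio are made up.

```python
import numpy as np

def prune_neurons(W, keep_ratio=0.5):
    """Keep the highest-magnitude neurons (rows of W, scored by L2 norm)
    and zero out the rest. Returns the pruned matrix and the keep mask."""
    norms = np.linalg.norm(W, axis=1)              # one score per neuron
    k = max(1, int(round(keep_ratio * W.shape[0])))
    keep = np.argsort(norms)[-k:]                  # indices of the top-k neurons
    mask = np.zeros(W.shape[0], dtype=bool)
    mask[keep] = True
    return W * mask[:, None], mask
```

In the joint scheme the paper studies, this model-side pruning would be combined with sampling important nodes on the graph side.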
arXiv Detail & Related papers (2023-02-06T16:54:20Z)
- Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning [99.34907092347733]
We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions.
Using a local minimax framework, we show that this functional arises in lower bounds on the accuracy of any estimation procedure.
In the other direction, we establish the sharpness of our lower bounds, up to factors logarithmic in the state and action spaces, by analyzing a variance-reduced version of $Q$-learning.
arXiv Detail & Related papers (2021-06-28T00:38:54Z)
- Identifiability-Guaranteed Simplex-Structured Post-Nonlinear Mixture Learning via Autoencoder [9.769870656657522]
This work focuses on the problem of unraveling nonlinearly mixed latent components in an unsupervised manner.
The latent components are assumed to reside in the probability simplex, and are transformed by an unknown post-nonlinear mixing system.
This problem finds various applications in signal and data analytics, e.g., nonlinear hyperspectral unmixing, image embedding, and nonlinear clustering.
arXiv Detail & Related papers (2021-06-16T18:20:58Z)
- Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
arXiv Detail & Related papers (2021-02-23T20:59:30Z)
- Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach that learns to generate new samples so as to maximize the classifier's exposure to the attribute space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
- Active Importance Sampling for Variational Objectives Dominated by Rare Events: Consequences for Optimization and Generalization [12.617078020344618]
We introduce an approach that combines rare-event sampling techniques with neural network optimization to optimize objective functions dominated by rare events.
We show that importance sampling reduces the variance of the solution to a learning problem, suggesting benefits for generalization.
Our numerical experiments demonstrate that we can successfully learn even with the compounding difficulties of high-dimensional and rare data.
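The variance-reduction effect of importance sampling can be seen in a classic toy case: estimating a rare Gaussian tail probability, where a proposal shifted into the rare region succeeds while naive Monte Carlo almost never observes the event. This is a generic textbook sketch, not the paper's method; the threshold and sample counts are arbitrary.

```python
import math
import random

def rare_event_estimates(threshold=4.0, n=10_000, seed=0):
    """Estimate p = P(X > threshold) for X ~ N(0, 1) two ways:
    naive Monte Carlo, and importance sampling from N(threshold, 1)."""
    rng = random.Random(seed)
    # Naive Monte Carlo: with p ~ 3e-5, most runs of 10k samples see no hit.
    naive = sum(rng.gauss(0, 1) > threshold for _ in range(n)) / n
    # Importance sampling: draw y ~ N(threshold, 1) and reweight each hit
    # by the density ratio phi(y) / phi(y - threshold).
    total = 0.0
    for _ in range(n):
        y = rng.gauss(threshold, 1)
        if y > threshold:
            total += math.exp(-0.5 * y * y + 0.5 * (y - threshold) ** 2)
    return naive, total / n
```

Roughly half of the proposal's samples land in the rare region, so the reweighted estimator concentrates tightly around the true tail probability (about 3.17e-5 for a threshold of 4), whereas the naive estimate is dominated by whether any hit occurred at all.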
arXiv Detail & Related papers (2020-08-11T23:38:09Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.