Generalization of Neural Combinatorial Solvers Through the Lens of
Adversarial Robustness
- URL: http://arxiv.org/abs/2110.10942v1
- Date: Thu, 21 Oct 2021 07:28:11 GMT
- Title: Generalization of Neural Combinatorial Solvers Through the Lens of
Adversarial Robustness
- Authors: Simon Geisler, Johanna Sommer, Jan Schuchardt, Aleksandar Bojchevski,
Stephan Günnemann
- Abstract summary: Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
- Score: 68.97830259849086
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: End-to-end (geometric) deep learning has seen first successes in
approximating the solution of combinatorial optimization problems. However,
generating data in the realm of NP-hard/-complete tasks brings practical and
theoretical challenges, resulting in evaluation protocols that are too
optimistic. Specifically, most datasets only capture a simpler subproblem and
likely suffer from spurious features. We investigate these effects by studying
adversarial robustness - a local generalization property - to reveal hard,
model-specific instances and spurious features. For this purpose, we derive
perturbation models for SAT and TSP. Unlike in other applications, where
perturbation models are designed around subjective notions of imperceptibility,
our perturbation models are efficient and sound, allowing us to determine the
true label of perturbed samples without a solver. Surprisingly, with such
perturbations, a sufficiently expressive neural solver does not suffer from the
limitations of the accuracy-robustness trade-off common in supervised learning.
Although such robust solvers exist, we show empirically that the assessed
neural solvers do not generalize well w.r.t. small perturbations of the problem
instance.
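The soundness property claimed above can be illustrated for SAT: starting from a formula with a known satisfying assignment, adding only clauses that contain at least one literal true under that assignment perturbs the instance while provably preserving satisfiability, so the label of the perturbed sample is known without invoking a solver. A minimal sketch of this idea, assuming a DIMACS-style clause encoding (this is an illustration of the soundness principle, not the authors' implementation):

```python
import random

def sat_preserving_perturbation(clauses, assignment, n_new_clauses=2,
                                clause_len=3, seed=0):
    """Perturb a CNF formula (list of clauses; a clause is a list of signed
    ints, DIMACS-style) while provably preserving satisfiability.

    Every added clause contains at least one literal that is true under the
    known satisfying `assignment` (dict var -> bool), so `assignment` still
    satisfies the perturbed formula: its label is known without a solver.
    """
    rng = random.Random(seed)
    variables = sorted(assignment)
    perturbed = [list(c) for c in clauses]
    for _ in range(n_new_clauses):
        vs = rng.sample(variables, clause_len)
        # Random polarity for all but one position ...
        clause = [v if rng.random() < 0.5 else -v for v in vs[:-1]]
        # ... and one "anchor" literal forced to be true under the assignment.
        anchor = vs[-1]
        clause.append(anchor if assignment[anchor] else -anchor)
        rng.shuffle(clause)
        perturbed.append(clause)
    return perturbed

def satisfies(clauses, assignment):
    """Check that every clause contains at least one true literal."""
    return all(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in clauses
    )
```

By construction, `satisfies(sat_preserving_perturbation(f, a), a)` holds whenever `a` satisfies `f`, which is exactly the "determine the true label of perturbed samples without a solver" property from the abstract.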
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z)
- Feature Attenuation of Defective Representation Can Resolve Incomplete Masking on Anomaly Detection [1.0358639819750703]
In unsupervised anomaly detection (UAD) research, it is necessary to develop a computationally efficient and scalable solution.
We revisit the reconstruction-by-inpainting approach and rethink how to improve it by analyzing its strengths and weaknesses.
We propose Feature Attenuation of Defective Representation (FADeR), which employs only two layers to attenuate the feature information of anomaly reconstruction.
arXiv Detail & Related papers (2024-07-05T15:44:53Z)
- Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks [89.28881869440433]
This paper provides the first theoretical characterization of joint edge-model sparse learning for graph neural networks (GNNs).
It proves analytically that both sampling important nodes and pruning the lowest-magnitude neurons can reduce the sample complexity and improve convergence without compromising the test accuracy.
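The neuron-pruning half of this claim can be sketched generically: rank the neurons of a dense layer by the magnitude of their incoming weights and keep only the strongest fraction. This is a standard magnitude-pruning sketch, not the paper's joint edge-model procedure (the node-sampling half is omitted):

```python
import numpy as np

def prune_lowest_magnitude_neurons(W, b, keep_ratio=0.5):
    """Magnitude-based neuron pruning for one dense layer y = W @ x + b.

    Neurons (rows of W) are ranked by the L2 norm of their incoming
    weights; only the strongest `keep_ratio` fraction is kept.
    """
    n_keep = max(1, int(round(keep_ratio * W.shape[0])))
    norms = np.linalg.norm(W, axis=1)
    keep = np.sort(np.argsort(norms)[-n_keep:])  # indices of kept neurons
    return W[keep], b[keep], keep
```

Downstream layers must of course drop the corresponding input columns; the returned `keep` indices make that bookkeeping explicit.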
arXiv Detail & Related papers (2023-02-06T16:54:20Z)
- Identifiability-Guaranteed Simplex-Structured Post-Nonlinear Mixture Learning via Autoencoder [9.769870656657522]
This work focuses on the problem of unraveling nonlinearly mixed latent components in an unsupervised manner.
The latent components are assumed to reside in the probability simplex, and are transformed by an unknown post-nonlinear mixing system.
This problem finds various applications in signal and data analytics, e.g., nonlinear hyperspectral unmixing, image embedding, and nonlinear clustering.
arXiv Detail & Related papers (2021-06-16T18:20:58Z)
- Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
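For a linear classifier the joint-perturbation viewpoint has a simple closed-form illustration: push both the input and the weights against the decision direction within their norm budgets and evaluate the resulting margin. This is an illustrative instance of the joint-perturbation idea, not the paper's formalization:

```python
import numpy as np

def joint_worst_case_margin(w, x, y, eps_x=0.1, eps_w=0.1):
    """Margin y * (w + dw) @ (x + dx) of a linear classifier under joint
    l2-bounded perturbations of the input (||dx|| <= eps_x) and of the
    weights (||dw|| <= eps_w).

    Alternating step: first attack the input against w, then attack the
    weights against the perturbed input. Both perturbations are feasible,
    so the result upper-bounds the true worst-case (minimum) margin.
    """
    x_adv = x - y * eps_x * w / (np.linalg.norm(w) + 1e-12)
    w_adv = w - y * eps_w * x_adv / (np.linalg.norm(x_adv) + 1e-12)
    return y * w_adv @ x_adv
```

With both budgets set to zero this reduces to the clean margin; with nonzero budgets it shows how input and weight perturbations compound, which is the sensitivity the non-singular notion is designed to capture.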
arXiv Detail & Related papers (2021-02-23T20:59:30Z)
- Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach that learns to generate new samples so as to maximize the classifier's exposure to the attribute space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
- Active Importance Sampling for Variational Objectives Dominated by Rare Events: Consequences for Optimization and Generalization [12.617078020344618]
We introduce an approach that combines rare-event sampling techniques with neural network optimization to optimize objective functions dominated by rare events.
We show that importance sampling reduces the variance of the solution to a learning problem, suggesting benefits for generalization.
Our numerical experiments demonstrate that we can successfully learn even with the compounding difficulties of high-dimensional and rare data.
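The variance-reduction claim above is the classical importance-sampling effect, which a toy rare-event estimate makes concrete: shifting the sampling distribution onto the rare region and reweighting by the likelihood ratio yields a far lower-variance estimator than naive Monte Carlo. A self-contained sketch (an illustration of the general principle, not this paper's method):

```python
import numpy as np

def rare_event_probability(threshold=4.0, n=100_000, seed=0):
    """Estimate p = P(Z > threshold) for Z ~ N(0, 1) two ways:
    naive Monte Carlo vs. importance sampling from the shifted proposal
    q = N(threshold, 1), which concentrates samples on the rare region.

    The likelihood ratio is phi(x) / phi(x - t) = exp(t^2/2 - t*x).
    """
    rng = np.random.default_rng(seed)
    t = threshold
    # Naive: almost every sample misses the event, so the estimate is noisy.
    z = rng.standard_normal(n)
    naive = np.mean(z > t)
    # Importance sampling: sample from N(t, 1) and reweight.
    x = rng.standard_normal(n) + t
    weights = np.exp(t * t / 2.0 - t * x)
    is_est = np.mean((x > t) * weights)
    return naive, is_est
```

For `threshold=4` the true probability is about 3.17e-5: the naive estimator sees only a handful of hits in 100,000 draws, while the reweighted estimator has a relative standard error below one percent with the same sample budget.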
arXiv Detail & Related papers (2020-08-11T23:38:09Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.