Gaussian Loss Smoothing Enables Certified Training with Tight Convex Relaxations
- URL: http://arxiv.org/abs/2403.07095v3
- Date: Thu, 10 Oct 2024 10:01:24 GMT
- Title: Gaussian Loss Smoothing Enables Certified Training with Tight Convex Relaxations
- Authors: Stefan Balauca, Mark Niklas Müller, Yuhao Mao, Maximilian Baader, Marc Fischer, Martin Vechev
- Abstract summary: Training neural networks with high certified accuracy against adversarial examples remains an open challenge.
While certification methods can effectively leverage tight convex relaxations for bound computation, in training these methods can perform worse than looser relaxations.
We show that Gaussian Loss Smoothing can alleviate these issues.
- Score: 14.061189994638667
- License:
- Abstract: Training neural networks with high certified accuracy against adversarial examples remains an open challenge despite significant efforts. While certification methods can effectively leverage tight convex relaxations for bound computation, in training, these methods, perhaps surprisingly, can perform worse than looser relaxations. Prior work hypothesized that this phenomenon is caused by the discontinuity, non-smoothness, and perturbation sensitivity of the loss surface induced by tighter relaxations. In this work, we theoretically show that Gaussian Loss Smoothing (GLS) can alleviate these issues. We confirm this empirically by instantiating GLS with two variants: a zeroth-order optimization algorithm, called PGPE, which allows training with non-differentiable relaxations, and a first-order optimization algorithm, called RGS, which requires gradients of the relaxation but is much more efficient than PGPE. Extensive experiments show that when combined with tight relaxations, these methods surpass state-of-the-art methods when training on the same network architecture for many settings. Our results clearly demonstrate the promise of Gaussian Loss Smoothing for training certifiably robust neural networks and pave a path towards leveraging tighter relaxations for certified training.
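To make the idea concrete, here is a minimal numpy sketch of Gaussian Loss Smoothing: optimize the expectation of the loss under Gaussian parameter noise, which is smooth even when the underlying certified loss is not, and estimate its gradient with a PGPE-style symmetric-sampling zeroth-order estimator. The function names, hyperparameters, and sampling details are illustrative assumptions, not the paper's implementation (the actual PGPE also adapts per-parameter noise scales):

```python
import numpy as np

def gls_gradient(loss_fn, theta, sigma=0.1, n_samples=32, rng=None):
    """Zeroth-order estimate of the gradient of the Gaussian-smoothed loss
    E_{eps ~ N(0, sigma^2 I)}[loss_fn(theta + eps)], using antithetic
    (mirrored) sampling in the spirit of PGPE."""
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        eps = rng.normal(0.0, sigma, size=theta.shape)
        # Finite difference along the sampled direction; loss_fn itself may
        # be non-differentiable (e.g. a tight-relaxation certified loss).
        grad += eps * (loss_fn(theta + eps) - loss_fn(theta - eps))
    return grad / (2.0 * sigma**2 * n_samples)
```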
Related papers
- The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information [35.34142909458158]
We show that curvature information can be leveraged, in OBS-like fashion, within the projection step of classic iterative sparse recovery algorithms such as IHT.
We present extensions of this approach to the practical task of obtaining accurate sparse models, and validate it experimentally at scale for Transformer-based models on vision and language tasks.
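A minimal sketch of what a curvature-aware projection could look like under a diagonal-Hessian assumption; the saliency rule and all names are illustrative, not the paper's full algorithm:

```python
import numpy as np

def curvature_aware_iht_step(theta, grad, hess_diag, lr=0.1, k=50):
    """One IHT-style step where the hard-thresholding projection keeps the
    k weights with the largest OBS-like saliency w_i^2 * H_ii, instead of
    the largest magnitude."""
    theta = theta - lr * grad           # plain gradient step
    saliency = theta**2 * hess_diag     # second-order importance score
    keep = np.argsort(saliency)[-k:]    # indices of weights to retain
    sparse = np.zeros_like(theta)
    sparse[keep] = theta[keep]
    return sparse
```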
arXiv Detail & Related papers (2024-08-30T10:06:26Z)
- Robust Stochastically-Descending Unrolled Networks [85.6993263983062]
Deep unrolling is an emerging learning-to-optimize method that unrolls a truncated iterative algorithm in the layers of a trainable neural network.
Convergence guarantees and generalizability of unrolled networks remain open theoretical problems, which we address by imposing descending constraints during training.
We numerically assess unrolled architectures trained under the proposed constraints in two different applications.
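For readers unfamiliar with unrolling, a minimal sketch: each layer of the network executes one iteration of a truncated first-order solver, here plain gradient descent on a least-squares objective, with per-layer step sizes standing in for learnable parameters (illustrative only, not the paper's architecture):

```python
import numpy as np

def unrolled_gd(A, y, step_sizes, x0=None):
    """Forward pass of a deep-unrolled network: each 'layer' is one
    iteration of gradient descent on ||A x - y||^2 with its own step
    size. Training would fit step_sizes (and any other per-layer
    parameters) end to end."""
    x = np.zeros(A.shape[1]) if x0 is None else x0
    for alpha in step_sizes:          # one layer per truncated iteration
        x = x - alpha * A.T @ (A @ x - y)
    return x
```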
arXiv Detail & Related papers (2023-12-25T18:51:23Z)
- Resilient Constrained Learning [94.27081585149836]
This paper presents a constrained learning approach that adapts the requirements while simultaneously solving the learning task.
We call this approach resilient constrained learning after the term used to describe ecological systems that adapt to disruptions by modifying their operation.
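One way to picture the mechanism, as a hedged sketch: relax each constraint level in proportion to its multiplier (how hard it is to satisfy), paying a relaxation cost, and update the multiplier by dual ascent. The quadratic-cost assumption and all names are illustrative, not the paper's algorithm:

```python
def resilient_dual_step(lmbda, constraint_value, level,
                        relax_cost=1.0, lr=0.01):
    """One dual/relaxation update in the spirit of resilient constrained
    learning: loosen the requirement where the multiplier is large, then
    run dual ascent on the relaxed constraint."""
    relaxation = lmbda / relax_cost          # optimal slack under quadratic cost
    slack = constraint_value - (level + relaxation)
    lmbda = max(0.0, lmbda + lr * slack)     # ascend on violated constraints
    return lmbda, level + relaxation
```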
arXiv Detail & Related papers (2023-06-04T18:14:18Z)
- Towards Scaling Difference Target Propagation by Learning Backprop Targets [64.90165892557776]
Difference Target Propagation is a biologically-plausible learning algorithm closely related to Gauss-Newton (GN) optimization.
We propose a novel feedback weight training scheme that ensures both that DTP approximates BP and that layer-wise feedback weight training can be restored.
We report the best performance ever achieved by DTP on CIFAR-10 and ImageNet.
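The difference correction at the heart of DTP can be sketched as follows; this covers only the target computation, while the paper's contribution concerns how the feedback mappings g are trained:

```python
def dtp_targets(activations, top_target, feedback_fns):
    """Propagate layer-wise targets with the Difference Target Propagation
    correction: t_l = h_l + g(t_{l+1}) - g(h_{l+1}), where g is the learned
    feedback mapping into layer l's space."""
    targets = [None] * len(activations)
    targets[-1] = top_target
    for l in range(len(activations) - 2, -1, -1):
        g = feedback_fns[l]
        targets[l] = activations[l] + g(targets[l + 1]) - g(activations[l + 1])
    return targets
```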
arXiv Detail & Related papers (2022-01-31T18:20:43Z)
- DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting [70.62923754433461]
Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem.
We propose a novel method that can directly solve a convex relaxation of the problem to high accuracy, by splitting it into smaller subproblems that often have analytical solutions.
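The splitting pattern can be illustrated with a generic two-block ADMM skeleton; DeepSplit instantiates this idea for the verification relaxation, where each subproblem has a closed form. The `prox_f`/`prox_g` interface is an assumption for illustration:

```python
import numpy as np

def admm(prox_f, prox_g, dim, iters=100):
    """Generic two-block operator splitting (ADMM): minimize f(x) + g(z)
    s.t. x = z, alternating cheap proximal subproblems."""
    x = np.zeros(dim)
    z = np.zeros(dim)
    u = np.zeros(dim)
    for _ in range(iters):
        x = prox_f(z - u)   # first subproblem (often analytical)
        z = prox_g(x + u)   # second subproblem (often analytical)
        u = u + x - z       # dual update on the consensus constraint
    return x
```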
arXiv Detail & Related papers (2021-06-16T20:43:49Z)
- Feature Purification: How Adversarial Training Performs Robust Deep Learning [66.05472746340142]
We present a principle that we call Feature Purification: one cause of the existence of adversarial examples is the accumulation of certain small dense mixtures in the hidden weights during the training of a neural network.
We present both experiments on the CIFAR-10 dataset to illustrate this principle, and a theoretical result proving that for certain natural classification tasks, training a two-layer neural network with ReLU activation using randomly initialized gradient descent indeed satisfies this principle.
arXiv Detail & Related papers (2020-05-20T16:56:08Z)
- Tightened Convex Relaxations for Neural Network Robustness Certification [10.68833097448566]
We exploit the structure of ReLU networks to improve relaxation errors through a novel partition-based certification procedure.
The proposed method is proven to tighten existing linear programming relaxations, and achieves zero relaxation error as the partition is made finer.
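A minimal single-neuron illustration of why partitioning helps: the standard LP ("triangle") relaxation of ReLU over an unstable interval has nonzero error, while splitting the interval at zero makes the bound exact (illustrative example, not the paper's full procedure):

```python
def relu_upper_bound(x, l, u):
    """Upper envelope of the LP ('triangle') relaxation of ReLU on an
    unstable interval [l, u] with l < 0 < u."""
    return u * (x - l) / (u - l)

# Partitioning the input interval at 0 removes the relaxation error:
# on [l, 0] ReLU is exactly 0, on [0, u] it is exactly x.
x = 0.5
print(relu_upper_bound(x, l=-1.0, u=1.0))   # 0.75 > relu(x) = 0.5
print(max(x, 0.0))                          # exact after partitioning
```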
arXiv Detail & Related papers (2020-04-01T16:59:21Z)
- Explicitly Trained Spiking Sparsity in Spiking Neural Networks with Backpropagation [7.952659059689134]
Spiking Neural Networks (SNNs) are being explored for their potential energy efficiency resulting from sparse, event-driven computations.
We propose an explicit inclusion of spike counts in the loss function, along with a traditional error loss, to optimize weight parameters for both accuracy and spiking sparsity.
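As a hedged sketch, the combined objective is simply the task loss plus a weighted spike count; `lam` and the function names are illustrative, not the paper's exact formulation:

```python
import numpy as np

def sparsity_regularized_loss(error_loss, spike_counts, lam=1e-4):
    """Training objective with an explicit spike-count penalty: the
    accuracy term plus a sparsity term, traded off by lam."""
    return error_loss + lam * np.sum(spike_counts)
```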
arXiv Detail & Related papers (2020-03-02T23:39:18Z)
- Improving the Tightness of Convex Relaxation Bounds for Training Certifiably Robust Classifiers [72.56180590447835]
Convex relaxations are effective for training and certifying neural networks against norm-bounded adversarial attacks, but they leave a large gap between certifiable and empirical robustness.
We propose two regularizers that can be used to train neural networks with higher certified accuracy than non-regularized baselines.
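A plausible sketch of the ingredients: interval-style bound propagation through an affine layer plus a penalty on the width of the certified output interval. The width penalty is one natural tightness regularizer, not necessarily the paper's exact choice:

```python
import numpy as np

def interval_bounds(W, b, l, u):
    """Propagate interval bounds through an affine layer (IBP-style)."""
    mid, rad = (l + u) / 2.0, (u - l) / 2.0
    center = W @ mid + b
    radius = np.abs(W) @ rad
    return center - radius, center + radius

def tightness_regularizer(l, u):
    """Penalize loose relaxation bounds via the mean interval width
    (illustrative assumption)."""
    return np.mean(u - l)
```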
arXiv Detail & Related papers (2020-02-22T20:19:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.