Quantifying the Preferential Direction of the Model Gradient in
Adversarial Training With Projected Gradient Descent
- URL: http://arxiv.org/abs/2009.04709v5
- Date: Thu, 20 Apr 2023 02:03:18 GMT
- Title: Quantifying the Preferential Direction of the Model Gradient in
Adversarial Training With Projected Gradient Descent
- Authors: Ricardo Bigolin Lanfredi, Joyce D. Schroeder, Tolga Tasdizen
- Abstract summary: After adversarial training, gradients of models with respect to their inputs have a preferential direction.
We propose a novel definition of this direction as the direction of the vector pointing toward the closest point of the support of the closest inaccurate class in decision space.
We show that our metric presents higher alignment values than a competing metric formulation, and that enforcing this alignment increases the robustness of models.
- Score: 4.8035104863603575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training, especially projected gradient descent (PGD), has proven
to be a successful approach for improving robustness against adversarial
attacks. After adversarial training, gradients of models with respect to their
inputs have a preferential direction. However, the direction of alignment is
not mathematically well established, making it difficult to evaluate
quantitatively. We propose a novel definition of this direction as the
direction of the vector pointing toward the closest point of the support of the
closest inaccurate class in decision space. To evaluate the alignment with this
direction after adversarial training, we apply a metric that uses generative
adversarial networks to produce the smallest residual needed to change the
class present in the image. We show that PGD-trained models have a higher
alignment than the baseline according to our definition, that our metric
presents higher alignment values than a competing metric formulation, and that
enforcing this alignment increases the robustness of models.
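As a deliberately simplified illustration of the alignment notion described in the abstract, the sketch below uses a toy two-class logistic model in numpy: the model's input gradient is compared, via cosine similarity, with the vector pointing toward the nearest sample of the other class. All function names and data here are hypothetical; the paper itself estimates the support of the closest inaccurate class with a GAN-produced residual, not with nearest training samples as done here.

```python
import numpy as np

def cosine_alignment(g, d):
    """Cosine similarity between an input gradient g and a target direction d."""
    return float(g @ d / (np.linalg.norm(g) * np.linalg.norm(d) + 1e-12))

def direction_to_closest_other_class(x, X_other):
    """Vector from x to its nearest point of the other class.
    A stand-in for 'the closest point of the support of the closest
    inaccurate class' (the paper approximates this support with a GAN)."""
    dists = np.linalg.norm(X_other - x, axis=1)
    return X_other[np.argmin(dists)] - x

def input_gradient_logistic(w, b, x, y):
    """Gradient of binary cross-entropy w.r.t. the input x for a
    logistic model p = sigmoid(w @ x + b), with label y in {0, 1}."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    return (p - y) * w

# Toy data: class-1 points sit to the right of class-0 points.
w, b = np.array([1.0, 0.0]), 0.0
x, y = np.array([-1.0, 0.0]), 0.0             # a class-0 input
X_other = np.array([[2.0, 0.0], [3.0, 1.0]])  # class-1 support samples

g = input_gradient_logistic(w, b, x, y)
d = direction_to_closest_other_class(x, X_other)
score = cosine_alignment(g, d)  # near 1: the gradient points toward class 1
```

For this linearly separable toy case the gradient and the cross-class direction coincide, so the score is essentially 1; for a deep network after PGD training the paper's claim is that this score is higher than for a standard-trained baseline.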
Related papers
- Refining Alignment Framework for Diffusion Models with Intermediate-Step Preference Ranking [50.325021634589596]
We propose a Tailored Optimization Preference (TailorPO) framework for aligning diffusion models with human preference.
Our approach directly ranks intermediate noisy samples based on their step-wise reward, and effectively resolves the gradient direction issues.
Experimental results demonstrate that our method significantly improves the model's ability to generate aesthetically pleasing and human-preferred images.
arXiv Detail & Related papers (2025-02-01T16:08:43Z)
- TART: Boosting Clean Accuracy Through Tangent Direction Guided Adversarial Training [7.931280949498884]
Adversarial training has been shown to be successful in enhancing the robustness of deep neural networks against adversarial attacks.
However, this robustness is accompanied by a significant decline in accuracy on clean data.
We propose a novel method, called Tangent Direction Guided Adversarial Training (TART), that leverages the tangent space of the data manifold to ameliorate the existing adversarial defense algorithms.
arXiv Detail & Related papers (2024-08-27T01:41:21Z)
- Towards Robust and Interpretable EMG-based Hand Gesture Recognition using Deep Metric Meta Learning [37.21211404608413]
We propose a shift to deep metric-based meta-learning in EMG PR to supervise the creation of meaningful and interpretable representations.
We derive a robust class proximity-based confidence estimator that leads to a better rejection of incorrect decisions.
arXiv Detail & Related papers (2024-04-17T23:37:50Z)
- Bi-discriminator Domain Adversarial Neural Networks with Class-Level Gradient Alignment [87.8301166955305]
We propose a novel bi-discriminator domain adversarial neural network with class-level gradient alignment.
BACG resorts to gradient signals and second-order probability estimation for better alignment of domain distributions.
In addition, inspired by contrastive learning, we develop a memory bank-based variant, i.e. Fast-BACG, which can greatly shorten the training process.
arXiv Detail & Related papers (2023-10-21T09:53:17Z)
- Q-REG: End-to-End Trainable Point Cloud Registration with Surface Curvature [81.25511385257344]
We present a novel solution, Q-REG, which utilizes rich geometric information to estimate the rigid pose from a single correspondence.
Q-REG formalizes the robust estimation as an exhaustive search, thereby enabling end-to-end training.
We demonstrate in the experiments that Q-REG is agnostic to the correspondence matching method and provides consistent improvement both when used only in inference and in end-to-end training.
arXiv Detail & Related papers (2023-09-27T20:58:53Z)
- Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks [35.6604960300194]
This work examines the challenges of training vector-quantized neural networks with straight-through estimation.
We find that a primary cause of training instability is the discrepancy between the model embedding and the code-vector distribution.
We identify the factors that contribute to this issue, including the codebook gradient sparsity and the asymmetric nature of the commitment loss.
arXiv Detail & Related papers (2023-05-15T17:56:36Z)
- Do Perceptually Aligned Gradients Imply Adversarial Robustness? [17.929524924008962]
Adversarially robust classifiers possess a trait that non-robust models do not: Perceptually Aligned Gradients (PAG).
Several works have identified PAG as a byproduct of robust training, but none have considered it as a standalone phenomenon nor studied its own implications.
We show that better gradient alignment leads to increased robustness and harness this observation to boost the robustness of existing adversarial training techniques.
arXiv Detail & Related papers (2022-07-22T23:48:26Z)
- CATRE: Iterative Point Clouds Alignment for Category-level Object Pose Refinement [52.41884119329864]
CATRE, a category-level object pose and size refiner, iteratively enhances pose estimates from point clouds to produce accurate results.
Our approach remarkably outperforms state-of-the-art methods on the REAL275, CAMERA25, and LM benchmarks while running at up to 85.32 Hz.
arXiv Detail & Related papers (2022-07-17T05:55:00Z)
- Ranking Distance Calibration for Cross-Domain Few-Shot Learning [91.22458739205766]
Recent progress in few-shot learning promotes a more realistic cross-domain setting.
Due to the domain gap and disjoint label spaces between source and target datasets, their shared knowledge is extremely limited.
We employ a re-ranking process for calibrating a target distance matrix by discovering the reciprocal k-nearest neighbours within the task.
arXiv Detail & Related papers (2021-12-01T03:36:58Z)
- Geometry-aware Instance-reweighted Adversarial Training [78.70024866515756]
In adversarial machine learning, there is a common belief that robustness and accuracy hurt each other.
We propose geometry-aware instance-reweighted adversarial training, where the weights are based on how difficult it is to attack a natural data point.
Experiments show that our proposal boosts the robustness of standard adversarial training.
arXiv Detail & Related papers (2020-10-05T01:33:11Z)
- Self-adaptive Re-weighted Adversarial Domain Adaptation [12.73753413032972]
We present a self-adaptive re-weighted adversarial domain adaptation approach.
It tries to enhance domain alignment from the perspective of conditional distribution.
Empirical evidence demonstrates that the proposed model outperforms the state of the art on standard domain adaptation datasets.
arXiv Detail & Related papers (2020-05-30T08:35:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.