Quantifying the Preferential Direction of the Model Gradient in
Adversarial Training With Projected Gradient Descent
- URL: http://arxiv.org/abs/2009.04709v5
- Date: Thu, 20 Apr 2023 02:03:18 GMT
- Title: Quantifying the Preferential Direction of the Model Gradient in
Adversarial Training With Projected Gradient Descent
- Authors: Ricardo Bigolin Lanfredi, Joyce D. Schroeder, Tolga Tasdizen
- Abstract summary: After adversarial training, gradients of models with respect to their inputs have a preferential direction.
We propose a novel definition of this direction as the direction of the vector pointing toward the closest point of the support of the closest inaccurate class in decision space.
We show that our metric presents higher alignment values than a competing metric formulation, and that enforcing this alignment increases the robustness of models.
- Score: 4.8035104863603575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training, especially projected gradient descent (PGD), has proven
to be a successful approach for improving robustness against adversarial
attacks. After adversarial training, gradients of models with respect to their
inputs have a preferential direction. However, the direction of alignment is
not mathematically well established, making it difficult to evaluate
quantitatively. We propose a novel definition of this direction as the
direction of the vector pointing toward the closest point of the support of the
closest inaccurate class in decision space. To evaluate the alignment with this
direction after adversarial training, we apply a metric that uses generative
adversarial networks to produce the smallest residual needed to change the
class present in the image. We show that PGD-trained models have a higher
alignment than the baseline according to our definition, that our metric
presents higher alignment values than a competing metric formulation, and that
enforcing this alignment increases the robustness of models.
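As a deliberately simplified illustration of the alignment notion described in the abstract, the sketch below uses a toy two-class logistic model in numpy: the model's input gradient is compared, via cosine similarity, with the vector pointing toward the nearest sample of the other class. All function names and data here are hypothetical; the paper itself estimates the support of the closest inaccurate class with a GAN-produced residual, not with nearest training samples as done here.

```python
import numpy as np

def cosine_alignment(g, d):
    """Cosine similarity between an input gradient g and a target direction d."""
    return float(g @ d / (np.linalg.norm(g) * np.linalg.norm(d) + 1e-12))

def direction_to_closest_other_class(x, X_other):
    """Vector from x to its nearest point of the other class.
    A stand-in for 'the closest point of the support of the closest
    inaccurate class' (the paper approximates this support with a GAN)."""
    dists = np.linalg.norm(X_other - x, axis=1)
    return X_other[np.argmin(dists)] - x

def input_gradient_logistic(w, b, x, y):
    """Gradient of binary cross-entropy w.r.t. the input x for a
    logistic model p = sigmoid(w @ x + b), with label y in {0, 1}."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    return (p - y) * w

# Toy data: class-1 points sit to the right of class-0 points.
w, b = np.array([1.0, 0.0]), 0.0
x, y = np.array([-1.0, 0.0]), 0.0             # a class-0 input
X_other = np.array([[2.0, 0.0], [3.0, 1.0]])  # class-1 support samples

g = input_gradient_logistic(w, b, x, y)
d = direction_to_closest_other_class(x, X_other)
score = cosine_alignment(g, d)  # near 1: the gradient points toward class 1
```

For this linearly separable toy case the gradient and the cross-class direction coincide, so the score is essentially 1; for a deep network after PGD training the paper's claim is that this score is higher than for a standard-trained baseline.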
Related papers
- Refining Alignment Framework for Diffusion Models with Intermediate-Step Preference Ranking [50.325021634589596]
We propose a Tailored Optimization Preference (TailorPO) framework for aligning diffusion models with human preference.
Our approach directly ranks intermediate noisy samples based on their step-wise reward, and effectively resolves the gradient direction issues.
Experimental results demonstrate that our method significantly improves the model's ability to generate aesthetically pleasing and human-preferred images.
arXiv Detail & Related papers (2025-02-01T16:08:43Z)
- TART: Boosting Clean Accuracy Through Tangent Direction Guided Adversarial Training [7.931280949498884]
Adversarial training has been shown to be successful in enhancing the robustness of deep neural networks against adversarial attacks.
However, this robustness is accompanied by a significant decline in accuracy on clean data.
We propose a novel method, called Tangent Direction Guided Adversarial Training (TART), that leverages the tangent space of the data manifold to ameliorate the existing adversarial defense algorithms.
arXiv Detail & Related papers (2024-08-27T01:41:21Z)
- Towards Robust and Interpretable EMG-based Hand Gesture Recognition using Deep Metric Meta Learning [37.21211404608413]
We propose a shift to deep metric-based meta-learning in EMG PR to supervise the creation of meaningful and interpretable representations.
We derive a robust class proximity-based confidence estimator that leads to a better rejection of incorrect decisions.
arXiv Detail & Related papers (2024-04-17T23:37:50Z)
- Bi-discriminator Domain Adversarial Neural Networks with Class-Level Gradient Alignment [87.8301166955305]
We propose a novel bi-discriminator domain adversarial neural network with class-level gradient alignment.
BACG resorts to gradient signals and second-order probability estimation for better alignment of domain distributions.
In addition, inspired by contrastive learning, we develop a memory bank-based variant, i.e. Fast-BACG, which can greatly shorten the training process.
arXiv Detail & Related papers (2023-10-21T09:53:17Z)
- Q-REG: End-to-End Trainable Point Cloud Registration with Surface Curvature [81.25511385257344]
We present a novel solution, Q-REG, which utilizes rich geometric information to estimate the rigid pose from a single correspondence.
Q-REG formalizes the robust estimation as an exhaustive search, thereby enabling end-to-end training.
We demonstrate in the experiments that Q-REG is agnostic to the correspondence matching method and provides consistent improvement both when used only in inference and in end-to-end training.
arXiv Detail & Related papers (2023-09-27T20:58:53Z)
- Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks [35.6604960300194]
This work examines the challenges of training vector-quantized neural networks with straight-through estimation.
We find that a primary cause of training instability is the discrepancy between the model embedding and the code-vector distribution.
We identify the factors that contribute to this issue, including the codebook gradient sparsity and the asymmetric nature of the commitment loss.
arXiv Detail & Related papers (2023-05-15T17:56:36Z)
- Do Perceptually Aligned Gradients Imply Adversarial Robustness? [17.929524924008962]
Adversarially robust classifiers possess a trait that non-robust models do not: Perceptually Aligned Gradients (PAG).
Several works have identified PAG as a byproduct of robust training, but none have considered it as a standalone phenomenon nor studied its own implications.
We show that better gradient alignment leads to increased robustness and harness this observation to boost the robustness of existing adversarial training techniques.
arXiv Detail & Related papers (2022-07-22T23:48:26Z)
- CATRE: Iterative Point Clouds Alignment for Category-level Object Pose Refinement [52.41884119329864]
CATRE, a category-level object pose and size refiner, iteratively enhances pose estimates from point clouds to produce accurate results.
Our approach remarkably outperforms state-of-the-art methods on the REAL275, CAMERA25, and LM benchmarks while running at up to 85.32 Hz.
arXiv Detail & Related papers (2022-07-17T05:55:00Z)
- Ranking Distance Calibration for Cross-Domain Few-Shot Learning [91.22458739205766]
Recent progress in few-shot learning promotes a more realistic cross-domain setting.
Due to the domain gap and disjoint label spaces between source and target datasets, their shared knowledge is extremely limited.
We employ a re-ranking process for calibrating a target distance matrix by discovering the reciprocal k-nearest neighbours within the task.
arXiv Detail & Related papers (2021-12-01T03:36:58Z)
- Geometry-aware Instance-reweighted Adversarial Training [78.70024866515756]
In adversarial machine learning, there is a common belief that robustness and accuracy hurt each other.
We propose geometry-aware instance-reweighted adversarial training, where the weights are based on how difficult it is to attack a natural data point.
Experiments show that our proposal boosts the robustness of standard adversarial training.
arXiv Detail & Related papers (2020-10-05T01:33:11Z)
- Self-adaptive Re-weighted Adversarial Domain Adaptation [12.73753413032972]
We present a self-adaptive re-weighted adversarial domain adaptation approach.
It tries to enhance domain alignment from the perspective of conditional distribution.
Empirical evidence demonstrates that the proposed model outperforms the state of the art on standard domain adaptation datasets.
arXiv Detail & Related papers (2020-05-30T08:35:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.