GAT: Guided Adversarial Training with Pareto-optimal Auxiliary Tasks
- URL: http://arxiv.org/abs/2302.02907v2
- Date: Thu, 25 May 2023 08:01:05 GMT
- Title: GAT: Guided Adversarial Training with Pareto-optimal Auxiliary Tasks
- Authors: Salah Ghamizi, Jingfeng Zhang, Maxime Cordy, Mike Papadakis, Masashi
Sugiyama, and Yves Le Traon
- Abstract summary: We propose a novel adversarial training technique that exploits auxiliary tasks under a limited set of training data.
Our approach extends single-task models into multi-task models during the min-max optimization of adversarial training.
We demonstrate that guided multi-task learning is an actionable and promising avenue to push further the boundaries of model robustness.
- Score: 73.88590165742721
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: While leveraging additional training data is well established to improve
adversarial robustness, it incurs the unavoidable cost of data collection and
the heavy computation to train models. To mitigate the costs, we propose Guided
Adversarial Training (GAT), a novel adversarial training technique that
exploits auxiliary tasks under a limited set of training data. Our approach
extends single-task models into multi-task models during the min-max
optimization of adversarial training, and drives the loss optimization with a
regularization of the gradient curvature across multiple tasks. GAT leverages
two types of auxiliary tasks: self-supervised tasks, where the labels are
generated automatically, and domain-knowledge tasks, where human experts
provide additional labels. Experimentally, GAT increases the robust AUC on the
CheXpert medical imaging dataset from 50% to 83%, and on CIFAR-10 it
outperforms eight state-of-the-art adversarial training methods, achieving
56.21% robust accuracy with ResNet-50. Overall, we demonstrate that guided multi-task
learning is an actionable and promising avenue to push further the boundaries
of model robustness.
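The min-max structure described in the abstract, where an auxiliary task shares the model's features during adversarial training, can be illustrated with a toy example. This is a minimal sketch in plain Python, not the paper's actual GAT method: the logistic model, the single-step FGSM inner maximization, the self-supervised sign-prediction auxiliary task, and all hyperparameters are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_x(w, x, y):
    """Gradient of the logistic loss w.r.t. the input x (used by the attacker)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return [(p - y) * wi for wi in w]

def fgsm(w, x, y, eps):
    """Inner maximization, approximated by one signed gradient step (FGSM)."""
    g = grad_x(w, x, y)
    return [xi + eps * (1 if gi >= 0 else -1) for xi, gi in zip(x, g)]

def train(data, eps=0.1, lr=0.5, lam=0.3, epochs=200):
    """Outer minimization over a main head and an auxiliary head that share inputs."""
    w_main = [0.0, 0.0]   # main-task head
    w_aux = [0.0, 0.0]    # auxiliary-task head
    for _ in range(epochs):
        for x, y in data:
            y_aux = 1.0 if x[0] >= 0 else 0.0      # automatically generated label
            x_adv = fgsm(w_main, x, y, eps)        # adversarial example
            for w, target, weight in ((w_main, y, 1.0), (w_aux, y_aux, lam)):
                p = sigmoid(sum(wi * xi for wi, xi in zip(w, x_adv)))
                for i in range(2):
                    w[i] -= lr * weight * (p - target) * x_adv[i]
    return w_main

data = [([1.0, 0.5], 1.0), ([-1.0, -0.5], 0.0),
        ([0.8, 1.0], 1.0), ([-0.7, -1.2], 0.0)]
w = train(data)
# After training, a clean positive example should be classified correctly.
p = sigmoid(sum(wi * xi for wi, xi in zip(w, [1.0, 0.5])))
```

The key point the sketch captures is that both heads are trained on the *same* adversarial example, so the auxiliary loss regularizes the shared representation inside the min-max loop rather than in a separate pre-training stage.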
Related papers
- Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails.
We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses.
C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
arXiv Detail & Related papers (2024-05-24T14:20:09Z) - Optimistic Verifiable Training by Controlling Hardware Nondeterminism [22.85808027490485]
We propose a method that combines training in a higher precision than the target model, rounding after intermediate steps, and storing rounding decisions.
We achieve exact training replication at FP32 precision for both full-training and fine-tuning of ResNet-50 (23M) and GPT-2 (117M) models.
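The mechanism summarized above (train in higher precision, round after intermediate steps, and store the rounding decisions so training can be replayed exactly) can be sketched in a few lines. This is a toy illustration under assumed details, not the authors' implementation: the scalar parameter, the quantization grid, and the update values are all hypothetical.

```python
GRID = 1e-3  # target "low" precision: parameters snap to multiples of 1e-3

def round_with_record(value):
    """Round value to the grid and record whether we rounded up (1) or down (0)."""
    down = (value // GRID) * GRID
    decision = 1 if value - down >= GRID / 2 else 0
    return down + decision * GRID, decision

def train(updates):
    """Trainer: step in higher precision, then round and log each decision."""
    w, log = 0.0, []
    for u in updates:
        w_hi = w + u                     # higher-precision intermediate step
        w, d = round_with_record(w_hi)   # snap to target precision, log decision
        log.append(d)
    return w, log

def replay(updates, log):
    """Auditor: replays training, forcing the recorded rounding decisions."""
    w = 0.0
    for u, d in zip(updates, log):
        w_hi = w + u
        down = (w_hi // GRID) * GRID
        w = down + d * GRID              # use the logged decision, not re-rounding
    return w

updates = [0.0123456, -0.0045678, 0.0031415]
w_final, log = train(updates)
assert replay(updates, log) == w_final   # bit-exact replication
```

Because the replay performs the identical floating-point operations and substitutes the logged decisions wherever rounding occurred, the auditor reproduces the final weights exactly, which is what makes the training run verifiable.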
arXiv Detail & Related papers (2024-03-14T17:44:35Z) - Efficient Grammatical Error Correction Via Multi-Task Training and
Optimized Training Schedule [55.08778142798106]
We propose auxiliary tasks that exploit the alignment between the original and corrected sentences.
We formulate each task as a sequence-to-sequence problem and perform multi-task training.
We find that the order of datasets used for training and even individual instances within a dataset may have important effects on the final performance.
arXiv Detail & Related papers (2023-11-20T14:50:12Z) - Reducing Adversarial Training Cost with Gradient Approximation [0.3916094706589679]
We propose a new and efficient adversarial training method, adversarial training with gradient approximation (GAAT), to reduce the cost of building robust models.
Our proposed method saves up to 60% of the training time while achieving comparable model test accuracy.
arXiv Detail & Related papers (2023-09-18T03:55:41Z) - Instruction Tuned Models are Quick Learners [20.771930945083994]
In this work, we demonstrate the sample efficiency of instruction tuned models over various tasks.
In the STL setting, instruction tuned models equipped with 25% of the downstream train data surpass the SOTA performance on the downstream tasks.
In the MTL setting, an instruction tuned model trained on only 6% of downstream training data achieves SOTA, while using 100% of the training data results in a 3.69% points improvement.
arXiv Detail & Related papers (2023-05-17T22:30:01Z) - Efficient Utilization of Large Pre-Trained Models for Low Resource ASR [31.57758062484189]
We study a challenging low resource conversational telephony speech corpus from the medical domain in Vietnamese and German.
We show the benefits of using unsupervised techniques beyond simple fine-tuning of large pre-trained models.
arXiv Detail & Related papers (2022-10-26T17:34:30Z) - DL-DRL: A double-level deep reinforcement learning approach for
large-scale task scheduling of multi-UAV [65.07776277630228]
We propose a double-level deep reinforcement learning (DL-DRL) approach based on a divide and conquer framework (DCF).
Particularly, we design an encoder-decoder structured policy network in our upper-level DRL model to allocate the tasks to different UAVs.
We also exploit another attention based policy network in our lower-level DRL model to construct the route for each UAV, with the objective to maximize the number of executed tasks.
arXiv Detail & Related papers (2022-08-04T04:35:53Z) - Self-Progressing Robust Training [146.8337017922058]
Current robust training methods such as adversarial training explicitly use an "attack" to generate adversarial examples.
We propose a new framework called SPROUT, self-progressing robust training.
Our results shed new light on scalable, effective and attack-independent robust training methods.
arXiv Detail & Related papers (2020-12-22T00:45:24Z) - Efficient Adversarial Training with Transferable Adversarial Examples [58.62766224452761]
We show that there is high transferability between models from neighboring epochs in the same training process.
We propose a novel method, Adversarial Training with Transferable Adversarial Examples (ATTA) that can enhance the robustness of trained models.
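The transferability observation above suggests warm-starting each epoch's attack from the perturbation found in the previous epoch instead of attacking from scratch. The sketch below illustrates that idea only in spirit; the logistic model, step sizes, and clipping scheme are assumptions for the toy example, not the paper's ATTA algorithm as published.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def attack_step(w, x, y, delta, eps):
    """One signed-gradient step on the stored perturbation, clipped to [-eps, eps]."""
    p = sigmoid(sum(wi * (xi + di) for wi, xi, di in zip(w, x, delta)))
    step = [0.25 * eps * (1 if (p - y) * wi >= 0 else -1) for wi in w]
    return [max(-eps, min(eps, di + si)) for di, si in zip(delta, step)]

def train_atta(data, eps=0.2, lr=0.5, epochs=50):
    w = [0.0, 0.0]
    deltas = {i: [0.0, 0.0] for i in range(len(data))}  # carried across epochs
    for _ in range(epochs):
        for i, (x, y) in enumerate(data):
            # Warm-start: refine last epoch's perturbation rather than restart.
            deltas[i] = attack_step(w, x, y, deltas[i], eps)
            x_adv = [xi + di for xi, di in zip(x, deltas[i])]
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x_adv)))
            for j in range(2):
                w[j] -= lr * (p - y) * x_adv[j]
    return w

data = [([1.0, 0.6], 1.0), ([-1.0, -0.6], 0.0)]
w = train_atta(data)
```

Since models from neighboring epochs are similar, a single refinement step per epoch accumulates into a strong perturbation over training, which is the cost saving the abstract points to.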
arXiv Detail & Related papers (2019-12-27T03:05:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.