MiMu: Mitigating Multiple Shortcut Learning Behavior of Transformers
- URL: http://arxiv.org/abs/2504.10551v1
- Date: Mon, 14 Apr 2025 08:11:09 GMT
- Title: MiMu: Mitigating Multiple Shortcut Learning Behavior of Transformers
- Authors: Lili Zhao, Qi Liu, Wei Chen, Liyi Chen, Ruijun Sun, Min Hou, Yang Wang, Shijin Wang
- Abstract summary: Empirical Risk Minimization (ERM) models often rely on spurious correlations between features and labels during the learning process. We propose MiMu, a novel method integrated with Transformer-based ERMs, designed to Mitigate Multiple shortcut learning behavior.
- Score: 19.27328009299697
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Empirical Risk Minimization (ERM) models often rely on spurious correlations between features and labels during the learning process, leading to shortcut learning behavior that undermines robustness and generalization performance. Current research mainly targets identifying or mitigating a single shortcut; however, in real-world scenarios, cues within the data are diverse and unknown. In empirical studies, we reveal that models rely to varying extents on different shortcuts: compared to weak shortcuts, models depend more heavily on strong shortcuts, resulting in poor generalization ability. To address these challenges, we propose MiMu, a novel method integrated with Transformer-based ERMs designed to Mitigate Multiple shortcut learning behavior, which incorporates a self-calibration strategy and a self-improvement strategy. In the source model, we first propose the self-calibration strategy to prevent the model from relying on shortcuts and making overconfident predictions. Then, we further design a self-improvement strategy in the target model to reduce the reliance on multiple shortcuts. The random mask strategy randomly masks partial attention positions to diversify the focus of the target model rather than concentrating on a fixed region. Meanwhile, the adaptive attention alignment module aligns the target model's attention weights to the calibrated source model, without the need for post-hoc attention maps or supervision. Finally, extensive experiments on Natural Language Processing (NLP) and Computer Vision (CV) tasks demonstrate the effectiveness of MiMu in improving robustness and generalization.
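The abstract names three concrete mechanisms: a calibration loss that discourages overconfident predictions, random masking of attention positions, and a loss that aligns the target model's attention with the calibrated source model's. Below is a minimal PyTorch sketch of what such components could look like. It assumes standard scaled dot-product attention, and every name and hyperparameter (drop_fraction, align_weight, the label-smoothing stand-in for self-calibration) is an illustrative assumption, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def calibration_loss(logits: torch.Tensor, labels: torch.Tensor,
                     smoothing: float = 0.1) -> torch.Tensor:
    """Label-smoothed cross-entropy: one common way to discourage the
    overconfident predictions that the self-calibration strategy targets."""
    return F.cross_entropy(logits, labels, label_smoothing=smoothing)

def random_attention_mask(attn_scores: torch.Tensor,
                          drop_fraction: float = 0.1) -> torch.Tensor:
    """Randomly mask a fraction of (pre-softmax) attention positions so the
    target model's focus is diversified instead of fixed on one region.

    attn_scores: raw attention logits of shape (batch, heads, q_len, k_len).
    Masked positions are set to -inf so softmax assigns them zero weight;
    drop_fraction should stay small so no row is masked out entirely.
    """
    mask = torch.rand_like(attn_scores) < drop_fraction
    return attn_scores.masked_fill(mask, float("-inf"))

def attention_alignment_loss(target_attn: torch.Tensor,
                             source_attn: torch.Tensor) -> torch.Tensor:
    """KL divergence pulling the target model's post-softmax attention
    toward the frozen, calibrated source model's attention."""
    eps = 1e-8  # avoid log(0) at fully masked positions
    return F.kl_div((target_attn + eps).log(), source_attn.detach(),
                    reduction="batchmean")

# Hypothetical wiring inside one attention layer / training step:
# scores = (q @ k.transpose(-2, -1)) / k.size(-1) ** 0.5
# scores = random_attention_mask(scores, drop_fraction=0.1)
# target_attn = scores.softmax(dim=-1)
# loss = calibration_loss(logits, labels) \
#        + align_weight * attention_alignment_loss(target_attn, source_attn)
```

The alignment term uses the source model only as a frozen teacher for attention weights, which matches the abstract's claim that no post-hoc attention maps or extra supervision are needed; the exact divergence and weighting are assumptions of this sketch.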
Related papers
- Reinforcement Learning for Machine Learning Model Deployment: Evaluating Multi-Armed Bandits in ML Ops Environments [0.0]
We investigate whether reinforcement learning (RL)-based model management can handle deployment decisions more effectively. Our approach enables more adaptive production environments by continuously evaluating deployed models and rolling back underperforming ones in real time. Our findings suggest that RL-based model management can improve automation, reduce reliance on manual interventions, and mitigate risks associated with post-deployment model failures.
arXiv Detail & Related papers (2025-03-28T16:42:21Z)
- PB-UAP: Hybrid Universal Adversarial Attack For Image Segmentation [15.702469692874816]
We propose a novel universal adversarial attack method designed for segmentation models. Our method achieves high attack success rates, surpassing state-of-the-art methods, and exhibits strong transferability across different models.
arXiv Detail & Related papers (2024-12-21T14:46:01Z)
- InferAligner: Inference-Time Alignment for Harmlessness through Cross-Model Guidance [56.184255657175335]
We develop InferAligner, a novel inference-time alignment method that utilizes cross-model guidance for harmlessness alignment.
Experimental results show that our method can be very effectively applied to domain-specific models in finance, medicine, and mathematics.
It significantly diminishes the Attack Success Rate (ASR) of both harmful instructions and jailbreak attacks, while maintaining almost unchanged performance in downstream tasks.
arXiv Detail & Related papers (2024-01-20T10:41:03Z)
- Learning Optimal Features via Partial Invariance [18.552839725370383]
Invariant Risk Minimization (IRM) is a popular framework that aims to learn robust models from multiple environments.
We show that IRM can over-constrain the predictor; to remedy this, we propose a relaxation via partial invariance.
Several experiments, conducted both in linear settings and with deep neural networks on language and image tasks, verify our conclusions.
arXiv Detail & Related papers (2023-01-28T02:48:14Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee in model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- Meta-Learning with Self-Improving Momentum Target [72.98879709228981]
We propose Self-improving Momentum Target (SiMT) to improve the performance of a meta-learner.
SiMT generates the target model by adapting from the temporal ensemble of the meta-learner.
We show that SiMT brings a significant performance gain when combined with a wide range of meta-learning methods.
arXiv Detail & Related papers (2022-10-11T06:45:15Z)
- Ensemble Making Few-Shot Learning Stronger [4.17701749612924]
This paper explores an ensemble approach to reduce the variance and introduces fine-tuning and feature attention strategies to calibrate relation-level features.
Results on several few-shot relation learning tasks show that our model significantly outperforms the previous state-of-the-art models.
arXiv Detail & Related papers (2021-05-12T17:11:10Z)
- A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analyses on several natural image datasets and practical systems confirm the superiority of the proposed algorithm.
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
- Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models [54.569004548170824]
We show that careful masking strategies can bridge the knowledge gap of masked language models.
We propose an effective training strategy by adversarially masking out those tokens which are harder to reconstruct by the underlying masked language model.
arXiv Detail & Related papers (2020-10-05T01:49:47Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)