A Simple yet Effective Self-Debiasing Framework for Transformer Models
- URL: http://arxiv.org/abs/2306.01907v1
- Date: Fri, 2 Jun 2023 20:31:58 GMT
- Title: A Simple yet Effective Self-Debiasing Framework for Transformer Models
- Authors: Xiaoyue Wang, Lijie Wang, Xin Liu, Suhang Wu, Jinsong Su, Hua Wu
- Abstract summary: Current Transformer-based natural language understanding (NLU) models heavily rely on dataset biases.
We propose a simple yet effective self-debiasing framework for Transformer-based NLU models.
- Score: 49.09053367249642
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current Transformer-based natural language understanding (NLU) models heavily
rely on dataset biases, while failing to handle real-world out-of-distribution
(OOD) instances. Many methods have been proposed to deal with this issue, but
they ignore the fact that the features learned in different layers of
Transformer-based NLU models are different. In this paper, we first conduct
preliminary studies to obtain two conclusions: 1) both low- and high-layer
sentence representations encode common biased features during training; 2) the
low-layer sentence representations encode fewer unbiased features than the
high-layer ones. Based on these conclusions, we propose a simple yet effective
self-debiasing framework for Transformer-based NLU models. Concretely, we first
stack a classifier on a selected low layer. Then, we introduce a residual
connection that feeds the low-layer sentence representation to the top-layer
classifier. In this way, the top-layer sentence representation will be trained
to ignore the common biased features encoded by the low-layer sentence
representation and focus on task-relevant unbiased features. During inference,
we remove the residual connection and directly use the top-layer sentence
representation to make predictions. Extensive experiments and in-depth analyses
on NLU tasks show that our framework performs better than several competitive
baselines, achieving a new SOTA on all OOD test sets.
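The training-time architecture described in the abstract lends itself to a short sketch. Below is a minimal, hedged illustration assuming a BERT-style encoder from Hugging Face transformers; the class name SelfDebiasModel, the choice of low layer, the use of the [CLS] vector, and all hyperparameters are illustrative assumptions, not details from the paper's released code.

```python
# Minimal sketch of the self-debiasing setup: a classifier stacked on a low
# layer, plus a residual connection that feeds the low-layer representation
# into the top-layer classifier during training only.
import torch.nn as nn
from transformers import AutoModel

class SelfDebiasModel(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased", low_layer=3, num_labels=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.low_layer = low_layer
        self.low_classifier = nn.Linear(hidden, num_labels)  # stacked on the chosen low layer
        self.top_classifier = nn.Linear(hidden, num_labels)  # the actual task classifier

    def forward(self, input_ids, attention_mask, debias=True):
        out = self.encoder(input_ids, attention_mask=attention_mask,
                           output_hidden_states=True)
        low_repr = out.hidden_states[self.low_layer][:, 0]  # [CLS] at the low layer
        top_repr = out.hidden_states[-1][:, 0]              # [CLS] at the top layer
        low_logits = self.low_classifier(low_repr)
        if debias:
            # Training: the residual connection feeds the low-layer (bias-prone)
            # representation into the top classifier, so the top-layer representation
            # learns to ignore the biased features already supplied from below.
            top_logits = self.top_classifier(top_repr + low_repr)
        else:
            # Inference: residual removed; predict from the top layer alone.
            top_logits = self.top_classifier(top_repr)
        return low_logits, top_logits
```

During training one would presumably sum the cross-entropy losses of both classifiers (the weighting is an assumption); at test time the model is called with debias=False so the residual path disappears, matching the abstract's description.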
Related papers
- Prompt Tuning Pushes Farther, Contrastive Learning Pulls Closer: A Two-Stage Approach to Mitigate Social Biases [13.837927115198308]
We propose an adversarial-training-inspired two-stage debiasing model using contrastive learning and continuous prompt augmentation.
Our approach guides the model to achieve stronger debiasing performance by increasing the difficulty of the training process.
arXiv Detail & Related papers (2023-07-04T09:35:03Z)
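As a loose illustration of the contrastive stage in the entry above, the sketch below pulls a sentence and a counterfactual version of it (demographic terms swapped) together in representation space. The pairing scheme, the InfoNCE form, and the temperature are assumptions rather than the paper's exact objective.

```python
# Hypothetical sketch of a counterfactual contrastive debiasing loss: each
# sentence's positive is its demographic-swapped counterpart; the rest of the
# batch serves as negatives.
import torch
import torch.nn.functional as F

def contrastive_debias_loss(z_orig, z_counterfactual, temperature=0.05):
    """z_orig, z_counterfactual: (B, d) sentence representations."""
    z1 = F.normalize(z_orig, dim=-1)
    z2 = F.normalize(z_counterfactual, dim=-1)
    logits = z1 @ z2.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(z1.size(0))   # diagonal entries are the positive pairs
    return F.cross_entropy(logits, targets)
```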
- Bi-Drop: Enhancing Fine-tuning Generalization via Synchronous sub-net Estimation and Optimization [58.90989478049686]
Bi-Drop is a fine-tuning strategy that selectively updates model parameters using gradients from various sub-nets.
Experiments on the GLUE benchmark demonstrate that Bi-Drop consistently outperforms previous fine-tuning methods.
arXiv Detail & Related papers (2023-05-24T06:09:26Z)
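One plausible reading of "selectively updates model parameters using gradients from various sub-nets" is sketched below: two independent dropout draws yield two sub-nets, and only parameters whose gradients agree in sign across the two get updated. Both the agreement rule and the two-pass setup are assumptions, not the paper's exact criterion.

```python
# Hedged sketch: mask out parameter updates that two dropout sub-nets disagree on.
import torch

def subnet_masked_step(model, loss_fn, batch, optimizer):
    grads = []
    for _ in range(2):                   # two sub-nets from independent dropout draws
        model.train()                    # dropout active -> a different sub-net each pass
        optimizer.zero_grad()
        loss_fn(model, batch).backward() # loss_fn is a placeholder: (model, batch) -> scalar loss
        grads.append([p.grad.clone() for p in model.parameters()])
    optimizer.zero_grad()
    for p, g1, g2 in zip(model.parameters(), *grads):
        agree = (torch.sign(g1) == torch.sign(g2)).float()
        p.grad = 0.5 * (g1 + g2) * agree # update only consistently useful parameters
    optimizer.step()
```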
- Alleviating Over-smoothing for Unsupervised Sentence Representation [96.19497378628594]
We present a Simple method named Self-Contrastive Learning (SSCL) to alleviate this issue.
Our proposed method is quite simple and can be easily extended to various state-of-the-art models for performance boosting.
arXiv Detail & Related papers (2023-05-09T11:00:02Z)
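A hedged sketch of the SSCL idea summarized above: contrast two dropout views of the final-layer sentence embedding (SimCSE-style positives) against the same sentences' intermediate-layer representations used as extra negatives, discouraging the layers from collapsing together. The layer choice, temperature, and exact negative construction are assumptions.

```python
# Sketch of a self-contrastive loss with intermediate-layer negatives.
import torch
import torch.nn.functional as F

def sscl_loss(h_final, h_final_pos, h_intermediate, temperature=0.05):
    """h_final / h_final_pos: two dropout views of the final-layer embeddings (B, d);
    h_intermediate: same sentences at an earlier layer, used as negatives."""
    q = F.normalize(h_final, dim=-1)
    pos = F.normalize(h_final_pos, dim=-1)
    neg = F.normalize(h_intermediate, dim=-1)
    pos_sim = (q * pos).sum(-1, keepdim=True) / temperature  # (B, 1) positive scores
    neg_sim = q @ neg.t() / temperature                      # (B, B) negative scores
    logits = torch.cat([pos_sim, neg_sim], dim=1)
    return F.cross_entropy(logits, torch.zeros(q.size(0), dtype=torch.long))
```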
- Conceptor-Aided Debiasing of Large Language Models [1.0435741631709405]
Pre-trained large language models (LLMs) reflect the inherent social biases of their training corpus.
We use conceptors--a soft projection method--to identify and remove the bias subspace in LLMs such as BERT and GPT.
We propose two methods of applying conceptors: (1) bias subspace projection via post-processing with the conceptor NOT operation; and (2) a new architecture, conceptor-intervened BERT (CI-BERT).
arXiv Detail & Related papers (2022-11-20T21:24:48Z)
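The conceptor NOT post-processing in the entry above can be sketched with the standard conceptor definition C = R (R + alpha^-2 I)^-1 from Jaeger's formulation; the aperture value and the way bias-attribute embeddings are collected here are assumptions.

```python
# Minimal sketch of conceptor-NOT debiasing: compute a soft projector for the
# bias subspace, then apply its complement (I - C) to embeddings.
import numpy as np

def conceptor(X, alpha=10.0):
    """X: (n, d) embeddings of bias-attribute words (e.g. gendered terms)."""
    R = X.T @ X / X.shape[0]  # (d, d) correlation matrix of the bias samples
    d = R.shape[0]
    return R @ np.linalg.inv(R + (alpha ** -2) * np.eye(d))

def debias(embeddings, C):
    """Apply the NOT conceptor (I - C): softly project out the bias subspace."""
    not_C = np.eye(C.shape[0]) - C
    return embeddings @ not_C.T
```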
- A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video.
Recent studies have found that current benchmark datasets may have obvious moment annotation biases.
We introduce a new evaluation metric, "dR@n,IoU@m", that discounts the basic recall scores to alleviate the inflated evaluation caused by biased datasets.
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
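A hedged reading of the discounted metric named above: a standard recall hit (IoU at or above the threshold m) is scaled down by how far the predicted boundaries drift from the ground truth, normalized by video duration. The paper's exact discount may differ from this sketch.

```python
# Hypothetical per-prediction contribution to dR@n,IoU@m under the reading above.
def discounted_hit(pred, gt, duration, iou_threshold=0.5):
    """pred, gt: (start, end) in seconds; returns a value in [0, 1]."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = max(pred[1], gt[1]) - min(pred[0], gt[0])
    if union <= 0 or inter / union < iou_threshold:
        return 0.0                                    # not a recall hit at all
    alpha_s = 1.0 - abs(pred[0] - gt[0]) / duration   # start-boundary discount
    alpha_e = 1.0 - abs(pred[1] - gt[1]) / duration   # end-boundary discount
    return alpha_s * alpha_e                          # discounted recall contribution
```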
- FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders [68.8687509471322]
We propose the first neural debiasing method for a pretrained sentence encoder, which transforms the pretrained encoder outputs into debiased representations via a fair filter network.
On real-world datasets, our FairFil effectively reduces the bias degree of pretrained text encoders while maintaining desirable performance on downstream tasks.
arXiv Detail & Related papers (2021-03-11T02:01:14Z)
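The fair filter above can be pictured as a small trainable network applied after a frozen pretrained encoder, so the encoder itself is never modified. The two-layer shape and sizes below are assumptions, and the paper's contrastive training objective is omitted here.

```python
# Illustrative sketch of a "fair filter" applied post hoc to frozen encoder outputs.
import torch.nn as nn

class FairFilter(nn.Module):
    def __init__(self, dim=768):
        super().__init__()
        self.filter = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                    nn.Linear(dim, dim))

    def forward(self, pretrained_embedding):
        # The debiased representation replaces the raw encoder output downstream.
        return self.filter(pretrained_embedding)
```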
- Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake News Detection [7.29381091750894]
We propose a novel Transformer-based language model fine-tuning approach for this fake news detection task.
First, the token vocabulary of each individual model is expanded to capture the actual semantics of professional phrases.
Last, the features extracted by the universal language model RoBERTa and the domain-specific model CT-BERT are fused by a multilayer perceptron to integrate fine-grained and high-level specific representations.
arXiv Detail & Related papers (2021-01-14T09:05:42Z)
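The fusion step described above reads naturally as feature concatenation followed by an MLP. The sketch below uses the common Hugging Face checkpoints for RoBERTa and CT-BERT as stand-ins; the paper's exact checkpoints, pooling, and hidden sizes are assumptions.

```python
# Sketch of two-encoder feature fusion for binary fake-news classification.
import torch
import torch.nn as nn
from transformers import AutoModel

class FusedFakeNewsClassifier(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.roberta = AutoModel.from_pretrained("roberta-base")
        self.ctbert = AutoModel.from_pretrained(
            "digitalepidemiologylab/covid-twitter-bert-v2")
        dim = self.roberta.config.hidden_size + self.ctbert.config.hidden_size
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2))  # real vs. fake

    def forward(self, roberta_inputs, ctbert_inputs):
        f1 = self.roberta(**roberta_inputs).last_hidden_state[:, 0]  # [CLS] pooling
        f2 = self.ctbert(**ctbert_inputs).last_hidden_state[:, 0]
        return self.mlp(torch.cat([f1, f2], dim=-1))
```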
- Unsupervised Extractive Summarization by Pre-training Hierarchical Transformers [107.12125265675483]
Unsupervised extractive document summarization aims to select important sentences from a document without using labeled summaries during training.
Existing methods are mostly graph-based with sentences as nodes and edge weights measured by sentence similarities.
We find that transformer attentions can be used to rank sentences for unsupervised extractive summarization.
arXiv Detail & Related papers (2020-10-16T08:44:09Z)
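A hedged sketch of the attention-based ranking idea above: score each sentence by the attention mass it receives from the other sentences in a sentence-level transformer, then take the top-k as the extractive summary. The head-averaging scheme is an assumption.

```python
# Rank sentences by the attention they receive, then pick the top-k.
import torch

def rank_sentences(attn, k=3):
    """attn: (heads, n_sent, n_sent) sentence-level attention weights,
    where attn[h, i, j] is how much sentence i attends to sentence j."""
    scores = attn.mean(0).sum(0)          # per-sentence attention received
    return torch.topk(scores, k).indices  # indices of the top-k summary sentences
```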