Representation Projection Invariance Mitigates Representation Collapse
- URL: http://arxiv.org/abs/2205.11603v3
- Date: Tue, 21 Nov 2023 22:23:43 GMT
- Title: Representation Projection Invariance Mitigates Representation Collapse
- Authors: Anastasia Razdaibiedina, Ashish Khetan, Zohar Karnin, Daniel Khashabi,
Vishaal Kapoor, Vivek Madan
- Abstract summary: Fine-tuning contextualized representations learned by pre-trained language models can lead to representation collapse.
We propose Representation Projection Invariance (REPINA), a novel regularization method that maintains the information content of representations.
Our empirical findings show that REPINA is significantly more effective at mitigating representation collapse.
- Score: 27.485606184566386
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fine-tuning contextualized representations learned by pre-trained language
models remains a prevalent practice in NLP. However, fine-tuning can lead to
representation degradation (also known as representation collapse), which may
result in instability, sub-optimal performance, and weak generalization.
In this paper, we propose Representation Projection Invariance (REPINA), a
novel regularization method to maintain the information content of
representations and reduce representation collapse during fine-tuning by
discouraging undesirable changes in the representations. We study the empirical
behavior of the proposed regularization in comparison to five baselines
across 13 language understanding tasks (GLUE benchmark and six additional
datasets). When evaluating in-domain performance, REPINA consistently
outperforms other baselines on most tasks (10 out of 13). We also demonstrate
its effectiveness in few-shot settings and robustness to label perturbation. As
a by-product, we extend previous studies of representation collapse and propose
several metrics to quantify it. Our empirical findings show that our approach
is significantly more effective at mitigating representation collapse.
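As a rough illustration of the idea described above (not the paper's exact objective), a projection-invariance regularizer can be sketched as an auxiliary loss that penalizes the distance between a projection of the fine-tuned representations and the frozen pre-trained representations. The function name `repina_penalty`, the identity projection default, and the toy data below are illustrative assumptions based only on the abstract's description:

```python
import numpy as np

def repina_penalty(finetuned_reps, pretrained_reps, projection=None):
    """Illustrative projection-invariance penalty (an assumption, not the
    paper's implementation): mean squared distance between a projection of
    the fine-tuned representations and the frozen pre-trained ones."""
    if projection is None:
        projection = lambda h: h  # identity projection as the simplest case
    diff = projection(finetuned_reps) - pretrained_reps
    return float(np.mean(np.sum(diff ** 2, axis=-1)))

# Toy usage: a batch of 4 examples with 8-dimensional representations.
rng = np.random.default_rng(0)
h0 = rng.normal(size=(4, 8))             # pre-trained representations (frozen)
h = h0 + 0.1 * rng.normal(size=(4, 8))   # slightly drifted fine-tuned reps
penalty = repina_penalty(h, h0)
# In training, this would be added to the task loss, e.g.
# total_loss = task_loss + lam * penalty, with lam a tuning hyperparameter.
```

The penalty is zero when representations are unchanged and grows as they drift, which is the sense in which it discourages undesirable changes during fine-tuning.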
Related papers
- A Mutually Reinforced Framework for Pretrained Sentence Embeddings [49.297766436632685]
InfoCSE is a novel framework for learning high-quality sentence embeddings.
It exploits the sentence representation model itself and realizes an iterative self-supervision process.
In other words, the representation learning and data annotation become mutually reinforced, where a strong self-supervision effect can be derived.
arXiv Detail & Related papers (2022-02-28T14:00:16Z)
- Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z)
- Probing as Quantifying the Inductive Bias of Pre-trained Representations [99.93552997506438]
We present a novel framework for probing where the goal is to evaluate the inductive bias of representations for a particular task.
We apply our framework to a series of token-, arc-, and sentence-level tasks.
arXiv Detail & Related papers (2021-10-15T22:01:16Z)
- Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning [57.4036085386653]
We show that prompt-based models for sentence pair classification tasks still suffer from a common pitfall of adopting inference heuristics based on lexical overlap.
We then show that adding a regularization that preserves pre-training weights is effective in mitigating this destructive tendency of few-shot fine-tuning.
arXiv Detail & Related papers (2021-09-09T10:10:29Z)
- Disentangled Contrastive Learning for Learning Robust Textual Representations [13.880693856907037]
We introduce the concept of momentum representation consistency to align features and leverage power normalization while conforming to uniformity.
Our experimental results on NLP benchmarks demonstrate that our approach obtains better results than the baselines.
arXiv Detail & Related papers (2021-04-11T03:32:49Z)
- Better Fine-Tuning by Reducing Representational Collapse [77.44854918334232]
Existing approaches for fine-tuning pre-trained language models have been shown to be unstable.
We present a method rooted in trust-region theory that replaces previously used adversarial objectives with parametric noise.
We show it is less prone to representation collapse: the pre-trained models maintain more generalizable representations every time they are fine-tuned.
arXiv Detail & Related papers (2020-08-06T02:13:16Z)
- Fairness by Learning Orthogonal Disentangled Representations [50.82638766862974]
We propose a novel disentanglement approach to the invariant representation problem.
We enforce the meaningful representation to be agnostic to sensitive information by entropy.
The proposed approach is evaluated on five publicly available datasets.
arXiv Detail & Related papers (2020-03-12T11:09:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.