Retrofitting Light-weight Language Models for Emotions using Supervised Contrastive Learning
- URL: http://arxiv.org/abs/2310.18930v1
- Date: Sun, 29 Oct 2023 07:43:34 GMT
- Title: Retrofitting Light-weight Language Models for Emotions using Supervised Contrastive Learning
- Authors: Sapan Shah, Sreedhar Reddy, Pushpak Bhattacharyya
- Abstract summary: We present a novel method to induce emotion aspects into pre-trained language models (PLMs) such as BERT and RoBERTa.
Our method updates pre-trained network weights using contrastive learning so that the text fragments exhibiting similar emotions are encoded nearby in the representation space.
- Score: 44.17782674872344
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel retrofitting method to induce emotion aspects into
pre-trained language models (PLMs) such as BERT and RoBERTa. Our method updates
pre-trained network weights using contrastive learning so that the text
fragments exhibiting similar emotions are encoded nearby in the representation
space, and the fragments with different emotion content are pushed apart. While
doing so, it also ensures that the linguistic knowledge already present in PLMs
is not inadvertently perturbed. The language models retrofitted by our method,
i.e., BERTEmo and RoBERTaEmo, produce emotion-aware text representations, as
evaluated through different clustering and retrieval metrics. For the
downstream tasks on sentiment analysis and sarcasm detection, they perform
better than their pre-trained counterparts (about 1% improvement in F1-score)
and other existing approaches. Additionally, a more significant boost in
performance is observed for the retrofitted models over pre-trained ones in
the few-shot learning setting.
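The abstract describes the training signal but not the implementation. As a rough illustration, a supervised contrastive objective of this kind is often realised as a SupCon-style loss computed over a batch of encoded text fragments, where fragments sharing an emotion label act as positives. The sketch below is a minimal PyTorch illustration under that assumption, not the authors' code; the function name, the temperature value, and the use of pooled sentence embeddings are assumptions.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.1) -> torch.Tensor:
    """SupCon-style loss: fragments with the same emotion label are pulled
    together in representation space, fragments with different labels are
    pushed apart.

    embeddings: (N, d) pooled sentence representations (e.g. BERT/RoBERTa [CLS]).
    labels:     (N,)   integer emotion labels (assumed batch construction).
    """
    z = F.normalize(embeddings, dim=1)               # unit-norm vectors
    sim = z @ z.T / temperature                      # (N, N) scaled cosine similarities

    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))  # exclude self-pairs

    # Positives: other samples in the batch with the same emotion label.
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask

    # Log-probability of each pair under a softmax over all other samples,
    # averaged over each anchor's positives.
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1)
    loss = -(log_prob * pos_mask).sum(dim=1) / pos_counts.clamp(min=1)

    # Only anchors that have at least one positive contribute to the loss.
    return loss[pos_counts > 0].mean()
```

In a retrofitting setup like the one described, such a contrastive term would typically be combined with some mechanism (for example, a regularisation or multi-task term) that keeps the updated weights close to the pre-trained ones, matching the abstract's requirement that the existing linguistic knowledge in the PLM is not perturbed; the exact mechanism used by the authors is not specified in this summary.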
Related papers
- Adaptation Odyssey in LLMs: Why Does Additional Pretraining Sometimes Fail to Improve? [19.34040322172224]
We show that training a model on a text domain could degrade its perplexity on the test portion of the same domain.
Our findings will guide us in determining when to adapt a model versus when to rely on its foundational capabilities.
arXiv Detail & Related papers (2024-10-08T00:37:16Z)
- Lexicon-Level Contrastive Visual-Grounding Improves Language Modeling [47.7950860342515]
LexiContrastive Grounding (LCG) is a grounded language learning procedure that leverages visual supervision to improve textual representations.
LCG outperforms standard language-only models in learning efficiency.
It improves upon vision-and-language learning procedures including CLIP, GIT, Flamingo, and Vokenization.
arXiv Detail & Related papers (2024-03-21T16:52:01Z)
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually grounded text perturbation methods such as typos and word-order shuffling, which resonate with human cognitive patterns and allow the perturbations to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z)
- Fuzzy Fingerprinting Transformer Language-Models for Emotion Recognition in Conversations [0.7874708385247353]
We propose to combine the two approaches to perform Emotion Recognition in Conversations (ERC).
We feed utterances and their previous conversational turns to a pre-trained RoBERTa, obtaining contextualized utterance representations.
We validate our approach on the widely used DailyDialog ERC benchmark dataset.
arXiv Detail & Related papers (2023-09-08T12:26:01Z)
- Improving Policy Learning via Language Dynamics Distillation [87.27583619910338]
We propose Language Dynamics Distillation (LDD), which pretrains a model to predict environment dynamics given demonstrations with language descriptions.
We show that language descriptions in demonstrations improve sample-efficiency and generalization across environments.
arXiv Detail & Related papers (2022-09-30T19:56:04Z)
- Vision-Language Pre-Training for Boosting Scene Text Detectors [57.08046351495244]
We specifically adapt vision-language joint learning for scene text detection.
We propose to learn contextualized, joint representations through vision-language pre-training.
The pre-trained model is able to produce more informative representations with richer semantics.
arXiv Detail & Related papers (2022-04-29T03:53:54Z)
- Leveraging Pre-trained Language Model for Speech Sentiment Analysis [58.78839114092951]
We explore the use of pre-trained language models to learn sentiment information of written texts for speech sentiment analysis.
We propose a pseudo-label-based semi-supervised training strategy that uses a language model within an end-to-end speech sentiment analysis approach.
arXiv Detail & Related papers (2021-06-11T20:15:21Z)
- Linguistically-Informed Transformations (LIT): A Method for Automatically Generating Contrast Sets [13.706520309917634]
We propose a Linguistically-Informed Transformation (LIT) method to automatically generate contrast sets.
Experiments show that current pretrained language models struggle on our automatically generated contrast sets.
We improve models' performance on the contrast sets by applying LIT to augment the training data, without affecting performance on the original data.
arXiv Detail & Related papers (2020-10-16T18:23:05Z)
- InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training [135.12061144759517]
We present an information-theoretic framework that formulates cross-lingual language model pre-training.
We propose a new pre-training task based on contrastive learning.
By leveraging both monolingual and parallel corpora, we jointly train on the pretext tasks to improve the cross-lingual transferability of pre-trained models.
arXiv Detail & Related papers (2020-07-15T16:58:01Z)