Disentangled Contrastive Learning for Learning Robust Textual
Representations
- URL: http://arxiv.org/abs/2104.04907v1
- Date: Sun, 11 Apr 2021 03:32:49 GMT
- Title: Disentangled Contrastive Learning for Learning Robust Textual
Representations
- Authors: Xiang Chen, Xin Xie, Zhen Bi, Hongbin Ye, Shumin Deng, Ningyu Zhang,
Huajun Chen
- Abstract summary: We introduce the concept of momentum representation consistency to align features and leverage power normalization while conforming to uniformity.
Our experimental results on NLP benchmarks demonstrate that our approach obtains better results than the baselines.
- Score: 13.880693856907037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although the self-supervised pre-training of transformer models has
revolutionized natural language processing (NLP) applications and achieved
state-of-the-art results on various benchmarks, this process is still
vulnerable to small and imperceptible perturbations of legitimate inputs.
Intuitively, representations should remain close in the feature space under
subtle input perturbations, and should vary substantially when the meaning
differs. This motivates us to investigate the learning of robust textual
representations in a contrastive manner. However, it is non-trivial to obtain
semantically opposing instances for textual samples. In this study, we propose
a disentangled contrastive learning method that separately optimizes the
uniformity and alignment of representations without negative sampling.
Specifically, we introduce the concept of momentum representation consistency
to align features and leverage power normalization while conforming to
uniformity. Our experimental results on NLP benchmarks demonstrate that our
approach obtains better results than the baselines and achieves promising
improvements on invariance tests and under adversarial attacks. The code is
available at https://github.com/zjunlp/DCL.
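The abstract decouples the contrastive objective into an alignment term, enforced through momentum representation consistency, and a uniformity term based on power normalization. The following is a minimal PyTorch-style sketch of that decomposition under stated assumptions: the linear encoder is a placeholder for a transformer, the hyperparameters are arbitrary, and the log-mean-exp uniformity penalty is a generic stand-in for the paper's power normalization. It is not the authors' implementation; see https://github.com/zjunlp/DCL for that.

```python
# Illustrative sketch only: alignment via a momentum (EMA) encoder plus a
# generic uniformity penalty; no negative sampling is involved.
import copy
import torch
import torch.nn.functional as F

def momentum_update(online_enc, momentum_enc, m=0.999):
    """EMA update of the momentum (target) encoder from the online encoder."""
    with torch.no_grad():
        for p_o, p_m in zip(online_enc.parameters(), momentum_enc.parameters()):
            p_m.data.mul_(m).add_(p_o.data, alpha=1.0 - m)

def alignment_loss(z_online, z_momentum):
    """Pull the online representation toward the momentum representation
    of the same input."""
    z_online = F.normalize(z_online, dim=-1)
    z_momentum = F.normalize(z_momentum.detach(), dim=-1)
    return (z_online - z_momentum).pow(2).sum(dim=-1).mean()

def uniformity_loss(z, t=2.0):
    """Encourage representations to spread out on the unit hypersphere
    (log-mean-exp of pairwise squared distances)."""
    z = F.normalize(z, dim=-1)
    return torch.pdist(z, p=2).pow(2).mul(-t).exp().mean().log()

# Toy usage with a placeholder encoder standing in for a transformer.
encoder = torch.nn.Linear(768, 256)
target_encoder = copy.deepcopy(encoder)   # momentum copy, updated only by EMA
feats = torch.randn(8, 768)               # pooled sentence features (batch of 8)
z = encoder(feats)
with torch.no_grad():
    z_t = target_encoder(feats)
loss = alignment_loss(z, z_t) + uniformity_loss(z)
loss.backward()
momentum_update(encoder, target_encoder)
```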
Related papers
- ParaICL: Towards Robust Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing.
Few-shot in-context learning (ICL) depends heavily on the choice of demonstration examples.
We propose a novel method named parallel in-context learning (ParaICL).
arXiv Detail & Related papers (2024-03-31T05:56:15Z)
- Addressing Order Sensitivity of In-Context Demonstration Examples in Causal Language Models [18.03259038587496]
In-context learning can be significantly influenced by the order of in-context demonstration examples.
We introduce an unsupervised fine-tuning method, termed the Information-Augmented and Consistency-Enhanced approach.
Our proposed method can reduce the sensitivity of CausalLMs to the order of in-context examples and exhibit robust generalizability.
arXiv Detail & Related papers (2024-02-23T22:39:12Z)
- DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning [59.4644086610381]
We propose a novel denoising objective that approaches sentence representation learning from a different angle, namely the intra-sentence perspective.
By introducing both discrete and continuous noise, we generate noisy sentences and then train our model to restore them to their original form.
Our empirical evaluations demonstrate that this approach delivers competitive results on both semantic textual similarity (STS) and a wide range of transfer tasks.
arXiv Detail & Related papers (2024-01-24T17:48:45Z)
- Regularizing with Pseudo-Negatives for Continual Self-Supervised Learning [62.40718385934608]
We introduce a novel Pseudo-Negative Regularization (PNR) framework for effective continual self-supervised learning (CSSL).
Our PNR leverages pseudo-negatives obtained through model-based augmentation so that newly learned representations do not contradict what has been learned in the past.
arXiv Detail & Related papers (2023-06-08T10:59:35Z)
- Alleviating Over-smoothing for Unsupervised Sentence Representation [96.19497378628594]
We present a simple method named Self-Contrastive Learning (SSCL) to alleviate over-smoothing.
Our proposed method is quite simple and can be easily extended to various state-of-the-art models to boost their performance.
arXiv Detail & Related papers (2023-05-09T11:00:02Z)
- Semantic-aware Contrastive Learning for More Accurate Semantic Parsing [32.74456368167872]
We propose a semantic-aware contrastive learning algorithm, which can learn to distinguish fine-grained meaning representations.
Experiments on two standard datasets show that our approach achieves significant improvements over MLE baselines.
arXiv Detail & Related papers (2023-01-19T07:04:32Z)
- Beyond Instance Discrimination: Relation-aware Contrastive Self-supervised Learning [75.46664770669949]
We present relation-aware contrastive self-supervised learning (ReCo) to integrate instance relations.
Our ReCo consistently achieves remarkable performance improvements.
arXiv Detail & Related papers (2022-11-02T03:25:28Z)
- Rethinking Prototypical Contrastive Learning through Alignment, Uniformity and Correlation [24.794022951873156]
We propose to learn prototypical representations through Alignment, Uniformity and Correlation (PAUC).
Specifically, the ordinary ProtoNCE loss is revised with: (1) an alignment loss that pulls together embeddings from positive prototypes; (2) a uniformity loss that distributes the prototype-level features uniformly; (3) a correlation loss that increases the diversity and discriminability of prototype-level features.
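For intuition, here is a toy sketch of such a three-part prototypical objective. It is not the PAUC formulation from the cited paper: the function names and hyperparameters are hypothetical, and the correlation term is only one plausible reading (penalizing redundancy across prototype dimensions).

```python
# Toy three-part prototypical objective: alignment + uniformity + correlation.
import torch
import torch.nn.functional as F

def alignment(z, prototypes, assignments):
    """Pull each embedding toward its assigned (positive) prototype, so that
    embeddings sharing a prototype end up close together."""
    pos = prototypes[assignments]                      # (N, D)
    return (F.normalize(z, dim=-1) - F.normalize(pos, dim=-1)).pow(2).sum(-1).mean()

def uniformity(prototypes, t=2.0):
    """Spread prototypes over the unit hypersphere (log-mean-exp form)."""
    p = F.normalize(prototypes, dim=-1)
    return torch.pdist(p, p=2).pow(2).mul(-t).exp().mean().log()

def correlation(prototypes):
    """Discourage redundant prototype dimensions via off-diagonal correlations."""
    p = (prototypes - prototypes.mean(0)) / (prototypes.std(0) + 1e-6)
    c = (p.T @ p) / p.shape[0]                         # (D, D) correlation matrix
    off_diag = c - torch.diag(torch.diag(c))
    return off_diag.pow(2).sum()

z = torch.randn(32, 128)                               # batch embeddings
prototypes = torch.randn(10, 128, requires_grad=True)  # learnable prototypes
assignments = torch.randint(0, 10, (32,))              # prototype index per sample
loss = alignment(z, prototypes, assignments) + uniformity(prototypes) \
       + 0.01 * correlation(prototypes)
```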
arXiv Detail & Related papers (2022-10-18T22:33:12Z)
- Robust Textual Embedding against Word-level Adversarial Attacks [15.235449552083043]
We propose a novel robust training method, termed Fast Triplet Metric Learning (FTML).
We show that FTML can significantly promote model robustness against various advanced adversarial attacks.
Our work shows the great potential of improving textual robustness through robust word embeddings.
arXiv Detail & Related papers (2022-02-28T14:25:00Z)
- Improving Transformation Invariance in Contrastive Representation Learning [31.223892428863238]
First, we introduce a training objective for contrastive learning that uses a novel regularizer to control how the representation changes under transformation.
Second, we propose a change to how test-time representations are generated: a feature averaging approach that combines encodings from multiple transformations of the original input.
Third, we introduce the novel Spirograph dataset to explore our ideas in the context of a differentiable generative process with multiple downstream tasks.
arXiv Detail & Related papers (2020-10-19T13:49:29Z)
- A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation [53.8171136907856]
We introduce a set of simple yet effective data augmentation strategies dubbed cutoff (a toy sketch of the token-level variant follows this entry).
Cutoff relies on sampling consistency and thus adds little computational overhead.
Cutoff consistently outperforms adversarial training and achieves state-of-the-art results on the IWSLT2014 German-English dataset.
arXiv Detail & Related papers (2020-09-29T07:08:35Z)
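Following up on the cutoff entry above: a hedged sketch of the token-level variant, in which an augmented view is created by zeroing out the embeddings of a randomly sampled subset of tokens. The function name and cutoff ratio are illustrative assumptions, not the paper's exact implementation; span- and feature-level variants erase contiguous spans or embedding dimensions analogously.

```python
# Toy sketch of token-level cutoff for data augmentation.
import torch

def token_cutoff(token_embeddings: torch.Tensor, cutoff_ratio: float = 0.1) -> torch.Tensor:
    """token_embeddings: (batch, seq_len, dim); returns an augmented copy with
    a random subset of token embeddings zeroed out."""
    batch, seq_len, _ = token_embeddings.shape
    keep = torch.rand(batch, seq_len, device=token_embeddings.device) >= cutoff_ratio
    return token_embeddings * keep.unsqueeze(-1).to(token_embeddings.dtype)

# Two stochastic views of the same batch, e.g. for a consistency objective.
x = torch.randn(4, 16, 768)
view_a, view_b = token_cutoff(x), token_cutoff(x)
```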
This list is automatically generated from the titles and abstracts of the papers on this site.