SenTest: Evaluating Robustness of Sentence Encoders
- URL: http://arxiv.org/abs/2311.17722v1
- Date: Wed, 29 Nov 2023 15:21:35 GMT
- Title: SenTest: Evaluating Robustness of Sentence Encoders
- Authors: Tanmay Chavan, Shantanu Patankar, Aditya Kane, Omkar Gokhale,
Geetanjali Kale, Raviraj Joshi
- Abstract summary: This work focuses on evaluating the robustness of the sentence encoders.
We employ several adversarial attacks to evaluate its robustness.
The results of the experiments strongly undermine the robustness of sentence encoders.
- Score: 0.4194295877935868
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive learning has proven to be an effective method for pre-training
models using weakly labeled data in the vision domain. Sentence transformers
are the NLP counterparts to this architecture, and have been growing in
popularity due to their rich and effective sentence representations. Having
effective sentence representations is paramount in multiple tasks, such as
information retrieval, retrieval augmented generation (RAG), and sentence
comparison. Keeping in mind the deployability factor of transformers,
evaluating the robustness of sentence transformers is of utmost importance.
This work focuses on evaluating the robustness of the sentence encoders. We
employ several adversarial attacks to evaluate its robustness. This system uses
character-level attacks in the form of random character substitution,
word-level attacks in the form of synonym replacement, and sentence-level
attacks in the form of intra-sentence word order shuffling. The results of the
experiments strongly undermine the robustness of sentence encoders. The models
produce significantly different predictions as well as embeddings on perturbed
datasets. The accuracy of the models can fall up to 15 percent on perturbed
datasets as compared to unperturbed datasets. Furthermore, the experiments
demonstrate that these embeddings does capture the semantic and syntactic
structure (sentence order) of sentences. However, existing supervised
classification strategies fail to leverage this information, and merely
function as n-gram detectors.
Related papers
- Improving the Robustness of Summarization Systems with Dual Augmentation [68.53139002203118]
A robust summarization system should be able to capture the gist of the document, regardless of the specific word choices or noise in the input.
We first explore the summarization models' robustness against perturbations including word-level synonym substitution and noise.
We propose a SummAttacker, which is an efficient approach to generating adversarial samples based on language models.
arXiv Detail & Related papers (2023-06-01T19:04:17Z) - Generate, Discriminate and Contrast: A Semi-Supervised Sentence
Representation Learning Framework [68.04940365847543]
We propose a semi-supervised sentence embedding framework, GenSE, that effectively leverages large-scale unlabeled data.
Our method include three parts: 1) Generate: A generator/discriminator model is jointly trained to synthesize sentence pairs from open-domain unlabeled corpus; 2) Discriminate: Noisy sentence pairs are filtered out by the discriminator to acquire high-quality positive and negative sentence pairs; 3) Contrast: A prompt-based contrastive approach is presented for sentence representation learning with both annotated and synthesized data.
arXiv Detail & Related papers (2022-10-30T10:15:21Z) - Non-Linguistic Supervision for Contrastive Learning of Sentence
Embeddings [14.244787327283335]
We find the performance of Transformer models as sentence encoders can be improved by training with multi-modal multi-task losses.
The reliance of our framework on unpaired non-linguistic data makes it language-agnostic, enabling it to be widely applicable beyond English NLP.
arXiv Detail & Related papers (2022-09-20T03:01:45Z) - A Differentiable Language Model Adversarial Attack on Text Classifiers [10.658675415759697]
We propose a new black-box sentence-level attack for natural language processing.
Our method fine-tunes a pre-trained language model to generate adversarial examples.
We show that the proposed attack outperforms competitors on a diverse set of NLP problems for both computed metrics and human evaluation.
arXiv Detail & Related papers (2021-07-23T14:43:13Z) - Experiments with adversarial attacks on text genres [0.0]
Neural models based on pre-trained transformers, such as BERT or XLM-RoBERTa, demonstrate SOTA results in many NLP tasks.
We show that embedding-based algorithms which can replace some of the most significant'' words with words similar to them, have the ability to influence model predictions in a significant proportion of cases.
arXiv Detail & Related papers (2021-07-05T19:37:59Z) - Double Perturbation: On the Robustness of Robustness and Counterfactual
Bias Evaluation [109.06060143938052]
We propose a "double perturbation" framework to uncover model weaknesses beyond the test dataset.
We apply this framework to study two perturbation-based approaches that are used to analyze models' robustness and counterfactual bias in English.
arXiv Detail & Related papers (2021-04-12T06:57:36Z) - Improving Robustness by Augmenting Training Sentences with
Predicate-Argument Structures [62.562760228942054]
Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective.
We propose to augment the input sentences in the training data with their corresponding predicate-argument structures.
We show that without targeting a specific bias, our sentence augmentation improves the robustness of transformer models against multiple biases.
arXiv Detail & Related papers (2020-10-23T16:22:05Z) - Unsupervised Extractive Summarization by Pre-training Hierarchical
Transformers [107.12125265675483]
Unsupervised extractive document summarization aims to select important sentences from a document without using labeled summaries during training.
Existing methods are mostly graph-based with sentences as nodes and edge weights measured by sentence similarities.
We find that transformer attentions can be used to rank sentences for unsupervised extractive summarization.
arXiv Detail & Related papers (2020-10-16T08:44:09Z) - Semi-Supervised Models via Data Augmentationfor Classifying Interactive
Affective Responses [85.04362095899656]
We present semi-supervised models with data augmentation (SMDA), a semi-supervised text classification system to classify interactive affective responses.
For labeled sentences, we performed data augmentations to uniform the label distributions and computed supervised loss during training process.
For unlabeled sentences, we explored self-training by regarding low-entropy predictions over unlabeled sentences as pseudo labels.
arXiv Detail & Related papers (2020-04-23T05:02:31Z) - Stacked DeBERT: All Attention in Incomplete Data for Text Classification [8.900866276512364]
We propose Stacked DeBERT, short for Stacked Denoising Bidirectional Representations from Transformers.
Our model shows improved F1-scores and better robustness in informal/incorrect texts present in tweets and in texts with Speech-to-Text error in sentiment and intent classification tasks.
arXiv Detail & Related papers (2020-01-01T04:49:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.