On Adversarial Examples for Biomedical NLP Tasks
- URL: http://arxiv.org/abs/2004.11157v1
- Date: Thu, 23 Apr 2020 13:46:11 GMT
- Title: On Adversarial Examples for Biomedical NLP Tasks
- Authors: Vladimir Araujo, Andres Carvallo, Carlos Aspillaga and Denis Parra
- Abstract summary: We propose an adversarial evaluation scheme on two well-known datasets for medical NER and STS.
We show that we can significantly improve the robustness of the models by training them with adversarial examples.
- Score: 4.7677261488999205
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of pre-trained word embeddings has motivated their use in tasks in the biomedical domain. The BERT language model has shown remarkable results on standard performance metrics in tasks such as Named Entity Recognition (NER) and Semantic Textual Similarity (STS), which has brought significant progress in the field of NLP. However, it is unclear whether these systems perform equally well in critical domains, such as legal or medical. For that reason, in this work, we propose an adversarial evaluation scheme on two well-known datasets for medical NER and STS. We propose two types of attacks inspired by natural spelling errors and typos made by humans. We also propose another type of attack that uses synonyms of medical terms. Under these adversarial settings, the accuracy of the models drops significantly, and we quantify the extent of this performance loss. We also show that we can significantly improve the robustness of the models by training them with adversarial examples. We hope our work will motivate the use of adversarial examples to evaluate and develop models with increased robustness for medical tasks.
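As a concrete illustration of the two attack families the abstract describes, here is a minimal Python sketch: an adjacent-character swap standing in for natural spelling errors and typos, and a dictionary-based synonym substitution for medical terms. The function names, the perturbation rate, and the toy synonym table are illustrative assumptions, not the authors' implementation.

```python
import random

# Toy synonym table for illustration only; the paper draws synonyms from
# medical terminology resources, which are not reproduced here.
MEDICAL_SYNONYMS = {
    "hypertension": "high blood pressure",
    "cephalalgia": "headache",
    "myocardial infarction": "heart attack",
}

def typo_attack(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Character-level attack: swap adjacent letters to mimic natural
    spelling errors and typos made by humans."""
    rng = random.Random(seed)
    chars = list(text)
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def synonym_attack(text: str) -> str:
    """Lexical attack: replace medical terms with synonyms, preserving
    meaning while changing the surface form the model sees."""
    for term, synonym in MEDICAL_SYNONYMS.items():
        text = text.replace(term, synonym)
    return text

sentence = "Patient presents with hypertension and cephalalgia."
print(typo_attack(sentence))    # e.g. "Patient prseents with hypertensoin ..."
print(synonym_attack(sentence)) # "... with high blood pressure and headache."
```

In an evaluation loop, each test sentence would be perturbed this way before being fed to the NER or STS model, and the resulting metric drop quantifies the robustness gap.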
Related papers
- On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models [59.45628259925441]
Volumetric medical segmentation models have achieved significant success on organ and tumor-based segmentation tasks.
However, their vulnerability to adversarial attacks remains largely unexplored.
This underscores the importance of investigating the robustness of existing models.
arXiv Detail & Related papers (2024-06-12T17:59:42Z)
- Perturbation-Invariant Adversarial Training for Neural Ranking Models: Improving the Effectiveness-Robustness Trade-Off [107.35833747750446]
Adversarial examples can be crafted by adding imperceptible perturbations to legitimate documents.
This vulnerability raises significant concerns about the reliability of neural ranking models (NRMs) and hinders their widespread deployment.
In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs.
arXiv Detail & Related papers (2023-12-16T05:38:39Z)
- SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Augmentation [56.622250514119294]
In contrast to white-box adversarial attacks, transfer attacks are more reflective of real-world scenarios.
We propose a self-augment-based transfer attack method, termed SA-Attack.
arXiv Detail & Related papers (2023-12-08T09:08:50Z)
- How far is Language Model from 100% Few-shot Named Entity Recognition in Medical Domain [14.635536657783613]
This paper compares the performance of LMs on medical few-shot NER and asks how far LMs are from 100% few-shot NER in the medical domain.
Our findings clearly indicate that LLMs outperform SLMs in few-shot medical NER tasks, given the presence of suitable examples and appropriate logical frameworks.
We introduce a simple and effective method called RT (Retrieving and Thinking), which acts as a retriever, finding relevant examples, and as a thinker, following a step-by-step reasoning process; a prompt-assembly sketch follows.
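The sketch below shows how such a retrieve-then-reason few-shot prompt might be assembled. The word-overlap retriever and the prompt wording are illustrative assumptions; the paper's actual retrieval and reasoning components are not reproduced here.

```python
# Hedged sketch of a retrieve-then-reason prompt in the spirit of RT.

def retrieve(query, pool, k=2):
    """Rank labeled examples by word overlap with the query; keep top-k."""
    q = set(query.lower().split())
    ranked = sorted(pool, key=lambda ex: -len(q & set(ex[0].lower().split())))
    return ranked[:k]

def build_prompt(query, pool):
    """Assemble a few-shot prompt: retrieved examples, then the query."""
    lines = ["Extract the medical entities. Think step by step."]
    for text, entities in retrieve(query, pool):
        lines.append(f"Sentence: {text}\nEntities: {entities}")
    lines.append(f"Sentence: {query}\nEntities:")
    return "\n\n".join(lines)

pool = [
    ("Aspirin relieved the patient's chest pain.",
     "Aspirin (Drug); chest pain (Symptom)"),
    ("MRI revealed a lesion in the left lobe.",
     "MRI (Procedure); lesion (Finding)"),
]
print(build_prompt("Ibuprofen reduced the fever within hours.", pool))
```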
arXiv Detail & Related papers (2023-07-01T01:18:09Z)
- Detecting Adversarial Examples in Batches -- a geometrical approach [0.0]
We introduce two geometric metrics, density and coverage, and evaluate their use in detecting adversarial samples in batches of unseen data.
We empirically study these metrics using MNIST and two real-world biomedical datasets from MedMNIST, subjected to two different adversarial attacks.
Our experiments show promising results for both metrics in detecting adversarial examples; a sketch of the metrics follows.
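Density and coverage are standard manifold-based metrics (Naeem et al., 2020); below is a NumPy sketch under the usual k-nearest-neighbor definitions. Computing them on raw vectors is a simplification for illustration; in practice they would typically be applied in a model's embedding space.

```python
import numpy as np

def knn_radii(real: np.ndarray, k: int) -> np.ndarray:
    """Distance from each real point to its k-th nearest real neighbor."""
    d = np.linalg.norm(real[:, None, :] - real[None, :, :], axis=-1)
    return np.sort(d, axis=1)[:, k]  # column 0 is the point itself

def density_coverage(real: np.ndarray, batch: np.ndarray, k: int = 5):
    """Density and coverage of `batch` w.r.t. the real-data manifold.
    Low values flag a possibly adversarial batch."""
    r = knn_radii(real, k)                                             # (N,)
    d = np.linalg.norm(batch[:, None, :] - real[None, :, :], axis=-1)  # (M, N)
    inside = d <= r[None, :]          # is batch point j inside x_i's k-NN ball?
    density = inside.sum() / (k * len(batch))
    coverage = inside.any(axis=0).mean()
    return density, coverage

rng = np.random.default_rng(0)
real = rng.normal(size=(200, 8))
clean = rng.normal(size=(50, 8))
shifted = clean + 3.0                 # stand-in for adversarially perturbed inputs
print(density_coverage(real, clean))    # roughly (1.0, high coverage)
print(density_coverage(real, shifted))  # near (0.0, 0.0)
```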
arXiv Detail & Related papers (2022-06-17T12:52:43Z)
- Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models [86.02610674750345]
Adversarial GLUE (AdvGLUE) is a new multi-task benchmark to explore and evaluate the vulnerabilities of modern large-scale language models under various types of adversarial attacks.
We apply 14 adversarial attack methods to GLUE tasks to construct AdvGLUE, which is further validated by humans for reliable annotations.
All the language models and robust training methods we tested perform poorly on AdvGLUE, with scores lagging far behind the benign accuracy.
arXiv Detail & Related papers (2021-11-04T12:59:55Z)
- Stress Test Evaluation of Biomedical Word Embeddings [3.8376078864105425]
We systematically evaluate three language models with adversarial examples.
We show that adversarial training improves the robustness of the models and, in some cases, even exceeds the original performance; a data-augmentation sketch follows.
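A common way to realize this is adversarial data augmentation: fine-tune on clean examples plus label-preserving perturbed copies. The perturbation function and the one-copy-per-example recipe below are assumptions for illustration, not any single paper's exact setup.

```python
import random

def perturb(text: str, rng: random.Random) -> str:
    """Toy adjacent-character swap standing in for a spelling attack."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def adversarial_augment(dataset, seed: int = 0):
    """Return the clean examples plus one perturbed copy of each,
    keeping the original label (the attack is label-preserving)."""
    rng = random.Random(seed)
    return dataset + [(perturb(text, rng), label) for text, label in dataset]

train = [("Severe headache reported.", "PRESENT"),
         ("Patient denies chest pain.", "ABSENT")]
for text, label in adversarial_augment(train):
    print(label, "|", text)
```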
arXiv Detail & Related papers (2021-07-24T16:45:03Z)
- BBAEG: Towards BERT-based Biomedical Adversarial Example Generation for Text Classification [1.14219428942199]
We propose BBAEG (Biomedical BERT-based Adversarial Example Generation), a black-box attack algorithm for biomedical text classification.
We demonstrate that BBAEG performs stronger attacks with better language fluency and semantic coherence compared to prior work.
arXiv Detail & Related papers (2021-04-05T05:32:56Z)
- NUVA: A Naming Utterance Verifier for Aphasia Treatment [49.114436579008476]
Assessment of speech performance using picture naming tasks is a key method for both diagnosis and monitoring of responses to treatment interventions by people with aphasia (PWA).
Here we present NUVA, an utterance verification system incorporating a deep learning element that classifies 'correct' versus 'incorrect' naming attempts from aphasic stroke patients.
When tested on eight native British-English speaking PWA, the system's accuracy ranged from 83.6% to 93.6%, with a 10-fold cross-validation mean of 89.5%.
arXiv Detail & Related papers (2021-02-10T13:00:29Z)
- Domain specific BERT representation for Named Entity Recognition of lab protocol [0.0]
The BERT family of models works exceptionally well on downstream tasks, from NER tagging to a range of other linguistic tasks.
In this paper, we present a system for named entity tagging based on Bio-BERT; a minimal inference sketch follows.
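Below is a minimal tagging sketch using the Hugging Face transformers pipeline. The checkpoint name is an assumption: the base dmis-lab BioBERT weights would first need fine-tuning for token classification on the target entity types, so a NER-fine-tuned checkpoint should be swapped in.

```python
from transformers import pipeline

# Hedged sketch: the checkpoint below is the base BioBERT model, used here
# only as a placeholder; real tagging needs a NER-fine-tuned checkpoint.
ner = pipeline(
    "token-classification",
    model="dmis-lab/biobert-base-cased-v1.1",
    aggregation_strategy="simple",  # merge word pieces into full entity spans
)

for entity in ner("Incubate the lysate with proteinase K at 56 C for 1 hour."):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```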
arXiv Detail & Related papers (2020-12-21T06:54:38Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem to the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits; a minimal classification sketch follows.
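Below is a minimal sequence-classification sketch in the same spirit, assuming a generic multilingual BERT checkpoint and a placeholder label count; the authors' Russian EHR data and model modification are not reproduced.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder checkpoint and label count, not the authors' model.
name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=5)

inputs = tokenizer("Complaints of cough and fever for three days.",
                   truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted diagnosis class id
```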
arXiv Detail & Related papers (2020-07-15T09:22:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.