Improving the robustness and accuracy of biomedical language models
through adversarial training
- URL: http://arxiv.org/abs/2111.08529v1
- Date: Tue, 16 Nov 2021 14:58:05 GMT
- Title: Improving the robustness and accuracy of biomedical language models
through adversarial training
- Authors: Milad Moradi, Matthias Samwald
- Abstract summary: Deep transformer neural network models have improved the predictive accuracy of intelligent text processing systems in the biomedical domain.
Neural NLP models can be easily fooled by adversarial samples, i.e. minor changes to input that preserve the meaning and understandability of the text but force the NLP system to make erroneous decisions.
This raises serious concerns about the security and trustworthiness of biomedical NLP systems.
- Score: 7.064032374579076
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep transformer neural network models have improved the predictive accuracy
of intelligent text processing systems in the biomedical domain. They have
obtained state-of-the-art performance scores on a wide variety of biomedical
and clinical Natural Language Processing (NLP) benchmarks. However, the
robustness and reliability of these models have been less explored so far.
Neural NLP models can be easily fooled by adversarial samples, i.e. minor
changes to input that preserve the meaning and understandability of the text
but force the NLP system to make erroneous decisions. This raises serious
concerns about the security and trustworthiness of biomedical NLP systems,
especially when they are intended to be deployed in real-world use cases. We
investigated the robustness of several transformer neural language models, i.e.
BioBERT, SciBERT, BioMed-RoBERTa, and Bio-ClinicalBERT, on a wide range of
biomedical and clinical text processing tasks. We implemented various
adversarial attack methods to test the NLP systems in different attack
scenarios. Experimental results showed that the biomedical NLP models are
sensitive to adversarial samples; their performance dropped on average by 21
and 18.9 absolute percentage points under character-level and word-level adversarial noise,
respectively. Conducting extensive adversarial training experiments, we
fine-tuned the NLP models on a mixture of clean samples and adversarial inputs.
Results showed that adversarial training is an effective defense mechanism
against adversarial noise; the models' robustness improved on average by 11.3
absolute percentage points. In addition, the models' performance on clean data
increased on average by 2.4 absolute percentage points, demonstrating that
adversarial training can boost the generalization abilities of biomedical NLP systems.
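To make the attack and defense setup concrete, below is a minimal Python sketch of character-level and word-level perturbations, and of mixing clean and perturbed samples into a fine-tuning set. The function names, edit operations, and noise rates are illustrative assumptions, not the authors' implementation.

```python
import random


def char_level_noise(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Inject character-level noise (neighbour swaps, deletions, substitutions).

    A generic stand-in for the character-level adversarial noise the paper
    evaluates; the edit operations and rate are illustrative choices.
    """
    rng = random.Random(seed)
    chars = list(text)
    n_edits = max(1, int(len(chars) * rate))
    for _ in range(n_edits):
        if not chars:
            break
        i = rng.randrange(len(chars))
        op = rng.choice(["swap", "delete", "substitute"])
        if op == "swap" and i + 1 < len(chars):
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
        elif op == "delete":
            chars.pop(i)
        else:
            chars[i] = rng.choice("abcdefghijklmnopqrstuvwxyz")
    return "".join(chars)


def word_level_noise(text: str, synonyms: dict, rate: float = 0.15, seed: int = 0) -> str:
    """Replace words with synonyms from a caller-supplied dictionary,
    approximating word-level adversarial substitution."""
    rng = random.Random(seed)
    words = text.split()
    for i, word in enumerate(words):
        if word.lower() in synonyms and rng.random() < rate:
            words[i] = rng.choice(synonyms[word.lower()])
    return " ".join(words)


def build_adversarial_training_set(samples):
    """Pair each clean (text, label) example with a perturbed copy, so that
    fine-tuning sees a mixture of clean samples and adversarial inputs."""
    augmented = []
    for text, label in samples:
        augmented.append((text, label))
        augmented.append((char_level_noise(text), label))
    return augmented


# Example: a tiny clean set and its clean + adversarial mixture.
clean = [("patient denies chest pain", 0), ("severe dyspnea on exertion", 1)]
train_set = build_adversarial_training_set(clean)
```

A model such as BioBERT or Bio-ClinicalBERT could then be fine-tuned on `train_set` with any standard sequence-classification pipeline; the defense reported above amounts to exposing the model to both the clean and the perturbed copy of each example during fine-tuning.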
Related papers
- BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers [48.21255861863282]
BMRetriever is a series of dense retrievers for enhancing biomedical retrieval.
BMRetriever exhibits strong parameter efficiency, with the 410M variant outperforming baselines up to 11.7 times larger.
arXiv Detail & Related papers (2024-04-29T05:40:08Z) - Physical formula enhanced multi-task learning for pharmacokinetics prediction [54.13787789006417]
A major challenge for AI-driven drug discovery is the scarcity of high-quality data.
We develop a physical formula enhanced multi-task learning (PEMAL) method that predicts four key pharmacokinetic parameters simultaneously.
Our experiments reveal that PEMAL significantly lowers the data demand, compared to typical Graph Neural Networks.
arXiv Detail & Related papers (2024-04-16T07:42:55Z) - DKE-Research at SemEval-2024 Task 2: Incorporating Data Augmentation with Generative Models and Biomedical Knowledge to Enhance Inference Robustness [27.14794371879541]
This paper presents a novel data augmentation technique to improve model robustness for biomedical natural language inference in clinical trials.
By generating synthetic examples through semantic perturbations and domain-specific vocabulary replacement, we introduce greater diversity and reduce shortcut learning.
Our approach, combined with multi-task learning and the DeBERTa architecture, achieved significant performance gains on the NLI4CT 2024 benchmark.
arXiv Detail & Related papers (2024-04-14T10:02:47Z) - Improving Biomedical Entity Linking with Retrieval-enhanced Learning [53.24726622142558]
$k$NN-BioEL provides a BioEL model with the ability to reference similar instances from the entire training corpus as clues for prediction.
We show that $k$NN-BioEL outperforms state-of-the-art baselines on several datasets.
arXiv Detail & Related papers (2023-12-15T14:04:23Z) - Bio+Clinical BERT, BERT Base, and CNN Performance Comparison for
Predicting Drug-Review Satisfaction [0.0]
We implement and evaluate several classification models, including a BERT base model, Bio+Clinical BERT, and a simpler CNN.
Results indicate that the medical domain-specific Bio+Clinical BERT model significantly outperformed the general domain base BERT model.
Future research could explore how to capitalize on the specific strengths of each model.
arXiv Detail & Related papers (2023-08-02T20:01:38Z) - BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks [68.39821375903591]
Generalist AI holds the potential to address the limitations of task-specific models, owing to its versatility in interpreting different data types.
Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
arXiv Detail & Related papers (2023-05-26T17:14:43Z) - Fine-Tuning Large Neural Language Models for Biomedical Natural Language
Processing [55.52858954615655]
We conduct a systematic study on fine-tuning stability in biomedical NLP.
We show that fine-tuning performance may be sensitive to pretraining settings, especially in low-resource domains.
We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications.
arXiv Detail & Related papers (2021-12-15T04:20:35Z) - Deep learning models are not robust against noise in clinical text [6.158031973715943]
We introduce and implement a variety of perturbation methods that simulate different types of noise and variability in clinical text data.
We evaluate the robustness of high-performance NLP models against various types of character-level and word-level noise.
arXiv Detail & Related papers (2021-08-27T12:47:19Z) - UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced
Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic steatohepatitis (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, outperforming the best state-of-the-art baseline by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.