Deep learning models are not robust against noise in clinical text
- URL: http://arxiv.org/abs/2108.12242v1
- Date: Fri, 27 Aug 2021 12:47:19 GMT
- Title: Deep learning models are not robust against noise in clinical text
- Authors: Milad Moradi, Kathrin Blagec, Matthias Samwald
- Abstract summary: We introduce and implement a variety of perturbation methods that simulate different types of noise and variability in clinical text data.
We evaluate the robustness of high-performance NLP models against various types of character-level and word-level noise.
- Score: 6.158031973715943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Artificial Intelligence (AI) systems are attracting increasing interest in
the medical domain due to their ability to learn complicated tasks that require
human intelligence and expert knowledge. AI systems that utilize
high-performance Natural Language Processing (NLP) models have achieved
state-of-the-art results on a wide variety of clinical text processing
benchmarks. They have even outperformed human accuracy on some tasks. However,
performance evaluation of such AI systems has been limited to accuracy
measures on curated and clean benchmark datasets that may not properly reflect
how robustly these systems can operate in real-world situations. In order to
address this challenge, we introduce and implement a wide variety of
perturbation methods that simulate different types of noise and variability in
clinical text data. While noisy samples produced by these perturbation methods
can often be understood by humans, they may cause AI systems to make erroneous
decisions. Conducting extensive experiments on several clinical text processing
tasks, we evaluated the robustness of high-performance NLP models against
various types of character-level and word-level noise. The results revealed
that the NLP models' performance degrades when the input contains small amounts
of noise. This study is a significant step towards exposing vulnerabilities of
AI models utilized in clinical text processing systems. The proposed
perturbation methods can be used in performance evaluation tests to assess how
robustly clinical NLP models can operate on noisy data, in real-world settings.
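The abstract does not list the individual perturbation methods, so the following is only a minimal sketch of what character-level and word-level noise can look like. The function names (`swap_adjacent_chars`, `delete_char`, `perturb_characters`, `drop_words`), the choice of edits, and the `rate` parameter are illustrative assumptions, not the authors' implementation.

```python
import random


def swap_adjacent_chars(word, rng):
    """Character-level noise: swap two adjacent characters in the word."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def delete_char(word, rng):
    """Character-level noise: delete one randomly chosen character."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word))
    return word[:i] + word[i + 1:]


def perturb_characters(text, rate, rng):
    """Apply one random character-level edit to each word with probability `rate`."""
    ops = [swap_adjacent_chars, delete_char]
    out = []
    for word in text.split():
        if rng.random() < rate:
            word = rng.choice(ops)(word, rng)
        out.append(word)
    return " ".join(out)


def drop_words(text, rate, rng):
    """Word-level noise: delete each word with probability `rate`."""
    kept = [w for w in text.split() if rng.random() >= rate]
    # Fall back to the original text rather than returning an empty string.
    return " ".join(kept) if kept else text


# Hypothetical usage on a made-up clinical-style sentence.
rng = random.Random(0)
sample = "patient denies chest pain and shortness of breath"
print(perturb_characters(sample, 0.3, rng))
print(drop_words(sample, 0.2, rng))
```

A robustness test in this spirit would compare a model's score on the clean inputs with its score on the perturbed inputs, keeping the labels unchanged.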
Related papers
- Fine-tuning -- a Transfer Learning approach [0.22344294014777952]
Research use of Electronic Health Records (EHRs) is often hampered by the abundance of missing data in this valuable resource.
Existing deep imputation methods rely on end-to-end pipelines that incorporate both imputation and downstream analyses.
This paper explores the development of a modular, deep learning-based imputation and classification pipeline.
arXiv Detail & Related papers (2024-11-06T14:18:23Z)
- How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics [49.9329723199239]
We propose a method for the automated creation of a challenging test set without relying on the manual construction of artificial and unrealistic examples.
We categorize the test set of popular NLI datasets into three difficulty levels by leveraging methods that exploit training dynamics.
When our characterization method is applied to the training set, models trained with only a fraction of the data achieve comparable performance to those trained on the full dataset.
arXiv Detail & Related papers (2024-10-04T13:39:21Z)
- Robustness and Generalization Performance of Deep Learning Models on Cyber-Physical Systems: A Comparative Study [71.84852429039881]
The investigation focuses on the models' ability to handle a range of perturbations, such as sensor faults and noise.
We test the generalization and transfer learning capabilities of these models by exposing them to out-of-distribution (OOD) samples.
arXiv Detail & Related papers (2023-06-13T12:43:59Z)
- Importance of methodological choices in data manipulation for validating epileptic seizure detection models [4.538319875483978]
Epilepsy is a chronic neurological disorder that affects a significant portion of the human population and imposes serious risks in the daily life of patients.
Despite advances in machine learning and IoT, small, nonstigmatizing wearable devices for continuous monitoring and detection in outpatient environments are not yet available.
Part of the reason is the complexity of epilepsy itself, including highly imbalanced data, multimodal nature, and very subject-specific signatures.
This article identifies a wide range of methodological decisions that must be made and reported when training and evaluating the performance of epilepsy detection systems.
arXiv Detail & Related papers (2023-02-21T13:44:13Z)
- On Robust Numerical Solver for ODE via Self-Attention Mechanism [82.95493796476767]
We explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances.
We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning and then propose a simple-yet-effective numerical solver, Attr, which introduces an additive self-attention mechanism to the numerical solution of differential equations.
arXiv Detail & Related papers (2023-02-05T01:39:21Z)
- Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z)
- Improving the robustness and accuracy of biomedical language models through adversarial training [7.064032374579076]
Deep transformer neural network models have improved the predictive accuracy of intelligent text processing systems in the biomedical domain.
Neural NLP models can be easily fooled by adversarial samples, i.e. minor changes to input that preserve the meaning and understandability of the text but force the NLP system to make erroneous decisions.
This raises serious concerns about the security and trustworthiness of biomedical NLP systems.
arXiv Detail & Related papers (2021-11-16T14:58:05Z)
- Understanding Model Robustness to User-generated Noisy Texts [2.958690090551675]
In NLP, model performance often deteriorates with naturally occurring noise, such as spelling errors.
We propose to model the errors statistically from grammatical-error-correction corpora.
arXiv Detail & Related papers (2021-10-14T14:54:52Z)
- Evaluating the Robustness of Neural Language Models to Input Perturbations [7.064032374579076]
In this study, we design and implement various types of character-level and word-level perturbation methods to simulate noisy input texts.
We investigate the ability of high-performance language models such as BERT, XLNet, RoBERTa, and ELMo in handling different types of input perturbations.
The results suggest that language models are sensitive to input perturbations and their performance can decrease even when small changes are introduced.
arXiv Detail & Related papers (2021-08-27T12:31:17Z)
- Bridging the Gap Between Clean Data Training and Real-World Inference for Spoken Language Understanding [76.89426311082927]
Existing models are trained on clean data, which causes a gap between clean data training and real-world inference.
We propose a method from the perspective of domain adaptation, by which both high- and low-quality samples are embedded into a similar vector space.
Experiments on the widely used Snips dataset and a large-scale in-house dataset (10 million training examples) demonstrate that this method not only outperforms the baseline models on a real-world (noisy) corpus but also enhances robustness, that is, it produces high-quality results in a noisy environment.
arXiv Detail & Related papers (2021-04-13T17:54:33Z)
- High Dimensional Level Set Estimation with Bayesian Neural Network [58.684954492439424]
This paper proposes novel methods to solve the high dimensional Level Set Estimation problems using Bayesian Neural Networks.
For each problem, we derive the corresponding information-theoretic acquisition function to sample the data points.
Numerical experiments on both synthetic and real-world datasets show that our proposed method can achieve better results compared to existing state-of-the-art approaches.
arXiv Detail & Related papers (2020-12-17T23:21:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.