Learning ULMFiT and Self-Distillation with Calibration for Medical
Dialogue System
- URL: http://arxiv.org/abs/2107.09625v1
- Date: Tue, 20 Jul 2021 17:11:24 GMT
- Title: Learning ULMFiT and Self-Distillation with Calibration for Medical
Dialogue System
- Authors: Shuang Ao, Xeno Acharya
- Abstract summary: In recent years, the introduction of state-of-the-art deep learning models and transfer learning techniques has contributed to the performance of NLP tasks.
Some deep neural networks are poorly calibrated and wrongly estimate the uncertainty.
In this paper, we investigate the well-calibrated model for ULMFiT and self-distillation in a medical dialogue system.
- Score: 2.055949720959582
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A medical dialogue system is essential for healthcare services because it provides
primary clinical advice and diagnoses. It has been gradually adopted and
practiced in medical organizations in the form of a conversational bot, largely
due to the advancement of NLP. In recent years, the introduction of
state-of-the-art deep learning models and transfer learning techniques like
Universal Language Model Fine Tuning (ULMFiT) and Knowledge Distillation (KD)
has contributed substantially to the performance of NLP tasks. However, some deep neural
networks are poorly calibrated and wrongly estimate the uncertainty. Hence the
model is not trustworthy, especially in sensitive medical decision-making
systems and safety tasks. In this paper, we investigate the well-calibrated
model for ULMFiT and self-distillation (SD) in a medical dialogue system. The
calibrated ULMFiT (CULMFiT) is obtained by incorporating label smoothing (LS),
a commonly used regularization technique to achieve a well-calibrated model.
Moreover, we apply temperature scaling (TS), a technique for recalibrating
confidence scores, together with KD to observe its correlation with network
calibration. To further understand the relation between SD and calibration, we
use both fixed and optimal temperatures to fine-tune the whole model. All
experiments are conducted on the consultation back pain dataset collected by
experts, then further validated on a large public medical dialogue corpus.
We empirically show that our proposed methodologies outperform conventional
methods in terms of accuracy and robustness.
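The abstract names label smoothing and temperature scaling but gives no formulas here. The following is a minimal sketch of the standard textbook forms of both techniques, not the authors' implementation. All function names, the smoothing factor `epsilon=0.1`, the toy logits, and the candidate temperature grid are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature T > 0 (T > 1 softens)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Label smoothing: (1 - epsilon) * one-hot + epsilon * uniform."""
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform if k == true_class else uniform
        for k in range(num_classes)
    ]


def mean_nll(logit_rows, labels, temperature):
    """Average negative log-likelihood of the true labels at temperature T."""
    losses = [
        -math.log(softmax(row, temperature)[label])
        for row, label in zip(logit_rows, labels)
    ]
    return sum(losses) / len(losses)


def fit_temperature(logit_rows, labels, candidates):
    """Pick the candidate T with the lowest held-out NLL (simple grid search)."""
    return min(candidates, key=lambda t: mean_nll(logit_rows, labels, t))


# Toy held-out set (assumed): very confident logits, correct 3 times out of 4.
val_logits = [[5.0, 0.0], [5.0, 0.0], [5.0, 0.0], [5.0, 0.0]]
val_labels = [0, 0, 1, 0]
best_t = fit_temperature(val_logits, val_labels, [0.5, 1.0, 2.0, 4.0, 8.0])
print("selected temperature:", best_t)
print("confidence before scaling:", max(softmax(val_logits[0])))
print("confidence after scaling:", max(softmax(val_logits[0], best_t)))
print("smoothed target:", smooth_labels(0, 3))
```

Dividing logits by a temperature above 1 lowers the top-class confidence without changing the predicted class, which is why temperature scaling affects calibration but not accuracy. The same softened softmax is what a distillation setup would typically use to produce teacher targets.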
Related papers
- Empowering Healthcare through Privacy-Preserving MRI Analysis [3.6394715554048234]
We introduce the Ensemble-Based Federated Learning (EBFL) Framework.
The EBFL framework deviates from the conventional approach by emphasizing model features over sharing sensitive patient data.
We have achieved remarkable precision in the classification of brain tumors, including glioma, meningioma, pituitary, and non-tumor instances.
arXiv Detail & Related papers (2024-03-14T19:51:18Z)
- Specialty detection in the context of telemedicine in a highly
imbalanced multi-class distribution [3.992328888937568]
The study focuses on handling multiclass and highly imbalanced datasets for Arabic medical questions.
The proposed module is deployed in both synchronous and asynchronous medical consultations.
arXiv Detail & Related papers (2024-02-21T06:39:04Z)
- Integrating Physician Diagnostic Logic into Large Language Models:
Preference Learning from Process Feedback [20.73076574974894]
We propose an approach called preference learning from process feedback.
PLPF integrates the doctor's diagnostic logic into LLMs.
We show that PLPF enhances the diagnostic accuracy of the baseline model in medical conversations by 17.6%.
arXiv Detail & Related papers (2024-01-11T06:42:45Z)
- Improving Multiple Sclerosis Lesion Segmentation Across Clinical Sites:
A Federated Learning Approach with Noise-Resilient Training [75.40980802817349]
Deep learning models have shown promise for automatically segmenting MS lesions, but the scarcity of accurately annotated data hinders progress in this area.
We introduce a Decoupled Hard Label Correction (DHLC) strategy that considers the imbalanced distribution and fuzzy boundaries of MS lesions.
We also introduce a Centrally Enhanced Label Correction (CELC) strategy, which leverages the aggregated central model as a correction teacher for all sites.
arXiv Detail & Related papers (2023-08-31T00:36:10Z)
- Multi-Site Clinical Federated Learning using Recursive and Attentive
Models and NVFlare [13.176351544342735]
This paper develops an integrated framework that addresses data privacy and regulatory compliance challenges while maintaining high accuracy and substantiating the efficacy of the proposed approach.
arXiv Detail & Related papers (2023-06-28T17:00:32Z)
- Exploiting prompt learning with pre-trained language models for
Alzheimer's Disease detection [70.86672569101536]
Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating preventive care and delaying further progression.
This paper investigates the use of prompt-based fine-tuning of PLMs that consistently uses AD classification errors as the training objective function.
arXiv Detail & Related papers (2022-10-29T09:18:41Z)
- DLTTA: Dynamic Learning Rate for Test-time Adaptation on Cross-domain
Medical Images [56.72015587067494]
We propose a novel dynamic learning rate adjustment method for test-time adaptation, called DLTTA.
Our method achieves effective and fast test-time adaptation with consistent performance improvement over current state-of-the-art test-time adaptation methods.
arXiv Detail & Related papers (2022-05-27T02:34:32Z)
- Solving Inverse Problems in Medical Imaging with Score-Based Generative
Models [87.48867245544106]
Reconstructing medical images from partial measurements is an important inverse problem in Computed Tomography (CT) and Magnetic Resonance Imaging (MRI).
Existing solutions based on machine learning typically train a model to directly map measurements to medical images.
We propose a fully unsupervised technique for inverse problem solving, leveraging the recently introduced score-based generative models.
arXiv Detail & Related papers (2021-11-15T05:41:12Z)
- ECG-DelNet: Delineation of Ambulatory Electrocardiograms with Mixed
Quality Labeling Using Neural Networks [69.25956542388653]
Deep learning (DL) algorithms are gaining weight in academic and industrial settings.
We demonstrate DL can be successfully applied to low interpretative tasks by embedding ECG detection and delineation onto a segmentation framework.
The model was trained using PhysioNet's QT database, comprised of 105 ambulatory ECG recordings.
arXiv Detail & Related papers (2020-05-11T16:29:12Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest
X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
- Natural Language Processing with Deep Learning for Medical Adverse Event
Detection from Free-Text Medical Narratives: A Case Study of Detecting Total
Hip Replacement Dislocation [0.0]
We propose deep learning based NLP (DL-NLP) models for efficient and accurate hip dislocation AE detection following total hip replacement.
We benchmarked these proposed models with a wide variety of traditional machine learning based NLP (ML-NLP) models.
All DL-NLP models out-performed all of the ML-NLP models, with a convolutional neural network (CNN) model achieving the best overall performance.
arXiv Detail & Related papers (2020-04-17T16:25:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.