Related papers: Learning ULMFiT and Self-Distillation with Calibration for Medical Dialogue System

URL: http://arxiv.org/abs/2107.09625v1
Date: Tue, 20 Jul 2021 17:11:24 GMT
Title: Learning ULMFiT and Self-Distillation with Calibration for Medical Dialogue System
Authors: Shuang Ao, Xeno Acharya
Abstract summary: In recent years, the introduction of state-of-the-art deep learning models and transfer learning techniques has contributed to the performance of NLP tasks. However, some deep neural networks are poorly calibrated and wrongly estimate the uncertainty. In this paper, we investigate the well-calibrated model for ULMFiT and self-distillation in a medical dialogue system.
Score: 2.055949720959582
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A medical dialogue system is essential for healthcare services because it provides primary clinical advice and diagnoses. It has been gradually adopted in medical organizations in the form of a conversational bot, largely due to advances in NLP. In recent years, the introduction of state-of-the-art deep learning models and transfer learning techniques such as Universal Language Model Fine-Tuning (ULMFiT) and Knowledge Distillation (KD) has largely contributed to the performance of NLP tasks. However, some deep neural networks are poorly calibrated and wrongly estimate the uncertainty. Such models are therefore not trustworthy, especially in sensitive medical decision-making systems and safety tasks. In this paper, we investigate the well-calibrated model for ULMFiT and self-distillation (SD) in a medical dialogue system. The calibrated ULMFiT (CULMFiT) is obtained by incorporating label smoothing (LS), a commonly used regularization technique, to achieve a well-calibrated model. Moreover, we apply temperature scaling (TS), a technique for recalibrating confidence scores, together with KD to observe its correlation with network calibration. To further understand the relation between SD and calibration, we use both fixed and optimal temperatures to fine-tune the whole model. All experiments are conducted on a consultation back pain dataset collected by experts and then further validated on a large public medical dialogue corpus. We empirically show that our proposed methodologies outperform conventional methods in terms of accuracy and robustness.
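As a rough illustration of two calibration techniques named in the abstract, the Python sketch below shows label smoothing (mixing a one-hot target with a uniform distribution) and temperature scaling (dividing the logits by a scalar T before the softmax). It is not the authors' code: the function names, the example logits, epsilon = 0.1, and T = 2.0 are all assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature T > 0."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Label smoothing: (1 - epsilon) * one_hot + epsilon / num_classes."""
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform if k == true_class else uniform
        for k in range(num_classes)
    ]


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


# Hypothetical logits for a 3-class problem (illustrative values only).
logits = [4.0, 1.0, 0.5]

p_raw = softmax(logits)                     # T = 1: unscaled confidence
p_scaled = softmax(logits, temperature=2.0)  # T > 1: softer confidence

target = smooth_labels(true_class=0, num_classes=3, epsilon=0.1)
loss = cross_entropy(target, p_raw)

print("raw probabilities:   ", [round(p, 3) for p in p_raw])
print("scaled probabilities:", [round(p, 3) for p in p_scaled])
print("smoothed target:     ", [round(t, 3) for t in target])
print("smoothed-label loss: ", round(loss, 3))
```

With T > 1 the top-class probability decreases while the predicted class stays the same, which is why temperature scaling can adjust confidence without changing accuracy. In practice T is usually fitted on held-out data rather than fixed by hand.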

Related papers

Structured Outputs Enable General-Purpose LLMs to be Medical Experts [50.02627258858336]
Large language models (LLMs) often struggle with open-ended medical questions. We propose a novel approach utilizing structured medical reasoning. Our approach achieves the highest Factuality Score of 85.8, surpassing fine-tuned models.
arXiv Detail & Related papers (2025-03-05T05:24:55Z)
Systematic Literature Review on Clinical Trial Eligibility Matching [0.24554686192257422]
Review highlights how explainable AI and standardized ontology can bolster clinician trust and broaden adoption. Further research into advanced semantic and temporal representations, expanded data integration, and rigorous prospective evaluations is necessary to fully realize the transformative potential of NLP in clinical trial recruitment.
arXiv Detail & Related papers (2025-03-02T11:45:50Z)
Dialogue is Better Than Monologue: Instructing Medical LLMs via Strategical Conversations [74.83732294523402]
We introduce a novel benchmark that simulates real-world diagnostic scenarios, integrating noise and difficulty levels aligned with USMLE standards. We also explore dialogue-based fine-tuning, which transforms static datasets into conversational formats to better capture iterative reasoning processes. Experiments show that dialogue-tuned models outperform traditional methods, with improvements of 9.64% in multi-round reasoning scenarios and 6.18% in accuracy in a noisy environment.
arXiv Detail & Related papers (2025-01-29T18:58:48Z)
CareBot: A Pioneering Full-Process Open-Source Medical Language Model [8.868481107848185]
CareBot is a bilingual medical LLM that integrates continuous pre-training (CPT), supervised fine-tuning (SFT), and reinforcement learning with human feedback (RLHF). DataRater is a model designed to assess data quality during CPT, ensuring that the training data is both accurate and relevant. Our rigorous evaluations on Chinese and English benchmarks confirm CareBot's effectiveness in medical consultation and education.
arXiv Detail & Related papers (2024-12-12T05:27:43Z)
Which Client is Reliable?: A Reliable and Personalized Prompt-based Federated Learning for Medical Image Question Answering [51.26412822853409]
We present a novel personalized federated learning (pFL) method for medical visual question answering (VQA) models. Our method introduces learnable prompts into a Transformer architecture to efficiently train it on diverse medical datasets without massive computational costs.
arXiv Detail & Related papers (2024-10-23T00:31:17Z)
Empowering Healthcare through Privacy-Preserving MRI Analysis [3.6394715554048234]
We introduce the Ensemble-Based Federated Learning (EBFL) Framework. The EBFL framework deviates from the conventional approach by emphasizing model features over sharing sensitive patient data. We have achieved remarkable precision in the classification of brain tumors, including glioma, meningioma, pituitary, and non-tumor instances.
arXiv Detail & Related papers (2024-03-14T19:51:18Z)
Specialty detection in the context of telemedicine in a highly imbalanced multi-class distribution [3.992328888937568]
The study focuses on handling multiclass and highly imbalanced datasets for Arabic medical questions. The proposed module is deployed in both synchronous and asynchronous medical consultations.
arXiv Detail & Related papers (2024-02-21T06:39:04Z)
Integrating Physician Diagnostic Logic into Large Language Models: Preference Learning from Process Feedback [19.564416963801268]
We propose an approach called preference learning from process feedback. PLPF integrates the doctor's diagnostic logic into LLMs. We show that PLPF enhances the diagnostic accuracy of the baseline model in medical conversations by 17.6%.
arXiv Detail & Related papers (2024-01-11T06:42:45Z)
Improving Multiple Sclerosis Lesion Segmentation Across Clinical Sites: A Federated Learning Approach with Noise-Resilient Training [75.40980802817349]
Deep learning models have shown promise for automatically segmenting MS lesions, but the scarcity of accurately annotated data hinders progress in this area. We introduce a Decoupled Hard Label Correction (DHLC) strategy that considers the imbalanced distribution and fuzzy boundaries of MS lesions. We also introduce a Centrally Enhanced Label Correction (CELC) strategy, which leverages the aggregated central model as a correction teacher for all sites.
arXiv Detail & Related papers (2023-08-31T00:36:10Z)
Multi-Site Clinical Federated Learning using Recursive and Attentive Models and NVFlare [13.176351544342735]
This paper develops an integrated framework that addresses data privacy and regulatory compliance challenges while maintaining high accuracy, and substantiates the efficacy of the proposed approach.
arXiv Detail & Related papers (2023-06-28T17:00:32Z)
Exploiting prompt learning with pre-trained language models for Alzheimer's Disease detection [70.86672569101536]
Early diagnosis of Alzheimer's disease (AD) is crucial for facilitating preventive care and delaying further progression. This paper investigates the use of prompt-based fine-tuning of PLMs that consistently uses AD classification errors as the training objective function.
arXiv Detail & Related papers (2022-10-29T09:18:41Z)
DLTTA: Dynamic Learning Rate for Test-time Adaptation on Cross-domain Medical Images [56.72015587067494]
We propose a novel dynamic learning rate adjustment method for test-time adaptation, called DLTTA. Our method achieves effective and fast test-time adaptation with consistent performance improvement over current state-of-the-art test-time adaptation methods.
arXiv Detail & Related papers (2022-05-27T02:34:32Z)
Solving Inverse Problems in Medical Imaging with Score-Based Generative Models [87.48867245544106]
Reconstructing medical images from partial measurements is an important inverse problem in Computed Tomography (CT) and Magnetic Resonance Imaging (MRI). Existing solutions based on machine learning typically train a model to directly map measurements to medical images. We propose a fully unsupervised technique for inverse problem solving, leveraging the recently introduced score-based generative models.
arXiv Detail & Related papers (2021-11-15T05:41:12Z)
ECG-DelNet: Delineation of Ambulatory Electrocardiograms with Mixed Quality Labeling Using Neural Networks [69.25956542388653]
Deep learning (DL) algorithms are gaining weight in academic and industrial settings. We demonstrate DL can be successfully applied to low interpretative tasks by embedding ECG detection and delineation onto a segmentation framework. The model was trained using PhysioNet's QT database, comprised of 105 ambulatory ECG recordings.
arXiv Detail & Related papers (2020-05-11T16:29:12Z)
Natural Language Processing with Deep Learning for Medical Adverse Event Detection from Free-Text Medical Narratives: A Case Study of Detecting Total Hip Replacement Dislocation [0.0]
We propose deep learning based NLP (DL-NLP) models for efficient and accurate hip dislocation AE detection following total hip replacement. We benchmarked these proposed models with a wide variety of traditional machine learning based NLP (ML-NLP) models. All DL-NLP models out-performed all of the ML-NLP models, with a convolutional neural network (CNN) model achieving the best overall performance.
arXiv Detail & Related papers (2020-04-17T16:25:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.