Enhancing Clinical Text Classification via Fine-Tuned DRAGON Longformer Models
- URL: http://arxiv.org/abs/2507.09470v1
- Date: Sun, 13 Jul 2025 03:10:19 GMT
- Title: Enhancing Clinical Text Classification via Fine-Tuned DRAGON Longformer Models
- Authors: Mingchuan Yang, Ziyuan Huang
- Abstract summary: This study explores the optimization of the DRAGON Longformer base model for clinical text classification. A dataset of 500 clinical cases containing structured medical observations was used. The optimized model achieved notable performance gains.
- Score: 7.514574388197471
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study explores the optimization of the DRAGON Longformer base model for clinical text classification, specifically targeting the binary classification of medical case descriptions. A dataset of 500 clinical cases containing structured medical observations was used, with 400 cases for training and 100 for validation. Enhancements to the pre-trained joeranbosma/dragon-longformer-base-mixed-domain model included hyperparameter tuning, domain-specific preprocessing, and architectural adjustments. Key modifications involved increasing sequence length from 512 to 1024 tokens, adjusting learning rates from 1e-05 to 5e-06, extending training epochs from 5 to 8, and incorporating specialized medical terminology. The optimized model achieved notable performance gains: accuracy improved from 72.0% to 85.2%, precision from 68.0% to 84.1%, recall from 75.0% to 86.3%, and F1-score from 71.0% to 85.2%. Statistical analysis confirmed the significance of these improvements (p < .001). The model demonstrated enhanced capability in interpreting medical terminology, anatomical measurements, and clinical observations. These findings contribute to domain-specific language model research and offer practical implications for clinical natural language processing applications. The optimized model's strong performance across diverse medical conditions underscores its potential for broad use in healthcare settings.
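The hyperparameters reported in the abstract map onto a standard Hugging Face fine-tuning setup. The following is a minimal sketch, not the authors' training script: the placeholder data, batch size, column names, and output directory are assumptions, and the paper does not state which training API was used.

```python
# Minimal sketch: fine-tuning the DRAGON Longformer checkpoint for binary
# clinical text classification with the settings reported in the abstract
# (1024-token inputs, 5e-6 learning rate, 8 epochs). Data and batch size
# are placeholders, not details from the paper.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "joeranbosma/dragon-longformer-base-mixed-domain"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Placeholder data standing in for the 400/100 train/validation split of
# clinical case descriptions described in the abstract.
train_ds = Dataset.from_dict({"text": ["example clinical case text"] * 400,
                              "label": [0, 1] * 200})
val_ds = Dataset.from_dict({"text": ["example clinical case text"] * 100,
                            "label": [0, 1] * 50})

def tokenize(batch):
    # Sequence length raised from 512 to 1024 tokens, per the abstract.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=1024)

train_ds = train_ds.map(tokenize, batched=True)
val_ds = val_ds.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="dragon-longformer-clinical",
    learning_rate=5e-6,             # lowered from 1e-5
    num_train_epochs=8,             # extended from 5
    per_device_train_batch_size=2,  # assumption; not reported in the abstract
)

trainer = Trainer(model=model, args=args,
                  train_dataset=train_ds, eval_dataset=val_ds)
trainer.train()
```

Longformer's sliding-window attention is what makes the increase from 512 to 1024 tokens affordable, since attention cost grows roughly linearly with sequence length rather than quadratically.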
Related papers
- Agentic large language models improve retrieval-based radiology question answering [4.340742745938289]
Agentic retrieval significantly improved mean diagnostic accuracy over zero-shot prompting. The greatest gains occurred in midsized models. Even clinically fine-tuned models exhibited meaningful improvements.
arXiv Detail & Related papers (2025-08-01T16:18:52Z)
- Gazal-R1: Achieving State-of-the-Art Medical Reasoning with Parameter-Efficient Two-Stage Training [0.0]
We present Gazal-R1, a 32-billion-parameter language model that achieves state-of-the-art performance in medical reasoning. Our model demonstrates that strategic training can enable mid-sized models to outperform significantly larger counterparts in specialized domains. Gazal-R1 achieves exceptional performance across medical benchmarks, scoring 87.1% on MedQA, 81.6% on MMLU Pro (Medical), and 79.6% on PubMedQA.
arXiv Detail & Related papers (2025-06-18T09:44:21Z)
- A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment [46.776978552161395]
Small language models (SLMs) offer a cost-effective alternative to large language models such as GPT-4, but their limited capacity requires biomedical domain adaptation. We propose a novel framework for adapting SLMs into high-performing clinical models.
arXiv Detail & Related papers (2025-05-15T21:40:21Z)
- SemioLLM: Evaluating Large Language Models for Diagnostic Reasoning from Unstructured Clinical Narratives in Epilepsy [45.2233252981348]
Large Language Models (LLMs) have been shown to encode clinical knowledge. We present SemioLLM, an evaluation framework that benchmarks 6 state-of-the-art models. We show that most LLMs are able to accurately and confidently generate probabilistic predictions of seizure onset zones in the brain.
arXiv Detail & Related papers (2024-07-03T11:02:12Z)
- A Comprehensive Evaluation of Histopathology Foundation Models for Ovarian Cancer Subtype Classification [1.9499122087408571]
Histopathology foundation models show great promise across many tasks.
We report the most rigorous single-task validation of histopathology foundation models to date.
Histopathology foundation models offer a clear benefit to ovarian cancer subtyping.
arXiv Detail & Related papers (2024-05-16T11:21:02Z)
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
The inference of LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- Large Language Models to Identify Social Determinants of Health in Electronic Health Records [2.168737004368243]
Social determinants of health (SDoH) have an important impact on patient outcomes but are incompletely collected in electronic health records (EHRs).
This study researched the ability of large language models to extract SDoH from free text in EHRs, where they are most commonly documented.
800 patient notes were annotated for SDoH categories, and several transformer-based models were evaluated.
arXiv Detail & Related papers (2023-08-11T19:18:35Z)
- MEDBERT.de: A Comprehensive German BERT Model for the Medical Domain [45.96917694724562]
medBERT.de is a pre-trained German BERT model specifically designed for the German medical domain. The model has been trained on a large corpus of 4.7 million German medical documents.
arXiv Detail & Related papers (2023-03-14T18:58:08Z)
- Domain-adapted large language models for classifying nuclear medicine reports [11.364745410780678]
We retrospectively retrieved 4542 text reports and 1664 images for FDG PET/CT lymphoma exams from 2008-2018.
Multiple general-purpose transformer language models were used to classify the reports into Deauville scores 1-5.
We adapted the models to the nuclear medicine domain using masked language modeling and assessed its impact on classification performance.
arXiv Detail & Related papers (2023-03-01T09:48:39Z)
- Textual Data Augmentation for Patient Outcomes Prediction [67.72545656557858]
We propose a novel data augmentation method to generate artificial clinical notes in patients' Electronic Health Records.
We fine-tune the generative language model GPT-2 to synthesize labeled text with the original training data.
We evaluate our method on the most common patient outcome, i.e., the 30-day readmission rate.
arXiv Detail & Related papers (2022-11-13T01:07:23Z)
- On the explainability of hospitalization prediction on a large COVID-19 patient dataset [45.82374977939355]
We develop various AI models to predict hospitalization on a large (over 110k) cohort of COVID-19 positive-tested US patients. Despite high class imbalance, the models reach average precision 0.96-0.98 (0.75-0.85), recall 0.96-0.98 (0.74-0.85), and F-score 0.97-0.98 (0.79-0.83) on the non-hospitalized (or hospitalized) class.
arXiv Detail & Related papers (2021-10-28T10:23:38Z)