Fine-Tuning Medical Language Models for Enhanced Long-Contextual Understanding and Domain Expertise
- URL: http://arxiv.org/abs/2407.11536v1
- Date: Tue, 16 Jul 2024 09:37:20 GMT
- Title: Fine-Tuning Medical Language Models for Enhanced Long-Contextual Understanding and Domain Expertise
- Authors: Qimin Yang, Rongsheng Wang, Jiexin Chen, Runqi Su, Tao Tan,
- Abstract summary: Large Language Models (LLMs) have been widely applied in various professional fields.
We observed that despite improvements in specific domain knowledge, the performance of medical LLMs in long-context understanding has significantly declined.
- Score: 2.1869349221557814
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have been widely applied in various professional fields. Fine-tuning these models on domain-specific question-and-answer datasets has significantly improved their professional domain knowledge and Q&A abilities; for example, medical LLMs fine-tuned on doctor-patient Q&A data exhibit extraordinary disease diagnostic abilities. However, we observed that despite improvements in specific domain knowledge, the performance of medical LLMs in long-context understanding has significantly declined, especially compared to general language models with a similar number of parameters. The purpose of this study is to investigate this decline in long-context understanding in medical LLMs. We designed a series of experiments that conduct open-book professional knowledge exams on all models to evaluate their ability to read long contexts. By adjusting the proportion and quantity of general data and medical data during fine-tuning, we can determine the best data composition to optimize the professional model and achieve a balance between long-context performance and specific domain knowledge.
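The abstract describes adjusting the proportion and quantity of general and medical data during fine-tuning. A minimal sketch of that idea follows, assuming each dataset is an in-memory list of examples; the function name `mix_datasets` and the parameters `medical_fraction` and `total` are hypothetical and are not taken from the paper, whose actual sampling procedure is not stated here.

```python
import random


def mix_datasets(general, medical, medical_fraction, total, seed=0):
    """Build a fine-tuning set of `total` examples.

    `medical_fraction` is the share drawn from `medical`; the rest comes
    from `general`. All names are illustrative assumptions.
    """
    if not 0.0 <= medical_fraction <= 1.0:
        raise ValueError("medical_fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed keeps the mix reproducible
    n_medical = round(total * medical_fraction)
    n_general = total - n_medical
    if n_medical > len(medical) or n_general > len(general):
        raise ValueError("not enough examples to build the requested mix")
    # Sample without replacement from each source, then shuffle together.
    mixed = rng.sample(medical, n_medical) + rng.sample(general, n_general)
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    # Toy stand-ins for real examples, tagged with their source.
    general = [("general", i) for i in range(100)]
    medical = [("medical", i) for i in range(100)]
    mix = mix_datasets(general, medical, medical_fraction=0.25, total=40)
    n_med = sum(1 for source, _ in mix if source == "medical")
    print(len(mix), n_med)
```

Varying `medical_fraction` changes the proportion and varying `total` changes the quantity, so a grid over both gives one candidate data composition per setting to fine-tune and evaluate.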
Related papers
- LLMs-in-the-loop Part-1: Expert Small AI Models for Bio-Medical Text Translation [0.0]
This study introduces a novel "LLMs-in-the-loop" approach to develop supervised neural machine translation models optimized for medical texts.
Custom parallel corpora in six languages were compiled from scientific articles, synthetically generated clinical documents, and medical texts.
Our MarianMT-based models outperform Google Translate, DeepL, and GPT-4-Turbo.
arXiv Detail & Related papers (2024-07-16T19:32:23Z)
- SemioLLM: Assessing Large Language Models for Semiological Analysis in Epilepsy Research [45.2233252981348]
Large Language Models have shown promising results in their ability to encode general medical knowledge.
We test the ability of state-of-the-art LLMs to leverage their internal knowledge and reasoning for epilepsy diagnosis.
arXiv Detail & Related papers (2024-07-03T11:02:12Z)
- M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering [14.198330378235632]
We use Multiple Choice and Abstractive Question Answering to conduct a large-scale empirical study on 22 datasets in three generalist and three specialist biomedical sub-domains.
Our multifaceted analysis of the performance of 15 LLMs uncovers success factors such as instruction tuning that lead to improved recall and comprehension.
We show that while recently proposed domain-adapted models may lack adequate knowledge, directly fine-tuning on our collected medical knowledge datasets shows encouraging results.
We complement the quantitative results with a skill-oriented manual error analysis, which reveals a significant gap between the models' capabilities to simply recall necessary knowledge and to integrate it with the presented
arXiv Detail & Related papers (2024-06-06T02:43:21Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- Unveiling the Generalization Power of Fine-Tuned Large Language Models [81.70754292058258]
We investigate whether fine-tuning affects the generalization ability intrinsic to Large Language Models (LLMs).
Our main findings reveal that models fine-tuned on generation and classification tasks exhibit dissimilar behaviors in generalizing to different domains and tasks.
We observe that integrating the in-context learning strategy during fine-tuning on generation tasks can enhance the model's generalization ability.
arXiv Detail & Related papers (2024-03-14T08:18:59Z)
- REALM: RAG-Driven Enhancement of Multimodal Electronic Health Records Analysis via Large Language Models [19.62552013839689]
Existing models often lack the medical context relevant to clinical tasks, prompting the incorporation of external knowledge.
We propose REALM, a Retrieval-Augmented Generation (RAG) driven framework to enhance multimodal EHR representations.
Our experiments on MIMIC-III mortality and readmission tasks showcase the superior performance of our REALM framework over baselines.
arXiv Detail & Related papers (2024-02-10T18:27:28Z)
- MedLM: Exploring Language Models for Medical Question Answering Systems [2.84801080855027]
Large Language Models (LLMs) with their advanced generative capabilities have shown promise in various NLP tasks.
This study aims to compare the performance of general and medical-specific distilled LMs for medical Q&A.
The findings will provide valuable insights into the suitability of different LMs for specific applications in the medical domain.
arXiv Detail & Related papers (2024-01-21T03:37:47Z)
- EpilepsyLLM: Domain-Specific Large Language Model Fine-tuned with Epilepsy Medical Knowledge [28.409333447902693]
Large language models (LLMs) achieve remarkable performance in comprehensive and generative ability.
In this work, we focus on epilepsy in the Japanese language and introduce a customized LLM termed EpilepsyLLM.
The datasets contain knowledge of basic information about disease, common treatment methods and drugs, and important notes in life and work.
arXiv Detail & Related papers (2024-01-11T13:39:00Z)
- Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models [59.89454513692417]
Tabular data is often hidden in text, particularly in medical diagnostic reports.
We propose a novel, simple, and effective methodology for extracting structured tabular data from textual medical reports, called TEMED-LLM.
We demonstrate that our approach significantly outperforms state-of-the-art text classification models in medical diagnostics.
arXiv Detail & Related papers (2023-06-08T09:12:28Z)
- PMC-LLaMA: Towards Building Open-source Language Models for Medicine [62.39105735933138]
Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding.
LLMs struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge.
We describe the procedure for building a powerful, open-source language model specifically designed for medical applications, termed PMC-LLaMA.
arXiv Detail & Related papers (2023-04-27T18:29:05Z)
- An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT [80.33783969507458]
The 'Impression' section of a radiology report is a critical basis for communication between radiologists and other physicians.
Recent studies have achieved promising results in automatic impression generation using large-scale medical text data.
These models often require substantial amounts of medical text data and have poor generalization performance.
arXiv Detail & Related papers (2023-04-17T17:13:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.