Adapting Biomedical Abstracts into Plain language using Large Language Models
- URL: http://arxiv.org/abs/2501.15700v1
- Date: Sun, 26 Jan 2025 23:07:35 GMT
- Title: Adapting Biomedical Abstracts into Plain language using Large Language Models
- Authors: Haritha Gangavarapu, Giridhar Kaushik Ramachandran, Kevin Lybarger, Meliha Yetisgen, Özlem Uzuner
- Abstract summary: The goal of the Plain Language Adaptation of Biomedical Abstracts (PLABA) track is to adapt English-language biomedical abstracts extracted from PubMed, based on questions asked on MedlinePlus, into plain language for the general public at the sentence level.
We leveraged the best open-source Large Language Models suited to and fine-tuned for dialog use cases.
Our top-performing GPT-4-based model ranked first on the average simplicity measure and third on the average accuracy measure.
- Score: 8.497188292342052
- Abstract: A vast amount of medical knowledge is available for public use through online health forums and question-answering platforms on social media. The majority of the population in the United States does not have the level of health literacy needed to make the best use of that information. Health literacy is the ability to obtain and comprehend basic health information in order to make appropriate health decisions. To bridge this gap, organizations advocate adapting this medical knowledge into plain language. Building robust systems that automate these adaptations helps both medical and non-medical professionals make the best use of the information available online. The goal of the Plain Language Adaptation of Biomedical Abstracts (PLABA) track is to adapt English-language biomedical abstracts extracted from PubMed, based on questions asked on MedlinePlus, into plain language for the general public at the sentence level. As part of this track, we leveraged the best open-source Large Language Models suited to and fine-tuned for dialog use cases. We compare and present the results for all of our systems, along with our ranking among the other participants' submissions. Our top-performing GPT-4-based model ranked first on the average simplicity measure and third on the average accuracy measure.
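To make the sentence-level adaptation concrete, here is a minimal sketch of prompting an open-source, dialog-tuned LLM to rewrite a single abstract sentence in plain language. The model name, prompt wording, and example sentence are illustrative assumptions, not the authors' exact setup.

```python
# A minimal sketch (not the authors' exact pipeline): prompt an
# open-source, dialog-tuned LLM to rewrite one biomedical sentence
# in plain language. Model choice, prompt, and sentence are assumptions.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",  # illustrative dialog-tuned model
)

sentence = (
    "Metformin monotherapy reduced HbA1c relative to placebo in adults "
    "with type 2 diabetes mellitus."
)
prompt = (
    "Rewrite the following biomedical sentence in plain language that a "
    "general reader can understand, keeping the meaning intact:\n"
    f"{sentence}\nPlain-language version:"
)

result = generator(prompt, max_new_tokens=80, do_sample=False)
print(result[0]["generated_text"])
```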
Related papers
- IIMedGPT: Promoting Large Language Model Capabilities of Medical Tasks by Efficient Human Preference Alignment [6.022433954095106]
We introduce a medical instruction dataset, CMedINS, containing six types of medical instructions derived from actual medical tasks.
We then launch our medical model, IIMedGPT, employing an efficient preference alignment method.
The results show that our final model outperforms existing medical models in medical dialogue.
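The summary does not name the alignment method; as one widely used, reward-model-free example of efficient preference alignment, here is a sketch of the Direct Preference Optimization (DPO) loss. Treating DPO as the paper's method is an assumption.

```python
# Illustrative DPO loss; whether IIMedGPT uses DPO is an assumption.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Per-batch DPO loss from sequence log-probabilities."""
    # Log-ratio of the trained policy to the frozen reference model
    # for the preferred and rejected responses.
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    # Push preferred responses to out-score rejected ones, scaled by beta.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```

In a medical-dialog setting, the chosen and rejected responses would come from human preference annotations over model outputs.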
arXiv Detail & Related papers (2025-01-06T09:22:36Z)
- IMAS: A Comprehensive Agentic Approach to Rural Healthcare Delivery [0.0]
This paper proposes an advanced agentic medical assistant system designed to improve healthcare delivery in rural areas.
The system is composed of several crucial components, including translation, medical complexity assessment, expert network integration, and response simplification.
Evaluation results using the MedQA, PubMedQA, and JAMA datasets demonstrate that this integrated approach significantly enhances the effectiveness of rural healthcare workers.
arXiv Detail & Related papers (2024-10-13T23:07:11Z)
- Efficient Medical Question Answering with Knowledge-Augmented Question Generation [5.145812785735094]
We introduce a method to improve the proficiency of a small language model in the medical domain by employing a two-fold approach.
We first fine-tune the model on a corpus of medical textbooks.
Then, we use GPT-4 to generate questions similar to the downstream task, prompted with textbook knowledge, and use them to fine-tune the model.
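A minimal sketch of this knowledge-prompted question-generation step, assuming the OpenAI chat API; the prompt wording and textbook passage are illustrative, not taken from the paper.

```python
# Sketch of generating a training question from a textbook passage
# with GPT-4; prompt and passage are invented for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

textbook_passage = (
    "Beta-blockers reduce myocardial oxygen demand by lowering heart "
    "rate and contractility."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": (
            "Using the passage below, write one exam-style multiple-choice "
            "question with four options and mark the correct answer.\n\n"
            f"Passage: {textbook_passage}"
        ),
    }],
)
print(response.choices[0].message.content)
```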
arXiv Detail & Related papers (2024-05-23T14:53:52Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical knowledge graph OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
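A generic bottleneck adapter of the kind this approach relies on can be sketched in a few lines of PyTorch; the dimensions and placement below are assumptions, not the paper's exact configuration.

```python
# Generic bottleneck adapter for lightweight knowledge injection;
# sizes and placement are illustrative, not the paper's settings.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Down-project, non-linearity, up-project, residual connection."""
    def __init__(self, hidden_size=768, bottleneck_size=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states):
        # The residual keeps the frozen PLM's representation intact;
        # only the small adapter weights are trained on KG-derived data,
        # which is what keeps compute requirements low.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

adapter = BottleneckAdapter()
out = adapter(torch.randn(2, 16, 768))  # (batch, seq_len, hidden)
```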
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences [51.66185471742271]
We propose ChiMed-GPT, a benchmark LLM designed explicitly for the Chinese medical domain.
ChiMed-GPT undergoes a comprehensive training regime with pre-training, SFT, and RLHF.
We analyze possible biases by prompting ChiMed-GPT to respond to attitude scales regarding discrimination against patients.
arXiv Detail & Related papers (2023-11-10T12:25:32Z)
- LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day [85.19963303642427]
We propose a cost-efficient approach for training a vision-language conversational assistant that can answer open-ended research questions of biomedical images.
The model first learns to align biomedical vocabulary using the figure-caption pairs as is, then learns to master open-ended conversational semantics.
This enables us to train a Large Language and Vision Assistant for BioMedicine in less than 15 hours (with eight A100s).
arXiv Detail & Related papers (2023-06-01T16:50:07Z)
- PMC-LLaMA: Towards Building Open-source Language Models for Medicine [62.39105735933138]
Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding.
LLMs struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge.
We describe the procedure for building a powerful, open-source language model specifically designed for medical applications, termed PMC-LLaMA.
arXiv Detail & Related papers (2023-04-27T18:29:05Z)
- HealthE: Classifying Entities in Online Textual Health Advice [0.0]
We release a new annotated dataset, HealthE, consisting of 6,756 health advice statements.
HealthE has a more granular label space compared to existing medical NER corpora.
We introduce a new health entity classification model, EP S-BERT, which leverages textual context patterns in the classification of entities.
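As a rough, generic sketch of context-aware entity classification (not the paper's EP S-BERT), the snippet below classifies a marked entity mention with a BERT encoder; the label set and marker scheme are invented for illustration.

```python
# Generic context-aware entity classification sketch; labels, markers,
# and the general-domain checkpoint are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

labels = ["food", "drug", "activity", "other"]  # invented label set
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(labels)
)  # classification head is untrained here; fine-tune on HealthE in practice

# Entity mention wrapped in marker tokens so the model sees its context.
text = "Drink [E] ginger tea [/E] to ease nausea."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[logits.argmax(dim=-1).item()])
```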
arXiv Detail & Related papers (2022-10-06T23:18:24Z)
- Towards more patient friendly clinical notes through language models and ontologies [57.51898902864543]
We present a novel approach to automated medical text simplification based on word-level simplification and language modelling.
We use a new dataset of pairs of publicly available medical sentences and versions of them simplified by clinicians.
Our method, based on a language model trained on medical forum data, generates simpler sentences while preserving both grammar and the original meaning.
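A minimal sketch of one way to realize LM-guided word simplification, assuming a masked language model ranks candidate substitutes in context; the sentence, candidate list, and general-domain checkpoint are illustrative, not the paper's forum-trained model.

```python
# Sketch: score candidate substitutes for a complex word with a masked
# LM; candidates would normally come from a lexicon or ontology.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

sentence = "The patient reported persistent [MASK] in the abdomen."
candidates = ["pain", "ache", "discomfort"]  # e.g. substitutes for "algesia"

inputs = tokenizer(sentence, return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

# Prefer the substitute the LM finds most natural in this context.
scores = {c: logits[tokenizer.convert_tokens_to_ids(c)].item()
          for c in candidates}
print(max(scores, key=scores.get))
```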
arXiv Detail & Related papers (2021-12-23T16:11:19Z)
- Assessing the Severity of Health States based on Social Media Posts [62.52087340582502]
We propose a multiview learning framework that models both the textual content as well as contextual-information to assess the severity of the user's health state.
The diverse NLU views demonstrate effectiveness on both tasks, as well as on individual diseases, in assessing a user's health.
arXiv Detail & Related papers (2020-09-21T03:45:14Z)