UMASS_BioNLP at MEDIQA-Chat 2023: Can LLMs generate high-quality
synthetic note-oriented doctor-patient conversations?
- URL: http://arxiv.org/abs/2306.16931v1
- Date: Thu, 29 Jun 2023 13:30:41 GMT
- Title: UMASS_BioNLP at MEDIQA-Chat 2023: Can LLMs generate high-quality
synthetic note-oriented doctor-patient conversations?
- Authors: Junda Wang, Zonghai Yao, Avijit Mitra, Samuel Osebe, Zhichao Yang,
Hong Yu
- Abstract summary: This paper presents UMASS_BioNLP team participation in the MEDIQA-Chat 2023 shared task for Task-A and Task-C.
We focus especially on Task-C and propose a novel LLMs cooperation system named a doctor-patient loop to generate high-quality conversation data sets.
The experiment results demonstrate that our approaches yield reasonable performance as evaluated by automatic metrics such as ROUGE, medical concept recall, BLEU, and Self-BLEU.
- Score: 5.858602838586936
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents UMASS_BioNLP team participation in the MEDIQA-Chat 2023
shared task for Task-A and Task-C. We focus especially on Task-C and propose a
novel LLMs cooperation system named a doctor-patient loop to generate
high-quality conversation data sets. The experiment results demonstrate that
our approaches yield reasonable performance as evaluated by automatic metrics
such as ROUGE, medical concept recall, BLEU, and Self-BLEU. Furthermore, we
conducted a comparative analysis between our proposed method and ChatGPT and
GPT-4. This analysis also investigates the potential of utilizing cooperation
LLMs to generate high-quality datasets.
Related papers
- Enhancing Biomedical Knowledge Retrieval-Augmented Generation with Self-Rewarding Tree Search and Proximal Policy Optimization [50.26966969163348]
Large Language Models (LLMs) have shown great potential in the biomedical domain with the advancement of retrieval-augmented generation (RAG)
Existing retrieval-augmented approaches face challenges in addressing diverse queries and documents, particularly for medical knowledge queries.
We propose Self-Rewarding Tree Search (SeRTS) based on Monte Carlo Tree Search (MCTS) and a self-rewarding paradigm.
arXiv Detail & Related papers (2024-06-17T06:48:31Z) - Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLM) can support clinicians by analyzing medical images and engaging in natural language interactions.
VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information.
We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z) - Comparative Analysis of Open-Source Language Models in Summarizing Medical Text Data [5.443548415516227]
Large Language Models (LLMs) have demonstrated superior performance in question answering and summarization tasks on unstructured text data.
We propose an evaluation approach to analyze the performance of open-source LLMs for medical summarization tasks.
arXiv Detail & Related papers (2024-05-25T16:16:22Z) - AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce textbfAI Hospital, a framework simulating dynamic medical interactions between emphDoctor as player and NPCs.
This setup allows for realistic assessments of LLMs in clinical scenarios.
We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z) - EHR Interaction Between Patients and AI: NoteAid EHR Interaction [7.880641398866267]
This paper introduces the NoteAid EHR Interaction Pipeline, an innovative approach developed using generative LLMs to assist in patient education.
We extract datasets containing 10,000 instances from MIMIC Discharge Summaries and 876 instances from the MADE medical notes collection, respectively, executing the two tasks through the NoteAid EHR Interaction Pipeline.
Through a comprehensive evaluation of the entire dataset using LLM assessment and a rigorous manual evaluation of 64 instances, we showcase the potential of LLMs in patient education.
arXiv Detail & Related papers (2023-12-29T05:13:40Z) - Integrating UMLS Knowledge into Large Language Models for Medical
Question Answering [18.06960842747575]
Large language models (LLMs) have demonstrated powerful text generation capabilities, bringing unprecedented innovation to the healthcare field.
We develop an augmented LLM framework based on the Unified Medical Language System (UMLS), aiming to better serve the healthcare community.
We employ LLaMa2-13b-chat and ChatGPT-3.5 as our benchmark models, and conduct automatic evaluations using the ROUGE Score and BERTScore on 104 questions from the LiveQA test set.
arXiv Detail & Related papers (2023-10-04T12:50:26Z) - Augmenting Black-box LLMs with Medical Textbooks for Clinical Question
Answering [54.13933019557655]
We present a system called LLMs Augmented with Medical Textbooks (LLM-AMT)
LLM-AMT integrates authoritative medical textbooks into the LLMs' framework using plug-and-play modules.
We found that medical textbooks as a retrieval corpus is proven to be a more effective knowledge database than Wikipedia in the medical domain.
arXiv Detail & Related papers (2023-09-05T13:39:38Z) - Large Language Models for Healthcare Data Augmentation: An Example on
Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM)
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z) - Does Synthetic Data Generation of LLMs Help Clinical Text Mining? [51.205078179427645]
We investigate the potential of OpenAI's ChatGPT to aid in clinical text mining.
We propose a new training paradigm that involves generating a vast quantity of high-quality synthetic data.
Our method has resulted in significant improvements in the performance of downstream tasks.
arXiv Detail & Related papers (2023-03-08T03:56:31Z) - Large Language Models for Biomedical Knowledge Graph Construction:
Information extraction from EMR notes [0.0]
We propose an end-to-end machine learning solution based on large language models (LLMs)
The entities used in the KG construction process are diseases, factors, treatments, as well as manifestations that coexist with the patient while experiencing the disease.
The application of the proposed methodology is demonstrated on age-related macular degeneration.
arXiv Detail & Related papers (2023-01-29T15:52:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.