MedSumm: A Multimodal Approach to Summarizing Code-Mixed Hindi-English
Clinical Queries
- URL: http://arxiv.org/abs/2401.01596v1
- Date: Wed, 3 Jan 2024 07:58:25 GMT
- Title: MedSumm: A Multimodal Approach to Summarizing Code-Mixed Hindi-English
Clinical Queries
- Authors: Akash Ghosh, Arkadeep Acharya, Prince Jha, Aniket Gaudgaul, Rajdeep
Majumdar, Sriparna Saha, Aman Chadha, Raghav Jain, Setu Sinha, and Shivani
Agarwal
- Abstract summary: We introduce the Multimodal Medical Codemixed Question Summarization MMCQS dataset.
This dataset combines Hindi-English codemixed medical queries with visual aids.
Our dataset, code, and pre-trained models will be made publicly available.
- Score: 16.101969130235055
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the healthcare domain, summarizing medical questions posed by patients is
critical for improving doctor-patient interactions and medical decision-making.
Although medical data has grown in complexity and quantity, the current body of
research in this domain has primarily concentrated on text-based methods,
overlooking the integration of visual cues. Also prior works in the area of
medical question summarisation have been limited to the English language. This
work introduces the task of multimodal medical question summarization for
codemixed input in a low-resource setting. To address this gap, we introduce
the Multimodal Medical Codemixed Question Summarization MMCQS dataset, which
combines Hindi-English codemixed medical queries with visual aids. This
integration enriches the representation of a patient's medical condition,
providing a more comprehensive perspective. We also propose a framework named
MedSumm that leverages the power of LLMs and VLMs for this task. By utilizing
our MMCQS dataset, we demonstrate the value of integrating visual information
from images to improve the creation of medically detailed summaries. This
multimodal strategy not only improves healthcare decision-making but also
promotes a deeper comprehension of patient queries, paving the way for future
exploration in personalized and responsive medical care. Our dataset, code, and
pre-trained models will be made publicly available.
Related papers
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z) - HyperFusion: A Hypernetwork Approach to Multimodal Integration of Tabular and Medical Imaging Data for Predictive Modeling [4.44283662576491]
We present a novel framework based on hypernetworks to fuse clinical imaging and tabular data by conditioning the image processing on the EHR's values and measurements.
We show that our framework outperforms both single-modality models and state-of-the-art MRI-tabular data fusion methods.
arXiv Detail & Related papers (2024-03-20T05:50:04Z) - MedInsight: A Multi-Source Context Augmentation Framework for Generating
Patient-Centric Medical Responses using Large Language Models [3.0874677990361246]
Large Language Models (LLMs) have shown impressive capabilities in generating human-like responses.
We propose MedInsight:a novel retrieval framework that augments LLM inputs with relevant background information.
Experiments on the MTSamples dataset validate MedInsight's effectiveness in generating contextually appropriate medical responses.
arXiv Detail & Related papers (2024-03-13T15:20:30Z) - MedKP: Medical Dialogue with Knowledge Enhancement and Clinical Pathway
Encoding [48.348511646407026]
We introduce the Medical dialogue with Knowledge enhancement and clinical Pathway encoding framework.
The framework integrates an external knowledge enhancement module through a medical knowledge graph and an internal clinical pathway encoding via medical entities and physician actions.
arXiv Detail & Related papers (2024-03-11T10:57:45Z) - AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce textbfAI Hospital, a framework simulating dynamic medical interactions between emphDoctor as player and NPCs.
This setup allows for realistic assessments of LLMs in clinical scenarios.
We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z) - Developing ChatGPT for Biology and Medicine: A Complete Review of
Biomedical Question Answering [25.569980942498347]
ChatGPT explores a strategic blueprint of question answering (QA) in delivering medical diagnosis, treatment recommendations, and other healthcare support.
This is achieved through the increasing incorporation of medical domain data via natural language processing (NLP) and multimodal paradigms.
arXiv Detail & Related papers (2024-01-15T07:21:16Z) - CLIPSyntel: CLIP and LLM Synergy for Multimodal Question Summarization
in Healthcare [16.033112094191395]
We introduce the Multimodal Medical Question Summarization (MMQS) dataset.
This dataset pairs medical queries with visual aids, facilitating a richer and more nuanced understanding of patient needs.
We also propose a framework, consisting of four modules that identify medical disorders, generate relevant context, filter medical concepts, and craft visually aware summaries.
arXiv Detail & Related papers (2023-12-16T03:02:05Z) - ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences [51.66185471742271]
We propose ChiMed-GPT, a benchmark LLM designed explicitly for Chinese medical domain.
ChiMed-GPT undergoes a comprehensive training regime with pre-training, SFT, and RLHF.
We analyze possible biases through prompting ChiMed-GPT to perform attitude scales regarding discrimination of patients.
arXiv Detail & Related papers (2023-11-10T12:25:32Z) - Multi-task Paired Masking with Alignment Modeling for Medical
Vision-Language Pre-training [55.56609500764344]
We propose a unified framework based on Multi-task Paired Masking with Alignment (MPMA) to integrate the cross-modal alignment task into the joint image-text reconstruction framework.
We also introduce a Memory-Augmented Cross-Modal Fusion (MA-CMF) module to fully integrate visual information to assist report reconstruction.
arXiv Detail & Related papers (2023-05-13T13:53:48Z) - PMC-LLaMA: Towards Building Open-source Language Models for Medicine [62.39105735933138]
Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding.
LLMs struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge.
We describe the procedure for building a powerful, open-source language model specifically designed for medicine applications, termed as PMC-LLaMA.
arXiv Detail & Related papers (2023-04-27T18:29:05Z) - Knowledge-Empowered Representation Learning for Chinese Medical Reading
Comprehension: Task, Model and Resources [36.960318276653986]
We introduce a multi-target MRC task for the medical domain, whose goal is to predict answers to medical questions and the corresponding support sentences simultaneously.
We propose the Chinese medical BERT model for the task (CMedBERT), which fuses medical knowledge into pre-trained language models.
Experiments show that CMedBERT consistently outperforms strong baselines by fusing context-aware and knowledge-aware token representations.
arXiv Detail & Related papers (2020-08-24T11:23:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.