Adding more data does not always help: A study in medical conversation
summarization with PEGASUS
- URL: http://arxiv.org/abs/2111.07564v1
- Date: Mon, 15 Nov 2021 07:27:35 GMT
- Title: Adding more data does not always help: A study in medical conversation
summarization with PEGASUS
- Authors: Varun Nair, Namit Katariya, Xavier Amatriain, Ilya Valmianski, Anitha
Kannan
- Abstract summary: We study the effect of dataset size on transfer learning medical conversation summarization using PEGASUS.
We also evaluate various iterative labeling strategies in the low-data regime, following their success in the classification setting.
Our work sheds light on the successes and challenges of translating low-data regime techniques in classification to medical conversation summarization.
- Score: 5.276054618115727
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Medical conversation summarization is integral in capturing information
gathered during interactions between patients and physicians. Summarized
conversations are used to facilitate patient hand-offs between physicians, and
as part of providing care in the future. Summaries, however, can be
time-consuming to produce and require domain expertise. Modern pre-trained NLP
models such as PEGASUS have emerged as capable alternatives to human
summarization, reaching state-of-the-art performance on many summarization
benchmarks. However, many downstream tasks still require at least moderately
sized datasets to achieve satisfactory performance. In this work we (1) explore
the effect of dataset size on transfer learning medical conversation
summarization using PEGASUS and (2) evaluate various iterative labeling
strategies in the low-data regime, following their success in the
classification setting. We find that model performance saturates with increase
in dataset size and that the various active-learning strategies evaluated all
show equivalent performance consistent with simple dataset size increase. We
also find that naive iterative pseudo-labeling is on-par or slightly worse than
no pseudo-labeling. Our work sheds light on the successes and challenges of
translating low-data regime techniques in classification to medical
conversation summarization and helps guide future work in this space. Relevant
code available at
\url{https://github.com/curai/curai-research/tree/main/medical-summarization-ML4H-2021}.
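The naive iterative pseudo-labeling loop that the abstract evaluates can be sketched as follows. This is a minimal illustration, not the paper's implementation: `summarize` and `fine_tune` are hypothetical placeholders standing in for generation with a fine-tuned PEGASUS model and a gradient-update step, respectively.

```python
# Sketch of naive iterative pseudo-labeling for summarization.
# `summarize` and `fine_tune` are hypothetical placeholders for a
# fine-tuned PEGASUS model and a training step on (input, summary) pairs.

def summarize(model, conversation):
    # Placeholder: a real model would generate a summary here.
    return model["prefix"] + conversation[:20]

def fine_tune(model, labeled_pairs):
    # Placeholder: a real step would update model weights on the pairs.
    return {"prefix": model["prefix"],
            "seen": model.get("seen", 0) + len(labeled_pairs)}

def iterative_pseudo_label(model, labeled, unlabeled, rounds=3):
    """Alternate between pseudo-labeling unlabeled conversations
    and retraining on gold plus pseudo-labeled pairs."""
    for _ in range(rounds):
        # 1. Generate pseudo-summaries for the unlabeled conversations.
        pseudo = [(conv, summarize(model, conv)) for conv in unlabeled]
        # 2. Retrain on gold labels plus the fresh pseudo-labels.
        model = fine_tune(model, labeled + pseudo)
    return model
```

The paper's finding is that this loop performs on par with or slightly worse than training on the gold data alone, so the sketch mainly clarifies what "naive" means here: pseudo-labels are regenerated and trusted uniformly each round, with no filtering.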
Related papers
- LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model [55.80651780294357]
State-of-the-art medical multi-modal large language models (med-MLLM) leverage instruction-following data in pre-training.
LoGra-Med is a new multi-graph alignment algorithm that enforces triplet correlations across image modalities, conversation-based descriptions, and extended captions.
Our results show LoGra-Med matches LLAVA-Med performance on 600K image-text pairs for Medical VQA and significantly outperforms it when trained on 10% of the data.
arXiv Detail & Related papers (2024-10-03T15:52:03Z) - Pseudo Label-Guided Data Fusion and Output Consistency for
Semi-Supervised Medical Image Segmentation [9.93871075239635]
We propose the PLGDF framework, which builds upon the mean teacher network for segmenting medical images with fewer annotations.
We propose a novel pseudo-label utilization scheme, which combines labeled and unlabeled data to augment the dataset effectively.
Our framework yields superior performance compared to six state-of-the-art semi-supervised learning methods.
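The mean teacher network that PLGDF builds on keeps a teacher model as an exponential moving average (EMA) of the student's weights. A minimal sketch of that update, with weights represented as plain lists of floats rather than real network parameters:

```python
def ema_update(teacher_weights, student_weights, alpha=0.99):
    """Update teacher weights as an exponential moving average
    of the student's weights.

    Each teacher weight moves a small step (1 - alpha) toward the
    corresponding student weight, smoothing the student's trajectory
    so the teacher yields more stable pseudo-label targets.
    """
    return [alpha * t + (1 - alpha) * s
            for t, s in zip(teacher_weights, student_weights)]
```

In the mean teacher setup this update runs after every student optimization step, and the teacher's predictions on unlabeled images serve as the consistency targets.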
arXiv Detail & Related papers (2023-11-17T06:36:43Z) - SHAPE: A Sample-adaptive Hierarchical Prediction Network for Medication
Recommendation [22.899946140205962]
We propose a novel Sample-adaptive Hierarchical medicAtion Prediction nEtwork, termed SHAPE, to tackle the challenges in the medication recommendation task.
Specifically, we design a compact intra-visit set encoder to encode the relationship in the medical event for obtaining visit-level representation.
To endow the model with the capability of modeling variable visit lengths, we introduce a soft curriculum learning method that automatically assigns a difficulty to each sample based on its visit length.
arXiv Detail & Related papers (2023-09-09T08:28:04Z) - MedNgage: A Dataset for Understanding Engagement in Patient-Nurse
Conversations [4.847266237348932]
Patients who effectively manage their symptoms often demonstrate higher levels of engagement in conversations and interventions with healthcare practitioners.
It is crucial for AI systems to understand the engagement in natural conversations between patients and practitioners to better contribute toward patient care.
We present a novel dataset (MedNgage) which consists of patient-nurse conversations about cancer symptom management.
arXiv Detail & Related papers (2023-05-31T16:06:07Z) - Medical Question Summarization with Entity-driven Contrastive Learning [12.008269098530386]
This paper proposes a novel medical question summarization framework using entity-driven contrastive learning (ECL).
ECL employs medical entities in frequently asked questions (FAQs) as focuses and devises an effective mechanism to generate hard negative samples.
We find that some MQA datasets suffer from serious data leakage problems, such as the iCliniq dataset's 33% duplicate rate.
arXiv Detail & Related papers (2023-04-15T00:19:03Z) - Vision-Language Modelling For Radiological Imaging and Reports In The
Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z) - Does Synthetic Data Generation of LLMs Help Clinical Text Mining? [51.205078179427645]
We investigate the potential of OpenAI's ChatGPT to aid in clinical text mining.
We propose a new training paradigm that involves generating a vast quantity of high-quality synthetic data.
Our method has resulted in significant improvements in the performance of downstream tasks.
arXiv Detail & Related papers (2023-03-08T03:56:31Z) - MedDistant19: A Challenging Benchmark for Distantly Supervised
Biomedical Relation Extraction [19.046156065686308]
Distant supervision is commonly used to tackle the scarcity of annotated data.
Bio-DSRE models can seemingly produce very accurate results in several benchmarks.
However, given the challenging nature of the task, we set out to investigate the validity of such impressive results.
arXiv Detail & Related papers (2022-04-10T22:07:25Z) - Towards Robust Partially Supervised Multi-Structure Medical Image
Segmentation on Small-Scale Data [123.03252888189546]
We propose Vicinal Labels Under Uncertainty (VLUU) to bridge the methodological gaps in partially supervised learning (PSL) under data scarcity.
Motivated by multi-task learning and vicinal risk minimization, VLUU transforms the partially supervised problem into a fully supervised problem by generating vicinal labels.
Our research suggests a new research direction in label-efficient deep learning with partial supervision.
arXiv Detail & Related papers (2020-11-28T16:31:00Z) - ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised
Medical Image Segmentation [99.90263375737362]
We propose ATSO, an asynchronous version of teacher-student optimization.
ATSO partitions the unlabeled data into two subsets and alternately uses one subset to fine-tune the model and updates the label on the other subset.
We evaluate ATSO on two popular medical image segmentation datasets and show its superior performance in various semi-supervised settings.
arXiv Detail & Related papers (2020-06-24T04:05:12Z) - Semi-supervised Medical Image Classification with Relation-driven
Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
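The prediction-consistency idea behind this self-ensembling framework, penalizing disagreement between predictions on an input and on perturbed copies of it, can be sketched as a loss term. This is a simplified illustration on plain probability lists, not the paper's relation-driven formulation; `predict` and `perturb` are hypothetical placeholders.

```python
def consistency_loss(predict, x, perturb, n_samples=2):
    """Mean squared difference between the prediction on x and the
    predictions on perturbed copies of x.

    No ground-truth label appears anywhere, which is why this term
    can be computed on unlabeled data.
    """
    base = predict(x)
    total = 0.0
    for _ in range(n_samples):
        perturbed = predict(perturb(x))
        total += sum((p - q) ** 2 for p, q in zip(base, perturbed))
    return total / n_samples
```

In training, this term is added to the supervised loss on the labeled portion of the batch, so unlabeled images contribute gradient signal purely through the consistency penalty.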
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.