Related papers: CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models

URL: http://arxiv.org/abs/2306.05317v1
Date: Thu, 8 Jun 2023 16:08:10 GMT
Title: CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models
Authors: Potsawee Manakul, Yassir Fathullah, Adian Liusie, Vyas Raina, Vatsal Raina, Mark Gales
Abstract summary: We consider the challenge of summarizing patients' medical progress notes in a limited data setting. For the Problem List Summarization (shared task 1A) at the BioNLP Workshop 2023, we demonstrate that Clinical-T5 fine-tuned on 765 medical clinic notes outperforms other extractive, abstractive and zero-shot baselines.
Score: 8.237131071390715
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we consider the challenge of summarizing patients' medical progress notes in a limited data setting. For the Problem List Summarization (shared task 1A) at the BioNLP Workshop 2023, we demonstrate that Clinical-T5 fine-tuned on 765 medical clinic notes outperforms other extractive, abstractive and zero-shot baselines, yielding reasonable baseline systems for medical note summarization. Further, we introduce Hierarchical Ensemble of Summarization Models (HESM), consisting of token-level ensembles of diverse fine-tuned Clinical-T5 models, followed by Minimum Bayes Risk (MBR) decoding. Our HESM approach led to a considerable summarization performance boost, and when evaluated on held-out challenge data achieved a ROUGE-L of 32.77, which was the best-performing system at the top of the shared task leaderboard.
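
The two stages named in the abstract (a token-level ensemble, then MBR decoding) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the uniform averaging of model distributions, the whitespace tokenization, and the use of an LCS-based ROUGE-L F1 as the MBR similarity function are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def average_token_distributions(dists: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Token-level ensemble (assumed uniform weights): average the
    next-token probabilities that several models assign at one decoding step."""
    vocab = set().union(*dists)
    return {tok: sum(d.get(tok, 0.0) for d in dists) / len(dists) for tok in vocab}


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-L F1 over whitespace tokens (no stemming)."""
    c, r = candidate.split(), reference.split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


def mbr_select(candidates: List[str]) -> str:
    """MBR decoding sketch: pick the candidate with the highest mean
    similarity to all other candidates, treating them as equally likely."""
    def expected_gain(i: int) -> float:
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(rouge_l_f1(candidates[i], o) for o in others) / max(len(others), 1)

    return candidates[max(range(len(candidates)), key=expected_gain)]


if __name__ == "__main__":
    # Hypothetical problem-list outputs from different ensemble members.
    outputs = [
        "sepsis ; acute renal failure",
        "sepsis ; renal failure",
        "sepsis ; renal failure ; anemia",
        "pneumonia",
    ]
    print(mbr_select(outputs))
```

In this sketch the selected summary is the one that agrees most with the other candidates, so an outlier output is unlikely to be chosen. A real system would draw the candidates from the fine-tuned models and might weight them by model probability rather than uniformly.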

Related papers

CSTRL: Context-Driven Sequential Transfer Learning for Abstractive Radiology Report Summarization [0.37109226820205005]
A radiology report comprises several sections, including the Findings and Impression of the diagnosis. Pretrained models that excel in common abstractive summarization problems encounter challenges when applied to specialized medical domains. We introduce a sequential transfer learning approach that ensures key content extraction and coherent summarization.
arXiv Detail & Related papers (2025-02-21T08:32:11Z)
Medchain: Bridging the Gap Between LLM Agents and Clinical Practice through Interactive Sequential Benchmarking [58.25862290294702]
We present MedChain, a dataset of 12,163 clinical cases that covers five key stages of clinical workflow. We also propose MedChain-Agent, an AI system that integrates a feedback mechanism and an MCase-RAG module to learn from previous cases and adapt its responses.
arXiv Detail & Related papers (2024-12-02T15:25:02Z)
Enhanced Electronic Health Records Text Summarization Using Large Language Models [0.0]
This project builds on prior work by creating a system that generates clinician-preferred, focused summaries. The proposed system leverages the Flan-T5 model to generate tailored EHR summaries based on clinician-specified topics.
arXiv Detail & Related papers (2024-10-12T19:36:41Z)
Towards Evaluating and Building Versatile Large Language Models for Medicine [57.49547766838095]
We present MedS-Bench, a benchmark designed to evaluate the performance of large language models (LLMs) in clinical contexts. MedS-Bench spans 11 high-level clinical tasks, including clinical report summarization, treatment recommendations, diagnosis, named entity recognition, and medical concept explanation. MedS-Ins comprises 58 medically oriented language corpora, totaling 13.5 million samples across 122 tasks.
arXiv Detail & Related papers (2024-08-22T17:01:34Z)
Generating Faithful and Complete Hospital-Course Summaries from the Electronic Health Record [3.6513957125331555]
An unintended consequence of the increased documentation burden has been reduced face-time with patients. We propose and evaluate automated solutions for generating a summary of a patient's hospital admissions.
arXiv Detail & Related papers (2024-04-01T15:47:21Z)
Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We explore training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology. For training, we assemble a large dataset of over 697 thousand radiology image-text pairs. For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation. The inference of LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
Overview of the Problem List Summarization (ProbSum) 2023 Shared Task on Summarizing Patients' Active Diagnoses and Problems from Electronic Health Record Progress Notes [5.222442967088892]
The BioNLP Workshop 2023 launched a shared task on Problem List Summarization (ProbSum). The goal for participants is to develop models that generate a list of diagnoses and problems using input from the daily care notes collected from the hospitalization of critically ill patients. Eight teams submitted their final systems to the shared task leaderboard.
arXiv Detail & Related papers (2023-06-08T15:19:57Z)
PULSAR: Pre-training with Extracted Healthcare Terms for Summarising Patients' Problems and Data Augmentation with Black-box Large Language Models [25.363775123262307]
Automatic summarisation of a patient's problems in the form of a problem list can aid stakeholders in understanding a patient's condition, reducing workload and cognitive bias. BioNLP 2023 Shared Task 1A focuses on generating a list of diagnoses and problems from the provider's progress notes during hospitalisation. One component employs large language models (LLMs) for data augmentation; the other is an abstractive summarisation LLM with a novel pre-training objective for generating the patients' problems summarised as a list. Our approach was ranked second among all submissions to the shared task.
arXiv Detail & Related papers (2023-06-05T10:17:50Z)
COLO: A Contrastive Learning based Re-ranking Framework for One-Stage Summarization [84.70895015194188]
We propose a Contrastive Learning based re-ranking framework for one-stage summarization called COLO. COLO boosts the extractive and abstractive results of one-stage systems on CNN/DailyMail benchmark to 44.58 and 46.33 ROUGE-1 score.
arXiv Detail & Related papers (2022-09-29T06:11:21Z)
WSSS4LUAD: Grand Challenge on Weakly-supervised Tissue Semantic Segmentation for Lung Adenocarcinoma [51.50991881342181]
This challenge includes 10,091 patch-level annotations and over 130 million labeled pixels. The first-place team achieved an mIoU of 0.8413 (tumor: 0.8389, stroma: 0.7931, normal: 0.8919).
arXiv Detail & Related papers (2022-04-13T15:27:05Z)
Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching. We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders. We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z)
Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community. We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification. We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)
Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization [22.062385543743293]
The sequence-to-sequence (seq2seq) network is a well-established model for text summarization tasks. In this paper, we approach the content selection problem for clinical abstractive summarization by augmenting salient ontological terms into the summarizer.
arXiv Detail & Related papers (2020-05-01T01:12:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.