WisPerMed at BioLaySumm: Adapting Autoregressive Large Language Models for Lay Summarization of Scientific Articles
- URL: http://arxiv.org/abs/2405.11950v1
- Date: Mon, 20 May 2024 10:54:47 GMT
- Title: WisPerMed at BioLaySumm: Adapting Autoregressive Large Language Models for Lay Summarization of Scientific Articles
- Authors: Tabea M. G. Pakull, Hendrik Damm, Ahmad Idrissi-Yaghir, Henning Schäfer, Peter A. Horn, Christoph M. Friedrich,
- Abstract summary: This paper details the efforts of the WisPerMed team in the BioLaySumm2024 Shared Task on automatic lay summarization.
Large language models (LLMs), specifically the BioMistral and Llama3 models, were fine-tuned and employed to create lay summaries.
Experiments demonstrated that fine-tuning generally led to the best performance across most evaluated metrics.
- Score: 0.41716369948557463
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper details the efforts of the WisPerMed team in the BioLaySumm2024 Shared Task on automatic lay summarization in the biomedical domain, aimed at making scientific publications accessible to non-specialists. Large language models (LLMs), specifically the BioMistral and Llama3 models, were fine-tuned and employed to create lay summaries from complex scientific texts. The summarization performance was enhanced through various approaches, including instruction tuning, few-shot learning, and prompt variations tailored to incorporate specific context information. The experiments demonstrated that fine-tuning generally led to the best performance across most evaluated metrics. Few-shot learning notably improved the models' ability to generate relevant and factually accurate texts, particularly when using a well-crafted prompt. Additionally, a Dynamic Expert Selection (DES) mechanism to optimize the selection of text outputs based on readability and factuality metrics was developed. Out of 54 participants, the WisPerMed team reached the 4th place, measured by readability, factuality, and relevance. Determined by the overall score, our approach improved upon the baseline by approx. 5.5 percentage points and was only approx 1.5 percentage points behind the first place.
Related papers
- MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension [59.41495657570397]
We collected a multimodal, multidisciplinary dataset from open-access scientific articles published in Nature Communications journals.
This dataset spans 72 scientific disciplines, ensuring both diversity and quality.
We created benchmarks with various tasks and settings to comprehensively evaluate LMMs' capabilities in understanding scientific figures and content.
arXiv Detail & Related papers (2024-07-06T00:40:53Z) - Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We propose a systematically created human-annotated dataset consisting of coherent summaries for five publicly available datasets and natural language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (10% Rouge-L) in terms of producing coherent summaries.
arXiv Detail & Related papers (2024-07-05T20:25:04Z) - SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature [80.49349719239584]
We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks.
SciRIFF is the first dataset focused on extracting and synthesizing information from research literature across a wide range of scientific fields.
arXiv Detail & Related papers (2024-06-10T21:22:08Z) - WisPerMed at "Discharge Me!": Advancing Text Generation in Healthcare with Large Language Models, Dynamic Expert Selection, and Priming Techniques on MIMIC-IV [0.38084074204911494]
This study aims to leverage state of the art language models to automate generating the "Brief Hospital Course" and "Discharge Instructions" sections of Discharge Summaries.
We investigate how automation can improve documentation accuracy, alleviate clinician burnout, and enhance operational efficacy in healthcare facilities.
arXiv Detail & Related papers (2024-05-18T10:56:45Z) - Classification and Clustering of Sentence-Level Embeddings of Scientific Articles Generated by Contrastive Learning [1.104960878651584]
Our approach consists of fine-tuning transformer language models to generate sentence-level embeddings from scientific articles.
We trained our models on three datasets with contrastive learning.
We show that fine-tuning sentence transformers with contrastive learning and using the generated embeddings in downstream tasks is a feasible approach to sentence classification in scientific articles.
arXiv Detail & Related papers (2024-03-30T02:52:14Z) - Information-Theoretic Distillation for Reference-less Summarization [67.51150817011617]
We present a novel framework to distill a powerful summarizer based on the information-theoretic objective for summarization.
We start off from Pythia-2.8B as the teacher model, which is not yet capable of summarization.
We arrive at a compact but powerful summarizer with only 568M parameters that performs competitively against ChatGPT.
arXiv Detail & Related papers (2024-03-20T17:42:08Z) - Exploring the Effectiveness of Instruction Tuning in Biomedical Language
Processing [19.41164870575055]
This study investigates the potential of instruction tuning for biomedical language processing.
We present a comprehensive, instruction-based model trained on a dataset that consists of approximately $200,000$ instruction-focused samples.
arXiv Detail & Related papers (2023-12-31T20:02:10Z) - Diversifying Knowledge Enhancement of Biomedical Language Models using
Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z) - Large Language Models and Prompt Engineering for Biomedical Query
Focused Multi-Document Summarisation [0.565658124285176]
This paper reports on the use of prompt engineering and GPT-3.5 for biomedical query-focused multi-document summarisation.
Using GPT-3.5 and appropriate prompts, our system top ROUGE-F1 results in the task of obtaining short-paragraph-sized answers to biomedical questions.
arXiv Detail & Related papers (2023-11-09T06:45:04Z) - Neural Summarization of Electronic Health Records [8.784162652042957]
We studied the viability of the automatic generation of various sections of a discharge summary using four state-of-the-art neural network summarization models.
Fine-tuning language models that were previously instruction fine-tuned showed better performance in summarizing entire reports.
arXiv Detail & Related papers (2023-05-24T15:05:53Z) - PMC-LLaMA: Towards Building Open-source Language Models for Medicine [62.39105735933138]
Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding.
LLMs struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge.
We describe the procedure for building a powerful, open-source language model specifically designed for medicine applications, termed as PMC-LLaMA.
arXiv Detail & Related papers (2023-04-27T18:29:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.