A Survey for Biomedical Text Summarization: From Pre-trained to Large
Language Models
- URL: http://arxiv.org/abs/2304.08763v2
- Date: Thu, 13 Jul 2023 04:13:17 GMT
- Title: A Survey for Biomedical Text Summarization: From Pre-trained to Large
Language Models
- Authors: Qianqian Xie and Zheheng Luo and Benyou Wang and Sophia Ananiadou
- Abstract summary: We present a systematic review of recent advancements in biomedical text summarization.
We discuss existing challenges and promising future directions in the era of large language models.
To facilitate the research community, we compile open resources, including available datasets, recent approaches, code, evaluation metrics, and a leaderboard, in a public project.
- Score: 21.516351027053705
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The exponential growth of biomedical texts, such as biomedical literature and
electronic health records (EHRs), poses a significant challenge for clinicians
and researchers who need to access clinical information efficiently. To tackle this
challenge, biomedical text summarization (BTS) has been proposed as a solution
to support clinical information retrieval and management. BTS aims at
generating concise summaries that distill key information from single or
multiple biomedical documents. In recent years, the rapid advancement of
fundamental natural language processing (NLP) techniques, from pre-trained
language models (PLMs) to large language models (LLMs), has greatly facilitated
the progress of BTS. This growth has led to numerous proposed summarization
methods, datasets, and evaluation metrics, raising the need for a comprehensive
and up-to-date survey for BTS. In this paper, we present a systematic review of
recent advancements in BTS, leveraging cutting-edge NLP techniques from PLMs to
LLMs, to help understand the latest progress, challenges, and future
directions. We begin by introducing the foundational concepts of BTS, PLMs and
LLMs, followed by an in-depth review of available datasets, recent approaches,
and evaluation metrics in BTS. We finally discuss existing challenges and
promising future directions in the era of LLMs. To facilitate the research
community, we compile open resources, including available datasets, recent
approaches, code, evaluation metrics, and a leaderboard, in a public project:
https://github.com/KenZLuo/Biomedical-Text-Summarization-Survey/tree/master. We
believe this survey will be a valuable resource for researchers, helping them
quickly track recent advances and offering guidance for future BTS research.
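As a minimal illustration of the extractive side of BTS (a toy frequency-based baseline, not a method from the survey), sentences can be scored by the average frequency of their content words; the sentence splitter, stop-word list, and scoring below are all simplifying assumptions:

```python
import re
from collections import Counter

def extractive_summary(text, n_sentences=2):
    """Score sentences by average content-word frequency and keep the top n,
    preserving their original order. A toy baseline for illustration only."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z]+", text.lower())
    stop = {"the", "a", "an", "of", "in", "and", "to", "is", "are", "for", "that"}
    freq = Counter(w for w in words if w not in stop)

    def score(sentence):
        toks = [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in stop]
        return sum(freq[w] for w in toks) / max(len(toks), 1)

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    # Emit selected sentences in their original document order.
    return " ".join(s for s in sentences if s in top)
```

PLM- and LLM-based systems replace this word-count heuristic with learned representations, but the select-and-order skeleton is the same.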
Related papers
- A Survey for Large Language Models in Biomedicine [31.719451674137844]
This review is based on an analysis of 484 publications sourced from databases including PubMed, Web of Science, and arXiv.
We explore the capabilities of LLMs in zero-shot learning across a broad spectrum of biomedical tasks, including diagnostic assistance, drug discovery, and personalized medicine.
We discuss the challenges that LLMs face in the biomedicine domain including data privacy concerns, limited model interpretability, issues with dataset quality, and ethics.
arXiv Detail & Related papers (2024-08-29T12:39:16Z)
- SeRTS: Self-Rewarding Tree Search for Biomedical Retrieval-Augmented Generation [50.26966969163348]
Large Language Models (LLMs) have shown great potential in the biomedical domain with the advancement of retrieval-augmented generation (RAG).
Existing retrieval-augmented approaches face challenges in addressing diverse queries and documents, particularly for medical knowledge queries.
We propose Self-Rewarding Tree Search (SeRTS) based on Monte Carlo Tree Search (MCTS) and a self-rewarding paradigm.
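A heavily simplified, one-level (bandit-style) sketch of that idea: candidates are selected by a UCB1 rule and scored by a self-assessment function. Here `self_reward` is a hypothetical stand-in for the model scoring its own retrieval candidates; this is not the SeRTS algorithm itself:

```python
import math
import random

def mcts_best_query(candidates, self_reward, iterations=200, c=1.4):
    """Toy one-level Monte Carlo search: repeatedly pick a candidate query by
    UCB1, sample a noisy 'self-reward', and return the candidate with the
    highest mean reward. A sketch of the idea, not the SeRTS algorithm."""
    visits = {q: 0 for q in candidates}
    total = {q: 0.0 for q in candidates}
    for t in range(1, iterations + 1):
        def ucb(q):
            if visits[q] == 0:
                return float("inf")  # explore each unvisited arm first
            return total[q] / visits[q] + c * math.sqrt(math.log(t) / visits[q])
        q = max(candidates, key=ucb)
        r = self_reward(q) + random.gauss(0, 0.05)  # noisy self-assessment
        visits[q] += 1
        total[q] += r
    return max(candidates, key=lambda q: total[q] / visits[q])
```

The full method expands a tree of query reformulations rather than a flat candidate list, but the explore-exploit selection and self-scored rollouts follow the same pattern.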
arXiv Detail & Related papers (2024-06-17T06:48:31Z)
- MedREQAL: Examining Medical Knowledge Recall of Large Language Models via Question Answering [5.065947993017158]
Large Language Models (LLMs) have demonstrated an impressive ability to encode knowledge during pre-training on large text corpora.
We examine the capability of LLMs to exhibit medical knowledge recall by constructing a novel dataset derived from systematic reviews.
arXiv Detail & Related papers (2024-06-09T16:33:28Z)
- Large Language Models in Biomedical and Health Informatics: A Review with Bibliometric Analysis [24.532570258954898]
Large Language Models (LLMs) have rapidly become important tools in Biomedical and Health Informatics (BHI).
This study aims to provide a comprehensive overview of LLM applications in BHI, highlighting their transformative potential and addressing the associated ethical and practical challenges.
arXiv Detail & Related papers (2024-03-24T21:29:39Z)
- An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks.
These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems.
Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
arXiv Detail & Related papers (2024-02-21T11:27:31Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical KG OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
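The adapter idea can be sketched, very roughly, as a small bottleneck layer with a residual connection inserted into a frozen PLM. The plain-Python code below is an illustrative assumption about the general adapter architecture, not the authors' implementation:

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def adapter(x, W_down, W_up):
    """Bottleneck adapter: down-project, apply ReLU, up-project, then add
    the residual input. With zero weights it reduces to the identity, which
    is why adapters can be inserted into a frozen PLM without disrupting it."""
    h = [max(0.0, v) for v in matvec(W_down, x)]   # ReLU in the bottleneck
    u = matvec(W_up, h)                            # project back to input size
    return [xi + ui for xi, ui in zip(x, u)]       # residual connection
```

Because only the small down/up projection matrices are trained, the compute cost stays low relative to fine-tuning the full PLM, consistent with the efficiency claim above.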
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- Opportunities and Challenges for ChatGPT and Large Language Models in Biomedicine and Health [22.858424132819795]
ChatGPT has led to the emergence of diverse applications in the field of biomedicine and health.
We explore the areas of biomedical information retrieval, question answering, medical text summarization, and medical education.
We find that significant advances have been made in text generation tasks, surpassing previous state-of-the-art methods.
arXiv Detail & Related papers (2023-06-15T20:19:08Z)
- LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day [85.19963303642427]
We propose a cost-efficient approach for training a vision-language conversational assistant that can answer open-ended research questions of biomedical images.
The model first learns to align biomedical vocabulary using the figure-caption pairs as is, then learns to master open-ended conversational semantics.
This enables us to train a Large Language and Vision Assistant for BioMedicine in less than 15 hours (with eight A100s).
arXiv Detail & Related papers (2023-06-01T16:50:07Z)
- Pre-trained Language Models in Biomedical Domain: A Systematic Survey [33.572502204216256]
Pre-trained language models (PLMs) have been the de facto paradigm for most natural language processing (NLP) tasks.
This paper summarizes the recent progress of pre-trained language models in the biomedical domain and their applications in biomedical downstream tasks.
arXiv Detail & Related papers (2021-10-11T05:30:30Z)
- CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark [51.38557174322772]
We present the first Chinese Biomedical Language Understanding Evaluation benchmark.
It is a collection of natural language understanding tasks including named entity recognition, information extraction, clinical diagnosis normalization, single-sentence/sentence-pair classification.
We report empirical results for the current 11 pre-trained Chinese models; the experiments show that state-of-the-art neural models still perform far below the human ceiling.
arXiv Detail & Related papers (2021-06-15T12:25:30Z)
- An Analysis of a BERT Deep Learning Strategy on a Technology Assisted Review Task [91.3755431537592]
Document screening is a central task within Evidenced Based Medicine.
I propose a deep learning (DL) document classification approach with BERT or PubMedBERT embeddings and a DL similarity search path.
I test and evaluate the retrieval effectiveness of my DL strategy on the 2017 and 2018 CLEF eHealth collections.
arXiv Detail & Related papers (2021-04-16T19:45:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.