Related papers: Advancing Biomedical Text Mining with Community Challenges

URL: http://arxiv.org/abs/2403.04261v1
Date: Thu, 7 Mar 2024 06:52:51 GMT
Title: Advancing Biomedical Text Mining with Community Challenges
Authors: Hui Zong, Rongrong Wu, Jiaxue Cha, Erman Wu, Jiakun Li, Liang Tao, Zuofeng Li, Buzhou Tang, Bairong Shen
Abstract summary: The field of biomedical research has witnessed a significant increase in the accumulation of vast amounts of textual data. Biomedical text mining, also known as biomedical natural language processing, has garnered great attention. Community challenge evaluation competitions have played an important role in promoting technology innovation.
Score: 5.955528108993928
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: The field of biomedical research has witnessed a significant increase in the accumulation of vast amounts of textual data from various sources such as scientific literature, electronic health records, clinical trial reports, and social media. However, manually processing and analyzing these extensive and complex resources is time-consuming and inefficient. To address this challenge, biomedical text mining, also known as biomedical natural language processing, has garnered great attention. Community challenge evaluation competitions have played an important role in promoting technology innovation and interdisciplinary collaboration in biomedical text mining research. These challenges provide platforms for researchers to develop state-of-the-art solutions for data mining and information processing in biomedical research. In this article, we review the recent advances in community challenges specific to Chinese biomedical text mining. First, we collect information on these evaluation tasks, such as data sources and task types. Second, we conduct a systematic summary and comparative analysis, covering named entity recognition, entity normalization, attribute extraction, relation extraction, event extraction, text classification, text similarity, knowledge graph construction, question answering, text generation, and large language model evaluation. Third, we summarize the potential clinical applications of these community challenge tasks from a translational informatics perspective. Finally, we discuss the contributions and limitations of these community challenges, while highlighting future directions in the era of large language models.

Related papers

COMET: Benchmark for Comprehensive Biological Multi-omics Evaluation Tasks and Language Models [56.81513758682858]
COMET aims to evaluate models across single-omics, cross-omics, and multi-omics tasks. First, we curate and develop a diverse collection of downstream tasks and datasets covering key structural and functional aspects in DNA, RNA, and proteins. Then, we evaluate existing foundational language models for DNA, RNA, and proteins, as well as the newly proposed multi-omics method.
arXiv Detail & Related papers (2024-12-13T18:42:00Z)
Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges [7.140449861888235]
This review categorizes works in text generation into five main tasks. For each task, we review their relevant characteristics, sub-tasks, and specific challenges. Our investigation shows nine prominent challenges common to all tasks and sub-tasks in recent text generation publications.
arXiv Detail & Related papers (2024-05-24T14:38:11Z)
A Survey of Deep Learning-based Radiology Report Generation Using Multimodal Data [41.8344712915454]
Automatic radiology report generation can alleviate the workload for physicians and minimize regional disparities in medical resources. It is a challenging task, as the computational model needs to mimic physicians to obtain information from multi-modal input data. Recent works emerged to address this issue using deep learning-based methods, such as transformers, contrastive learning, and knowledge-base construction. This survey summarizes the key techniques developed in the most recent works and proposes a general workflow for deep learning-based report generation.
arXiv Detail & Related papers (2024-05-21T14:37:35Z)
Computational analysis of the language of pain: a systematic review [0.19999259391104385]
This study aims to systematically review the literature on the computational processing of the language of pain. Data extraction and synthesis were performed to categorize selected studies according to their primary purpose and outcome.
arXiv Detail & Related papers (2024-04-24T21:59:40Z)
An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks. These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems. Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
arXiv Detail & Related papers (2024-02-21T11:27:31Z)
An Analysis on Large Language Models in Healthcare: A Case Study of BioBERT [0.0]
This paper conducts a comprehensive investigation into applying large language models, particularly BioBERT, in healthcare. The analysis outlines a systematic methodology for fine-tuning BioBERT to meet the unique needs of the healthcare domain. The paper thoroughly examines ethical considerations, particularly patient privacy and data security.
arXiv Detail & Related papers (2023-10-11T08:16:35Z)
BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks [68.39821375903591]
Generalist AI holds the potential to address limitations due to its versatility in interpreting different data types. Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
arXiv Detail & Related papers (2023-05-26T17:14:43Z)
CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark [51.38557174322772]
We present the first Chinese Biomedical Language Understanding Evaluation benchmark. It is a collection of natural language understanding tasks including named entity recognition, information extraction, clinical diagnosis normalization, and single-sentence/sentence-pair classification. We report empirical results for 11 current pre-trained Chinese models, and experimental results show that state-of-the-art neural models perform far worse than the human ceiling.
arXiv Detail & Related papers (2021-06-15T12:25:30Z)
Machine Learning Applications for Therapeutic Tasks with Genomics Data [49.98249191161107]
We review the literature on machine learning applications for genomics through the lens of therapeutic development. We identify twenty-two applications of machine learning in genomics across the entire therapeutics pipeline. We pinpoint seven important challenges in this field with opportunities for expansion and impact.
arXiv Detail & Related papers (2021-05-03T21:20:20Z)
Automated Lay Language Summarization of Biomedical Scientific Reviews [16.01452242066412]
Health literacy has emerged as a crucial factor in making appropriate health decisions and ensuring treatment outcomes. Medical jargon and the complex structure of professional language in this domain make health information especially hard to interpret. This paper introduces the novel task of automated generation of lay language summaries of biomedical scientific reviews.
arXiv Detail & Related papers (2020-12-23T10:01:18Z)
Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey [54.34370423151014]
This paper surveys the components of modeling approaches relaying task impacts across various generation tasks such as storytelling, summarization, and translation. We present an abstraction of the imperative techniques with respect to learning paradigms, pretraining, modeling approaches, and decoding, along with the key challenges outstanding in the field for each of them.
arXiv Detail & Related papers (2020-10-14T17:54:42Z)
Machine Learning in Nano-Scale Biomedical Engineering [77.75587007080894]
We review the existing research regarding the use of machine learning in nano-scale biomedical engineering. The main challenges that can be formulated as ML problems are classified into three main categories. For each of the presented methodologies, special emphasis is given to its principles, applications, and limitations.
arXiv Detail & Related papers (2020-08-05T15:45:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.