Bioinformatics and Biomedical Informatics with ChatGPT: Year One Review
- URL: http://arxiv.org/abs/2403.15274v2
- Date: Wed, 12 Jun 2024 15:50:31 GMT
- Title: Bioinformatics and Biomedical Informatics with ChatGPT: Year One Review
- Authors: Jinge Wang, Zien Cheng, Qiuming Yao, Li Liu, Dong Xu, Gangqing Hu
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The year 2023 marked a significant surge in the exploration of applying large language model (LLM) chatbots, notably ChatGPT, across various disciplines. We surveyed the applications of ChatGPT in bioinformatics and biomedical informatics throughout the year, covering omics, genetics, biomedical text mining, drug discovery, biomedical image understanding, bioinformatics programming, and bioinformatics education. Our survey delineates the current strengths and limitations of this chatbot in bioinformatics and offers insights into potential avenues for future developments.
Related papers
- BioT5+: Towards Generalized Biological Understanding with IUPAC Integration and Multi-task Tuning [77.90250740041411]
This paper introduces BioT5+, an extension of the BioT5 framework, tailored to enhance biological research and drug discovery.
BioT5+ incorporates several novel features: integration of IUPAC names for molecular understanding, inclusion of extensive bio-text and molecule data from sources like bioRxiv and PubChem, the multi-task instruction tuning for generality across tasks, and a numerical tokenization technique for improved processing of numerical data.
arXiv Detail & Related papers (2024-02-27T12:43:09Z)
- An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks.
These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems.
Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
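The entity-extraction tasks above hinge on prompt design. A minimal sketch of what a zero-shot prompt for gene/protein named-entity extraction might look like (the wording, the JSON-list output format, and the mock reply are illustrative assumptions, not the paper's actual prompts):

```python
import json

def build_gene_ner_prompt(sentence: str) -> str:
    """Compose a zero-shot prompt asking an LLM to list gene/protein
    mentions as JSON (hypothetical wording, not the paper's exact prompt)."""
    return (
        "Extract all gene and protein names from the sentence below. "
        "Respond with a JSON list of strings only.\n"
        f"Sentence: {sentence}"
    )

def parse_entities(reply: str) -> list:
    """Parse the model's JSON reply, tolerating surrounding whitespace."""
    return json.loads(reply.strip())

prompt = build_gene_ner_prompt("TP53 mutations activate MDM2 signaling.")
# A real evaluation would send `prompt` to a chat API; here we parse a mock reply.
entities = parse_entities('["TP53", "MDM2"]')
print(entities)  # ['TP53', 'MDM2']
```

Constraining the output to machine-parseable JSON is one common way such evaluations score LLM answers automatically.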
arXiv Detail & Related papers (2024-02-21T11:27:31Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
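The computational economy comes from the adapter design: only a small bottleneck layer is trained while the pre-trained model stays frozen. A toy sketch of a generic bottleneck adapter (down-projection, ReLU, up-projection, residual connection) in plain Python; the shapes and weights are illustrative, not the paper's architecture:

```python
def adapter(hidden, W_down, W_up):
    """Bottleneck adapter sketch: project a hidden vector of size d down to
    r << d, apply ReLU, project back up, and add the residual so the frozen
    PLM's representation is preserved when the adapter contributes nothing."""
    d, r = len(hidden), len(W_down[0])
    # down-projection followed by ReLU
    z = [max(sum(hidden[i] * W_down[i][j] for i in range(d)), 0.0)
         for j in range(r)]
    # up-projection plus residual connection
    return [hidden[i] + sum(z[j] * W_up[j][i] for j in range(r))
            for i in range(d)]

h = [1.0, -0.5, 0.25, 2.0]                       # toy hidden state, d = 4
W_down = [[0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.0, 0.0]]  # d x r, r = 2
W_up = [[0.2, 0.0, 0.0, 0.0], [0.0, 0.2, 0.0, 0.0]]        # r x d
out = adapter(h, W_down, W_up)
print(len(out))  # 4
```

Because only `W_down` and `W_up` (roughly 2·d·r parameters per layer) are updated during knowledge injection, the compute cost stays far below full fine-tuning.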
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- BioLORD-2023: Semantic Textual Representations Fusing LLM and Clinical Knowledge Graph Insights [15.952942443163474]
We propose a new state-of-the-art approach for obtaining high-fidelity representations of biomedical concepts and sentences.
We demonstrate consistent and substantial performance improvements over the previous state of the art.
Besides our new state-of-the-art biomedical model for English, we also distill and release a multilingual model compatible with 50+ languages.
arXiv Detail & Related papers (2023-11-27T18:46:17Z)
- Opportunities and Challenges for ChatGPT and Large Language Models in Biomedicine and Health [22.858424132819795]
ChatGPT has led to the emergence of diverse applications in the field of biomedicine and health.
We explore the areas of biomedical information retrieval, question answering, medical text summarization, and medical education.
We find that significant advances have been made in the field of text generation tasks, surpassing the previous state-of-the-art methods.
arXiv Detail & Related papers (2023-06-15T20:19:08Z)
- Evaluation of ChatGPT on Biomedical Tasks: A Zero-Shot Comparison with Fine-Tuned Generative Transformers [2.5027382653219155]
ChatGPT is a large language model developed by OpenAI.
This paper aims to evaluate the performance of ChatGPT on various benchmark biomedical tasks.
arXiv Detail & Related papers (2023-06-07T15:11:26Z)
- LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day [85.19963303642427]
We propose a cost-efficient approach for training a vision-language conversational assistant that can answer open-ended research questions of biomedical images.
The model first learns to align biomedical vocabulary using the figure-caption pairs as is, then learns to master open-ended conversational semantics.
This enables us to train a Large Language and Vision Assistant for BioMedicine in less than 15 hours (with eight A100s).
arXiv Detail & Related papers (2023-06-01T16:50:07Z)
- BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining [140.61707108174247]
We propose BioGPT, a domain-specific generative Transformer language model pre-trained on large scale biomedical literature.
BioGPT achieves F1 scores of 44.98%, 38.42%, and 40.76% on the BC5CDR, KD-DTI, and DDI end-to-end relation extraction tasks, respectively, and 78.2% accuracy on PubMedQA.
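The F1 score reported for these relation-extraction tasks is the harmonic mean of precision and recall over extracted relation triples. A quick sketch of the metric itself (the counts used below are made up for illustration, not the paper's data):

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts: tp = correctly extracted relations,
    fp = spurious extractions, fn = gold relations the model missed."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy counts: 45 correct triples, 55 spurious, 55 missed
print(round(f1_score(45, 55, 55), 4))  # 0.45
```

Because end-to-end extraction must get both entities and the relation type right for a triple to count as a true positive, F1 values in the 40% range are common on these benchmarks.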
arXiv Detail & Related papers (2022-10-19T07:17:39Z)
- Pre-trained Language Models in Biomedical Domain: A Systematic Survey [33.572502204216256]
Pre-trained language models (PLMs) have been the de facto paradigm for most natural language processing (NLP) tasks.
This paper summarizes the recent progress of pre-trained language models in the biomedical domain and their applications in biomedical downstream tasks.
arXiv Detail & Related papers (2021-10-11T05:30:30Z)
- Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature [67.4680600632232]
Self-supervised learning has emerged as a promising direction to overcome the annotation bottleneck.
We propose a general approach for vertical search based on domain-specific pretraining.
Our system can scale to tens of millions of articles on PubMed and has been deployed as Microsoft Biomedical Search.
arXiv Detail & Related papers (2021-06-25T01:02:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.