Related papers: Natural Language Processing Methods for the Study of Protein-Ligand Interactions

URL: http://arxiv.org/abs/2409.13057v2
Date: Thu, 17 Oct 2024 16:56:34 GMT
Title: Natural Language Processing Methods for the Study of Protein-Ligand Interactions
Authors: James Michels, Ramya Bandarupalli, Amin Ahangar Akbari, Thai Le, Hong Xiao, Jing Li, Erik F. Y. Hom
Abstract summary: Recent advances in Natural Language Processing have ignited interest in developing effective methods for predicting protein-ligand interactions (PLIs). In this review, we explain where and how such approaches have been applied in the recent literature and discuss useful mechanisms such as long short-term memory, transformers, and attention. We conclude with a discussion of the current limitations of NLP methods for the study of PLIs as well as key challenges that need to be addressed in future work.
Score: 8.165512093198934
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Recent advances in Natural Language Processing (NLP) have ignited interest in developing effective methods for predicting protein-ligand interactions (PLIs) given their relevance to drug discovery and protein engineering efforts and the ever-growing volume of biochemical sequence and structural data available. The parallels between human languages and the "languages" used to represent proteins and ligands have enabled the use of NLP machine learning approaches to advance PLI studies. In this review, we explain where and how such approaches have been applied in the recent literature and discuss useful mechanisms such as long short-term memory, transformers, and attention. We conclude with a discussion of the current limitations of NLP methods for the study of PLIs as well as key challenges that need to be addressed in future work.

Related papers

Joint Masked Reconstruction and Contrastive Learning for Mining Interactions Between Proteins [4.254824555546419]
Protein-protein interaction (PPI) prediction is instrumental in elucidating the mechanisms underlying cellular operations. This paper introduces a novel PPI prediction method, termed JmcPPI, that combines masked reconstruction and contrastive learning. Extensive experiments conducted on three widely used PPI datasets demonstrate that JmcPPI surpasses the best existing baseline models.
arXiv Detail & Related papers (2025-03-06T17:39:12Z)
Biological Sequence with Language Model Prompting: A Survey [14.270959261105968]
Large language models (LLMs) have emerged as powerful tools for addressing challenges across diverse domains. This paper systematically investigates the application of prompt-based methods with LLMs to biological sequences.
arXiv Detail & Related papers (2025-03-06T06:28:36Z)
Computational Protein Science in the Era of Large Language Models (LLMs) [54.35488233989787]
Computational protein science is dedicated to revealing knowledge and developing applications within the protein sequence-structure-function paradigm. Recently, protein Language Models (pLMs) have emerged as a milestone in AI due to their unprecedented language processing and generalization capability.
arXiv Detail & Related papers (2025-01-17T16:21:18Z)
Long-context Protein Language Model [76.95505296417866]
Self-supervised training of language models (LMs) has seen great success for protein sequences in learning meaningful representations and for generative drug design. Most protein LMs are based on the Transformer architecture trained on individual proteins with short context lengths. We propose LC-PLM based on an alternative protein LM architecture, BiMamba-S, built off selective structured state-space models. We also introduce its graph-contextual variant, LC-PLM-G, which contextualizes protein-protein interaction graphs for a second stage of training.
arXiv Detail & Related papers (2024-10-29T16:43:28Z)
ProLLM: Protein Chain-of-Thoughts Enhanced LLM for Protein-Protein Interaction Prediction [54.132290875513405]
The prediction of protein-protein interactions (PPIs) is crucial for understanding biological functions and diseases. Previous machine learning approaches to PPI prediction mainly focus on direct physical interactions. We propose a novel framework ProLLM that employs an LLM tailored for PPI for the first time.
arXiv Detail & Related papers (2024-03-30T05:32:42Z)
Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey [75.47055414002571]
The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry and biology. We provide an analysis of recent advancements achieved through cross modeling of biomolecules and natural language.
arXiv Detail & Related papers (2024-03-03T14:59:47Z)
Synergizing Machine Learning & Symbolic Methods: A Survey on Hybrid Approaches to Natural Language Processing [7.242609314791262]
We discuss the state-of-the-art hybrid approaches used for a broad spectrum of NLP tasks requiring natural language understanding, generation, and reasoning.
arXiv Detail & Related papers (2024-01-22T14:24:03Z)
Exploring the Landscape of Natural Language Processing Research [3.3916160303055567]
Several NLP-related approaches have been surveyed in the research community. A comprehensive study that categorizes established topics, identifies trends, and outlines areas for future research remains absent. As a result, we present a structured overview of the research landscape, provide a taxonomy of fields of study in NLP, analyze recent developments in NLP, summarize our findings, and highlight directions for future work.
arXiv Detail & Related papers (2023-07-20T07:33:30Z)
Comparative Performance Evaluation of Large Language Models for Extracting Molecular Interactions and Pathway Knowledge [6.244840529371179]
Understanding protein interactions and pathway knowledge is crucial for unraveling the complexities of living systems. Existing databases provide curated biological data from literature and other sources, but their maintenance is labor-intensive. We propose to harness the capabilities of large language models to address these issues by automatically extracting such knowledge from the relevant scientific literature.
arXiv Detail & Related papers (2023-07-17T20:01:11Z)
Interactive Natural Language Processing [67.87925315773924]
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP. This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept.
arXiv Detail & Related papers (2023-05-22T17:18:29Z)
Meta Learning for Natural Language Processing: A Survey [88.58260839196019]
Deep learning has been the mainstream technique in the natural language processing (NLP) area. Deep learning requires large amounts of labeled data and is less generalizable across domains. Meta-learning is an emerging field in machine learning that studies approaches to learning better algorithms.
arXiv Detail & Related papers (2022-05-03T13:58:38Z)
A Survey on Model Compression for Natural Language Processing [13.949219077548687]
The cost of Transformer models is preventing NLP from entering broader scenarios, including edge and mobile computing. Efficient NLP research aims to comprehensively consider computation, time, and carbon emissions over the entire life cycle of NLP.
arXiv Detail & Related papers (2022-02-15T00:18:47Z)
Ensuring the Inclusive Use of Natural Language Processing in the Global Response to COVID-19 [58.720142291102135]
We discuss ways in which current and future NLP approaches can be made more inclusive by covering low-resource languages. We suggest several future directions for researchers interested in maximizing the positive societal impacts of NLP.
arXiv Detail & Related papers (2021-08-11T12:54:26Z)
Exploring Chemical Space using Natural Language Processing Methodologies for Drug Discovery [0.5389800405902634]
Text-based representations of chemicals and proteins can be thought of as unstructured languages codified by humans to describe domain-specific knowledge. This review outlines the impact made by these advances on drug discovery and aims to further the dialogue between medicinal chemists and computer scientists.
arXiv Detail & Related papers (2020-02-10T21:02:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.