Exploring Chemical Space using Natural Language Processing Methodologies
for Drug Discovery
- URL: http://arxiv.org/abs/2002.06053v1
- Date: Mon, 10 Feb 2020 21:02:05 GMT
- Title: Exploring Chemical Space using Natural Language Processing Methodologies
for Drug Discovery
- Authors: Hakime \"Ozt\"urk, Arzucan \"Ozg\"ur, Philippe Schwaller, Teodoro
Laino, Elif Ozkirimli
- Abstract summary: Text-based representations of chemicals and proteins can be thought of as unstructured languages codified by humans to describe domain-specific knowledge.
This review outlines the impact made by these advances on drug discovery and aims to further the dialogue between medicinal chemists and computer scientists.
- Score: 0.5389800405902634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-based representations of chemicals and proteins can be thought of as
unstructured languages codified by humans to describe domain-specific
knowledge. Advances in natural language processing (NLP) methodologies in the
processing of spoken languages accelerated the application of NLP to elucidate
hidden knowledge in textual representations of these biochemical entities and
then use it to construct models to predict molecular properties or to design
novel molecules. This review outlines the impact made by these advances on drug
discovery and aims to further the dialogue between medicinal chemists and
computer scientists.
Related papers
- Leveraging Biomolecule and Natural Language through Multi-Modal
Learning: A Survey [75.47055414002571]
The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry and biology.
We provide an analysis of recent advancements achieved through cross modeling of biomolecules and natural language.
arXiv Detail & Related papers (2024-03-03T14:59:47Z) - Empirical Evidence for the Fragment level Understanding on Drug
Molecular Structure of LLMs [16.508471997999496]
We investigate whether and how language models understand the chemical spatial structure from 1D sequences.
The results indicate that language models can understand chemical structures from the perspective of molecular fragments.
arXiv Detail & Related papers (2024-01-15T12:53:58Z) - Language models in molecular discovery [2.874893537471256]
" scientific language models" operate on small molecules, proteins or polymers.
In chemistry, language models contribute to accelerating the molecule discovery cycle.
We highlight valuable open-source software assets thus lowering the entry barrier to the field of scientific language modeling.
arXiv Detail & Related papers (2023-09-28T08:19:54Z) - Interactive Molecular Discovery with Natural Language [69.89287960545903]
We propose the conversational molecular design, a novel task adopting natural language for describing and editing target molecules.
To better accomplish this task, we design ChatMol, a knowledgeable and versatile generative pre-trained model, enhanced by injecting experimental property information.
arXiv Detail & Related papers (2023-06-21T02:05:48Z) - Empowering Molecule Discovery for Molecule-Caption Translation with Large Language Models: A ChatGPT Perspective [53.300288393173204]
Large Language Models (LLMs) have shown remarkable performance in various cross-modal tasks.
In this work, we propose an In-context Few-Shot Molecule Learning paradigm for molecule-caption translation.
We evaluate the effectiveness of MolReGPT on molecule-caption translation, including molecule understanding and text-based molecule generation.
arXiv Detail & Related papers (2023-06-11T08:16:25Z) - Domain-Agnostic Molecular Generation with Chemical Feedback [44.063584808910896]
MolGen is a pre-trained molecular language model tailored specifically for molecule generation.
It internalizes structural and grammatical insights through the reconstruction of over 100 million molecular SELFIES.
Our chemical feedback paradigm steers the model away from molecular hallucinations, ensuring alignment between the model's estimated probabilities and real-world chemical preferences.
arXiv Detail & Related papers (2023-01-26T17:52:56Z) - Multi-modal Molecule Structure-text Model for Text-based Retrieval and
Editing [107.49804059269212]
We present a multi-modal molecule structure-text model, MoleculeSTM, by jointly learning molecules' chemical structures and textual descriptions.
In experiments, MoleculeSTM obtains the state-of-the-art generalization ability to novel biochemical concepts.
arXiv Detail & Related papers (2022-12-21T06:18:31Z) - A Molecular Multimodal Foundation Model Associating Molecule Graphs with
Natural Language [63.60376252491507]
We propose a molecular multimodal foundation model which is pretrained from molecular graphs and their semantically related textual data.
We believe that our model would have a broad impact on AI-empowered fields across disciplines such as biology, chemistry, materials, environment, and medicine.
arXiv Detail & Related papers (2022-09-12T00:56:57Z) - Reinforced Molecular Optimization with Neighborhood-Controlled Grammars [63.84003497770347]
We propose MNCE-RL, a graph convolutional policy network for molecular optimization.
We extend the original neighborhood-controlled embedding grammars to make them applicable to molecular graph generation.
We show that our approach achieves state-of-the-art performance in a diverse range of molecular optimization tasks.
arXiv Detail & Related papers (2020-11-14T05:42:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.