Related papers: Large Language Models are In-Context Molecule Learners

Large Language Models are In-Context Molecule Learners

URL: http://arxiv.org/abs/2403.04197v2
Date: Tue, 16 Apr 2024 05:07:52 GMT
Title: Large Language Models are In-Context Molecule Learners
Authors: Jiatong Li, Wei Liu, Zhihao Ding, Wenqi Fan, Yuqiang Li, Qing Li,
Abstract summary: We propose In-Context Molecule Adaptation (ICMA), as a new paradigm allowing LLMs to learn the molecule-text alignment from context examples. ICMA incorporates the following three stages: Hybrid Context Retrieval, Post-retrieval Re-ranking, and In-context Molecule Tuning. We show that ICMT can empower LLMs to achieve state-of-the-art or comparable performance without extra training corpora and intricate structures.
Score: 22.06735237464927
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Large Language Models (LLMs) have demonstrated exceptional performance in biochemical tasks, especially the molecule caption translation task, which aims to bridge the gap between molecules and natural language texts. However, previous methods in adapting LLMs to the molecule-caption translation task required extra domain-specific pre-training stages, suffered weak alignment between molecular and textual spaces, or imposed stringent demands on the scale of LLMs. To resolve the challenges, we propose In-Context Molecule Adaptation (ICMA), as a new paradigm allowing LLMs to learn the molecule-text alignment from context examples via In-Context Molecule Tuning. Specifically, ICMA incorporates the following three stages: Hybrid Context Retrieval, Post-retrieval Re-ranking, and In-context Molecule Tuning. Initially, Hybrid Context Retrieval utilizes BM25 Caption Retrieval and Molecule Graph Retrieval to retrieve informative context examples. Additionally, we also propose Post-retrieval Re-ranking with Sequence Reversal and Random Walk to further improve the quality of retrieval results. Finally, In-Context Molecule Tuning unlocks the in-context molecule learning capability of LLMs with retrieved examples and adapts the parameters of LLMs for the molecule-caption translation task. Experimental results demonstrate that ICMT can empower LLMs to achieve state-of-the-art or comparable performance without extra training corpora and intricate structures, showing that LLMs are inherently in-context molecule learners.

Related papers

MolReFlect: Towards In-Context Fine-grained Alignments between Molecules and Texts [23.53304253421472]
MolReFlect is a teacher-student framework designed to contextually perform the molecule-caption alignments in a fine-grained way. Our experimental results demonstrate that MolReFlect enables LLMs like Mistral-7B to significantly outperform the previous baselines.
arXiv Detail & Related papers (2024-11-22T04:28:56Z)
MolCap-Arena: A Comprehensive Captioning Benchmark on Language-Enhanced Molecular Property Prediction [44.27112553103388]
We present Molecule Caption Arena: the first comprehensive benchmark of large language models (LLMs)augmented molecular property prediction. We evaluate over twenty LLMs, including both general-purpose and domain-specific molecule captioners, across diverse prediction tasks. Our findings confirm the ability of LLM-extracted knowledge to enhance state-of-the-art molecular representations.
arXiv Detail & Related papers (2024-11-01T17:03:16Z)
LLaVA-KD: A Framework of Distilling Multimodal Large Language Models [70.19607283302712]
We propose a novel framework to transfer knowledge from l-MLLM to s-MLLM. Specifically, we introduce Multimodal Distillation (MDist) to minimize the divergence between the visual-textual output distributions of l-MLLM and s-MLLM. We also propose a three-stage training scheme to fully exploit the potential of s-MLLM.
arXiv Detail & Related papers (2024-10-21T17:41:28Z)
Text-Guided Multi-Property Molecular Optimization with a Diffusion Language Model [77.50732023411811]
We propose a text-guided multi-property molecular optimization method utilizing transformer-based diffusion language model (TransDLM) TransDLM leverages standardized chemical nomenclature as semantic representations of molecules and implicitly embeds property requirements into textual descriptions. Our approach surpasses state-of-the-art methods in optimizing molecular structural similarity and enhancing chemical properties on the benchmark dataset.
arXiv Detail & Related papers (2024-10-17T14:30:27Z)
Many-Shot In-Context Learning for Molecular Inverse Design [56.65345962071059]
Large Language Models (LLMs) have demonstrated great performance in few-shot In-Context Learning (ICL) We develop a new semi-supervised learning method that overcomes the lack of experimental data available for many-shot ICL. As we show, the new method greatly improves upon existing ICL methods for molecular design while being accessible and easy to use for scientists.
arXiv Detail & Related papers (2024-07-26T21:10:50Z)
MolX: Enhancing Large Language Models for Molecular Learning with A Multi-Modal Extension [34.586861881519134]
Large Language Models (LLMs) with their strong task-handling capabilities have shown remarkable advancements across a spectrum of fields. This study seeks to enhance the ability of LLMs to comprehend molecules by equipping them with a multi-modal external module, namely MolX. In particular, instead of directly using a SMILES string to represent a molecule, we utilize specific encoders to extract fine-grained features from both SMILES string and 2D molecular graph representations.
arXiv Detail & Related papers (2024-06-10T20:25:18Z)
Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL [78.80673954827773]
Large Language Models (LLMs) play a crucial role in capturing structured semantics to enhance language understanding, improve interpretability, and reduce bias. We propose using Semantic Role Labeling (SRL) as a fundamental task to explore LLMs' ability to extract structured semantics. We find interesting potential: LLMs can indeed capture semantic structures, and scaling-up doesn't always mirror potential. We are surprised to discover that significant overlap in the errors is made by both LLMs and untrained humans, accounting for almost 30% of all errors.
arXiv Detail & Related papers (2024-05-10T11:44:05Z)
Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models [117.20416338476856]
Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora. We propose a novel detection method, language activation probability entropy (LAPE), to identify language-specific neurons within LLMs. Our findings indicate that LLMs' proficiency in processing a particular language is predominantly due to a small subset of neurons.
arXiv Detail & Related papers (2024-02-26T09:36:05Z)
Can Large Language Models Empower Molecular Property Prediction? [16.5246941211725]
Molecular property prediction has gained significant attention due to its transformative potential in scientific disciplines. Recently, the rapid development of Large Language Models (LLMs) has revolutionized the field of NLP. In this work, we advance towards this objective through two perspectives: zero/few-shot molecular classification, and using the new explanations generated by LLMs as representations of molecules.
arXiv Detail & Related papers (2023-07-14T16:06:42Z)
Empowering Molecule Discovery for Molecule-Caption Translation with Large Language Models: A ChatGPT Perspective [53.300288393173204]
Large Language Models (LLMs) have shown remarkable performance in various cross-modal tasks. In this work, we propose an In-context Few-Shot Molecule Learning paradigm for molecule-caption translation. We evaluate the effectiveness of MolReGPT on molecule-caption translation, including molecule understanding and text-based molecule generation.
arXiv Detail & Related papers (2023-06-11T08:16:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.