Improving Large Molecular Language Model via Relation-aware Multimodal Collaboration
- URL: http://arxiv.org/abs/2601.12256v1
- Date: Sun, 18 Jan 2026 04:38:19 GMT
- Title: Improving Large Molecular Language Model via Relation-aware Multimodal Collaboration
- Authors: Jinyoung Park, Minseong Bae, Jeehye Na, Hyunwoo J. Kim,
- Abstract summary: We propose CoLLaMo, a large language model-based molecular assistant equipped with a multi-level molecular modality-collaborative projector.<n>Our experiments demonstrate that our CoLLaMo enhances the molecular modality generalization capabilities of LMLMs.
- Score: 34.099746438477816
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have demonstrated their instruction-following capabilities and achieved powerful performance on various tasks. Inspired by their success, recent works in the molecular domain have led to the development of large molecular language models (LMLMs) that integrate 1D molecular strings or 2D molecular graphs into the language models. However, existing LMLMs often suffer from hallucination and limited robustness, largely due to inadequate integration of diverse molecular modalities such as 1D sequences, 2D molecular graphs, and 3D conformations. To address these limitations, we propose CoLLaMo, a large language model-based molecular assistant equipped with a multi-level molecular modality-collaborative projector. The relation-aware modality-collaborative attention mechanism in the projector facilitates fine-grained and relation-guided information exchange between atoms by incorporating 2D structural and 3D spatial relations. Furthermore, we present a molecule-centric new automatic measurement, including a hallucination assessment metric and GPT-based caption quality evaluation to address the limitations of token-based generic evaluation metrics (i.e., BLEU) widely used in assessing molecular comprehension of LMLMs. Our extensive experiments demonstrate that our CoLLaMo enhances the molecular modality generalization capabilities of LMLMs, achieving the best performance on multiple tasks, including molecule captioning, computed property QA, descriptive property QA, motif counting, and IUPAC name prediction.
Related papers
- KnowMol: Advancing Molecular Large Language Models with Multi-Level Chemical Knowledge [73.51130155601824]
We introduce KnowMol-100K, a large-scale dataset with 100K fine-grained molecular annotations across multiple levels.<n>We also propose chemically-informative molecular representation, effectively addressing limitations in existing molecular representation strategies.<n>KnowMol achieves superior performance across molecular understanding and generation tasks.
arXiv Detail & Related papers (2025-10-22T11:23:58Z) - $\text{M}^{2}$LLM: Multi-view Molecular Representation Learning with Large Language Models [59.125833618091846]
We propose a multi-view framework that integrates three perspectives: the molecular structure view, the molecular task view, and the molecular rules view.<n>Experiments demonstrate that $textM2$LLM achieves state-of-the-art performance on multiple benchmarks across classification and regression tasks.
arXiv Detail & Related papers (2025-08-12T05:46:47Z) - FARM: Functional Group-Aware Representations for Small Molecules [55.281754551202326]
We introduce Functional Group-Aware Representations for Small Molecules (FARM)<n>FARM is a novel model designed to bridge the gap between SMILES, natural language, and molecular graphs.<n>We evaluate FARM on the MoleculeNet dataset, where it achieves state-of-the-art performance on 11 out of 13 tasks.
arXiv Detail & Related papers (2024-10-02T23:04:58Z) - UniMoT: Unified Molecule-Text Language Model with Discrete Token Representation [35.277927005912275]
We introduce UniMoT, a Unified Molecule-Text LLM adopting a tokenizer-based architecture.<n>A Vector Quantization-driven tokenizer transforms molecules into sequences of molecule tokens with causal dependency.<n>UniMoT emerges as a multi-modal generalist capable of performing both molecule-to-text and text-to-molecule tasks.
arXiv Detail & Related papers (2024-08-01T18:31:31Z) - MolX: Enhancing Large Language Models for Molecular Understanding With A Multi-Modal Extension [44.97089022713424]
Large Language Models (LLMs) with their strong task-handling capabilities have shown remarkable advancements across a spectrum of fields.<n>This study seeks to enhance the ability of LLMs to comprehend molecules by equipping them with a multi-modal external module, termed MolX.<n>A hand-crafted molecular fingerprint is incorporated to leverage its embedded domain knowledge.
arXiv Detail & Related papers (2024-06-10T20:25:18Z) - Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model [49.64512917330373]
We introduce a multi-constraint molecular generation large language model, TSMMG, akin to a student.
To train TSMMG, we construct a large set of text-molecule pairs by extracting molecular knowledge from these 'teachers'
We experimentally show that TSMMG remarkably performs in generating molecules meeting complex, natural language-described property requirements.
arXiv Detail & Related papers (2024-03-20T02:15:55Z) - MolTC: Towards Molecular Relational Modeling In Language Models [28.960416816491392]
We propose a novel framework for Molecular inTeraction prediction following Chain-of-Thought (CoT) theory termed MolTC.
Our experiments, conducted across various datasets involving over 4,000,000 molecular pairs, exhibit the superiority of our method over current GNN and LLM-based baselines.
arXiv Detail & Related papers (2024-02-06T07:51:56Z) - Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks [44.934084652800976]
We introduce the first MoleculAR Conformer Ensemble Learning benchmark to thoroughly evaluate the potential of learning on conformer ensembles.
Our findings reveal that direct learning from an conformer space can improve performance on a variety of tasks and models.
arXiv Detail & Related papers (2023-09-29T20:06:46Z) - Empowering Molecule Discovery for Molecule-Caption Translation with Large Language Models: A ChatGPT Perspective [53.300288393173204]
Large Language Models (LLMs) have shown remarkable performance in various cross-modal tasks.
In this work, we propose an In-context Few-Shot Molecule Learning paradigm for molecule-caption translation.
We evaluate the effectiveness of MolReGPT on molecule-caption translation, including molecule understanding and text-based molecule generation.
arXiv Detail & Related papers (2023-06-11T08:16:25Z) - A Molecular Multimodal Foundation Model Associating Molecule Graphs with
Natural Language [63.60376252491507]
We propose a molecular multimodal foundation model which is pretrained from molecular graphs and their semantically related textual data.
We believe that our model would have a broad impact on AI-empowered fields across disciplines such as biology, chemistry, materials, environment, and medicine.
arXiv Detail & Related papers (2022-09-12T00:56:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.