Related papers: Omni-Mol: Exploring Universal Convergent Space for Omni-Molecular Tasks

URL: http://arxiv.org/abs/2502.01074v2
Date: Thu, 27 Feb 2025 06:55:12 GMT
Title: Omni-Mol: Exploring Universal Convergent Space for Omni-Molecular Tasks
Authors: Chengxin Hu, Hao Li, Yihe Yuan, Zezheng Song, Haixin Wang,
Abstract summary: Building generalist models has recently demonstrated remarkable capabilities in diverse scientific domains. Conflicting molecular representations can lead to optimization difficulties for the models. This paper presents Omni-Mol, a scalable and unified LLM-based framework for direct instruction tuning.
Score: 6.849025449303022
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Building generalist models has recently demonstrated remarkable capabilities in diverse scientific domains. Within the realm of molecular learning, several studies have explored unifying diverse tasks across diverse domains. However, conflicts and interference between molecules and knowledge from different domains may have a negative impact in three ways. First, conflicting molecular representations can lead to optimization difficulties for the models. Second, mixing and scaling up training data across diverse tasks is inherently challenging. Third, the computational cost of refined pretraining is prohibitively high. To address these limitations, this paper presents Omni-Mol, a scalable and unified LLM-based framework for direct instruction tuning. Omni-Mol builds on three key components to tackle conflicts: (1) a unified encoding mechanism for any task input; (2) an active-learning-driven data selection strategy that significantly reduces dataset size; (3) a novel design of the adaptive gradient stabilization module and anchor-and-reconcile MoE framework that ensures stable convergence. Experimentally, Omni-Mol achieves state-of-the-art performance across 15 molecular tasks, demonstrates the presence of scaling laws in the molecular domain, and is supported by extensive ablation studies and analyses validating the effectiveness of its design. The dataset, code and weights of the powerful AI-driven chemistry generalist are open-sourced.

Related papers

PharMolixFM: All-Atom Foundation Models for Molecular Modeling and Generation [4.402280157389038]
We propose PharMolixFM, a unified framework for constructing all-atom foundation models. Our framework includes three variants using state-of-the-art multi-modal generative models. PharMolixFM-Diff achieves competitive prediction accuracy in protein-small-molecule docking.
arXiv Detail & Related papers (2025-03-12T12:53:43Z)
MAMMAL -- Molecular Aligned Multi-Modal Architecture and Language [0.4631438140637248]
MAMMAL is a versatile method applied to create a multi-task foundation model that learns from large-scale biological datasets across diverse modalities. Evaluated on eleven diverse downstream tasks, it reaches a new state of the art (SOTA) in nine tasks and is comparable to SOTA in two tasks. The authors also explored AlphaFold 3 binding prediction capabilities on antibody-antigen and nanobody-antigen complexes, showing significantly better classification performance of MAMMAL in 3 out of 4 targets.
arXiv Detail & Related papers (2024-10-28T20:45:52Z)
Token-Mol 1.0: Tokenized drug design with large language model [10.258299488278514]
Token-Mol is a token-only 3D drug design model that encodes all molecular information, including 2D and 3D structures, as well as molecular property data, into tokens. It is built on the transformer decoder architecture and trained using random causal masking techniques. Compared to existing molecular pre-trained models, Token-Mol exhibits superior proficiency in handling a wider range of downstream tasks.
arXiv Detail & Related papers (2024-07-10T07:22:15Z)
UniIF: Unified Molecule Inverse Folding [67.60267592514381]
We propose a unified model UniIF for inverse folding of all molecules. Our proposed method surpasses state-of-the-art methods on all tasks.
arXiv Detail & Related papers (2024-05-29T10:26:16Z)
nach0: Multimodal Natural and Chemical Languages Foundation Model [7.815497069231599]
This paper introduces a new foundation model, nach0, capable of solving various chemical and biological tasks. nach0 is a multi-domain and multi-task encoder-decoder LLM pre-trained on unlabeled text from scientific literature, patents, and molecule strings.
arXiv Detail & Related papers (2023-11-21T07:56:30Z)
Learning Invariant Molecular Representation in Latent Discrete Space [52.13724532622099]
We propose a new framework for learning molecular representations that exhibit invariance and robustness against distribution shifts. Our model achieves stronger generalization against state-of-the-art baselines in the presence of various distribution shifts.
arXiv Detail & Related papers (2023-10-22T04:06:44Z)
Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks [44.934084652800976]
We introduce the first MoleculAR Conformer Ensemble Learning benchmark to thoroughly evaluate the potential of learning on conformer ensembles. Our findings reveal that direct learning from a conformer space can improve performance on a variety of tasks and models.
arXiv Detail & Related papers (2023-09-29T20:06:46Z)
Three-Way Trade-Off in Multi-Objective Learning: Optimization, Generalization and Conflict-Avoidance [47.42067405054353]
Multi-objective learning (MOL) problems often arise in emerging machine learning problems. One of the critical challenges in MOL is the potential conflict among different objectives during the iterative optimization process. Recent works have developed various dynamic weighting algorithms for MOL such as MGDA and its variants.
arXiv Detail & Related papers (2023-05-31T17:31:56Z)
Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction. Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations. On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems. Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for Mean-Field Control (MFC). We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
ASGN: An Active Semi-supervised Graph Neural Network for Molecular Property Prediction [61.33144688400446]
We propose a novel framework called Active Semi-supervised Graph Neural Network (ASGN) by incorporating both labeled and unlabeled molecules. In the teacher model, we propose a novel semi-supervised learning method to learn a general representation that jointly exploits information from molecular structure and molecular distribution. Finally, we propose a novel active learning strategy based on molecular diversity to select informative data throughout the framework's learning process.
arXiv Detail & Related papers (2020-07-07T04:22:39Z)
Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization. We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise. We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.