MOLLM: Multi-Objective Large Language Model for Molecular Design -- Optimizing with Experts
- URL: http://arxiv.org/abs/2502.12845v1
- Date: Tue, 18 Feb 2025 13:25:00 GMT
- Title: MOLLM: Multi-Objective Large Language Model for Molecular Design -- Optimizing with Experts
- Authors: Nian Ran, Yue Wang, Richard Allmendinger
- Abstract summary: Molecular design plays a critical role in advancing fields such as drug discovery, materials science, and chemical engineering.
This work introduces the Multi-Objective Large Language Model for Molecular Design (MOLLM), a novel framework that combines domain-specific knowledge with the adaptability of Large Language Models.
- Abstract: Molecular design plays a critical role in advancing fields such as drug discovery, materials science, and chemical engineering. This work introduces the Multi-Objective Large Language Model for Molecular Design (MOLLM), a novel framework that combines domain-specific knowledge with the adaptability of Large Language Models to optimize molecular properties across multiple objectives. Leveraging in-context learning and multi-objective optimization, MOLLM achieves superior efficiency, innovation, and performance, significantly surpassing state-of-the-art (SOTA) methods. Recognizing the substantial impact of initial populations on evolutionary algorithms, we categorize them into three types: best initial, worst initial, and random initial, to ensure the initial molecules are the same for each method across experiments. Our results demonstrate that MOLLM consistently outperforms SOTA models in all of our experiments. We also provide extensive ablation studies to evaluate the superiority of our components.
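The abstract describes an evolutionary loop in which an LLM acts as the variation operator and candidates are kept by multi-objective (Pareto) selection, but gives no implementation details. The following is a minimal, hypothetical sketch of such a loop in plain Python: `llm_propose` is a stub standing in for the LLM's in-context-learning proposals (here it merely perturbs a numeric feature vector), and all function names and parameters are illustrative, not taken from the paper.

```python
import random

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(items, scores):
    """Keep only the candidates whose scores are not dominated by any other."""
    return [items[i] for i, s in enumerate(scores)
            if not any(dominates(scores[j], s) for j in range(len(scores)) if j != i)]

def llm_propose(parent, rng):
    """Stub for the LLM 'expert' operator: in an MOLLM-style method an LLM
    would propose a modified molecule via in-context learning; here we just
    perturb a numeric feature vector."""
    return tuple(x + rng.uniform(-0.5, 0.5) for x in parent)

def evolve(initial, objectives, generations=20, offspring=10, seed=0):
    """Toy multi-objective loop: propose offspring from the current population,
    merge parents and children, and keep only the Pareto front."""
    rng = random.Random(seed)
    population = list(initial)
    for _ in range(generations):
        children = [llm_propose(rng.choice(population), rng)
                    for _ in range(offspring)]
        combined = population + children
        scores = [objectives(ind) for ind in combined]
        population = pareto_front(combined, scores)
    return population
```

In a real system, `llm_propose` would prompt a chemistry-aware LLM with the current Pareto set and property targets, and `objectives` would score actual molecules (e.g. SMILES strings) rather than feature vectors; the abstract's "best/worst/random initial" protocol corresponds to how `initial` is chosen.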
Related papers
- Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging [111.8456671452411]
Multi-task learning (MTL) leverages a shared model to accomplish multiple tasks and facilitate knowledge transfer.
We propose a Weight-Ensembling Mixture of Experts (WEMoE) method for multi-task model merging.
We show that WEMoE and E-WEMoE outperform state-of-the-art (SOTA) model merging methods in terms of MTL performance, generalization, and robustness.
arXiv Detail & Related papers (2024-10-29T07:16:31Z)
- Many-Shot In-Context Learning for Molecular Inverse Design [56.65345962071059]
Large Language Models (LLMs) have demonstrated strong performance in few-shot In-Context Learning (ICL).
We develop a new semi-supervised learning method that overcomes the lack of experimental data available for many-shot ICL.
As we show, the new method greatly improves upon existing ICL methods for molecular design while being accessible and easy to use for scientists.
arXiv Detail & Related papers (2024-07-26T21:10:50Z)
- Improving Molecular Modeling with Geometric GNNs: an Empirical Study [56.52346265722167]
This paper examines the impact of (1) different canonicalization methods, (2) graph creation strategies, and (3) auxiliary tasks on performance, scalability, and symmetry enforcement.
Our findings and insights aim to guide researchers in selecting optimal modeling components for molecular modeling tasks.
arXiv Detail & Related papers (2024-07-11T09:04:12Z)
- Efficient Evolutionary Search Over Chemical Space with Large Language Models [31.31899988523534]
In molecular design, optimization objectives can be non-differentiable.
We introduce chemistry-aware Large Language Models (LLMs) into evolutionary algorithms.
Our algorithm improves both the quality of the final solution and convergence speed.
arXiv Detail & Related papers (2024-06-23T06:22:49Z)
- Latent Chemical Space Searching for Plug-in Multi-objective Molecule Generation [9.442146563809953]
We develop a versatile 'plug-in' molecular generation model that incorporates objectives related to target affinity, drug-likeness, and synthesizability.
We identify PSO-ENP as the optimal variant for multi-objective molecular generation and optimization.
arXiv Detail & Related papers (2024-04-10T02:37:24Z)
- Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model [49.64512917330373]
We introduce a multi-constraint molecular generation large language model, TSMMG, akin to a student.
To train TSMMG, we construct a large set of text-molecule pairs by extracting molecular knowledge from these 'teachers'.
We experimentally show that TSMMG performs remarkably well in generating molecules that meet complex, natural-language-described property requirements.
arXiv Detail & Related papers (2024-03-20T02:15:55Z)
- A quantitative analysis of knowledge-learning preferences in large language models in molecular science [24.80165173525286]
Large language models (LLMs) introduce a fresh research paradigm to tackle scientific problems from a natural language processing (NLP) perspective.
LLMs significantly enhance our understanding and generation of molecules, often surpassing existing methods with their capabilities to decode and synthesize complex molecular patterns.
We propose a multi-modal benchmark, named ChEBI-20-MM, and perform 1263 experiments to assess the model's compatibility with data modalities and knowledge acquisition.
arXiv Detail & Related papers (2024-02-06T16:12:36Z)
- Molecule optimization via multi-objective evolutionary in implicit chemical space [8.72872397589296]
MOMO is a multi-objective molecule optimization framework that addresses this challenge by combining learned chemical knowledge with multi-objective evolutionary search.
We demonstrate the high performance of MOMO on four multi-objective property and similarity optimization tasks, and illustrate the search capability of MOMO through case studies.
arXiv Detail & Related papers (2022-12-17T09:09:23Z)
- A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language [63.60376252491507]
We propose a molecular multimodal foundation model which is pretrained from molecular graphs and their semantically related textual data.
We believe that our model would have a broad impact on AI-empowered fields across disciplines such as biology, chemistry, materials, environment, and medicine.
arXiv Detail & Related papers (2022-09-12T00:56:57Z)
- Multi-Objective Latent Space Optimization of Generative Molecular Design Models [3.1996400013865656]
We propose a multi-objective latent space optimization (LSO) method that can significantly enhance the performance of generative molecular design (GMD).
We demonstrate that our multi-objective GMD LSO method can significantly improve the performance of GMD for jointly optimizing multiple molecular properties.
arXiv Detail & Related papers (2022-03-01T15:12:05Z)
- Molecular Attributes Transfer from Non-Parallel Data [57.010952598634944]
We formulate molecular optimization as a style transfer problem and present a novel generative model that could automatically learn internal differences between two groups of non-parallel data.
Experiments on two molecular optimization tasks, toxicity modification and synthesizability improvement, demonstrate that our model significantly outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2021-11-30T06:10:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all listed content) and is not responsible for any consequences arising from its use.