Modular Multi-Task Learning for Chemical Reaction Prediction
- URL: http://arxiv.org/abs/2602.10404v1
- Date: Wed, 11 Feb 2026 01:17:06 GMT
- Title: Modular Multi-Task Learning for Chemical Reaction Prediction
- Authors: Jiayun Pang, Ahmed M. Zaitoun, Xacobe Couso Cambeiro, Ivan Vulić,
- Abstract summary: Low-Rank Adaptation (LoRA) is a parameter-efficient alternative to full fine-tuning for organic reaction prediction. LoRA achieves accuracy comparable to full fine-tuning while effectively mitigating catastrophic forgetting and better preserving multi-task performance.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adapting large language models (LLMs) trained on broad organic chemistry to smaller, domain-specific reaction datasets is a key challenge in chemical and pharmaceutical R&D. Effective specialisation requires learning new reaction knowledge while preserving general chemical understanding across related tasks. Here, we evaluate Low-Rank Adaptation (LoRA) as a parameter-efficient alternative to full fine-tuning for organic reaction prediction on limited, complex datasets. Using USPTO reaction classes and challenging C-H functionalisation reactions, we benchmark forward reaction prediction, retrosynthesis and reagent prediction. LoRA achieves accuracy comparable to full fine-tuning while effectively mitigating catastrophic forgetting and better preserving multi-task performance. Both fine-tuning approaches generalise beyond training distributions, producing plausible alternative solvent predictions. Notably, C-H functionalisation fine-tuning reveals that LoRA and full fine-tuning encode subtly different reactivity patterns, suggesting more effective reaction-specific adaptation with LoRA. As LLMs continue to scale, our results highlight the practicality of modular, parameter-efficient fine-tuning strategies for their flexible deployment for chemistry applications.
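The LoRA approach evaluated in the abstract freezes the pretrained weights and learns only a low-rank correction. A minimal NumPy sketch of that idea, using illustrative layer sizes rather than the paper's actual model, shows why it is parameter-efficient: for a weight W, only the factors of a rank-r product B @ A are trained, and the standard zero-initialisation of B means the adapted layer starts out identical to the pretrained one.

```python
# Minimal sketch of a LoRA-style update (hypothetical shapes, not the
# paper's model): the frozen weight W gains a trainable low-rank term
# (alpha / r) * B @ A, so only r * (d_in + d_out) parameters are adapted.
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4           # r << d_in is the low-rank bottleneck
alpha = 8.0                          # LoRA scaling hyperparameter

W = rng.standard_normal((d_out, d_in))      # pretrained weight, frozen
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero init

def lora_forward(x):
    # Base path plus scaled low-rank correction; because B starts at zero,
    # the adapted layer reproduces the pretrained layer exactly at init.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W @ x)  # no drift before training

full = W.size            # parameters touched by full fine-tuning
lora = A.size + B.size   # parameters touched by LoRA
print(f"trainable params: {lora} vs full fine-tuning: {full} "
      f"({100 * lora / full:.1f}%)")
```

With these illustrative shapes, LoRA trains 512 parameters instead of 4096 (12.5%), which is one intuition for why it perturbs the base model less and so forgets less than full fine-tuning.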
Related papers
- ChemActor: Enhancing Automated Extraction of Chemical Synthesis Actions with LLM-Generated Data [53.78763789036172]
We present ChemActor, a fully fine-tuned large language model (LLM) as a chemical executor to convert between unstructured experimental procedures and structured action sequences. This framework integrates a data selection module that selects data based on distribution divergence, with a general-purpose LLM, to generate machine-executable actions from a single molecule input. Experiments on reaction-to-description (R2D) and description-to-action (D2A) tasks demonstrate that ChemActor achieves state-of-the-art performance, outperforming the baseline model by 10%.
arXiv Detail & Related papers (2025-06-30T05:11:19Z)
- Interpretable Deep Learning for Polar Mechanistic Reaction Prediction [43.95903801494905]
We introduce PMechRP (Polar Mechanistic Reaction Predictor), a system that trains machine learning models on the PMechDB dataset. We train and compare a range of machine learning models, including transformer-based, graph-based, and two-step siamese architectures. Our best-performing approach was a hybrid model, which combines a 5-ensemble of Chemformer models with a two-step Siamese framework.
arXiv Detail & Related papers (2025-04-22T02:31:23Z)
- Chemical knowledge-informed framework for privacy-aware retrosynthesis learning [72.39098405805318]
Current machine learning-based retrosynthesis gathers reaction data from multiple sources into one single edge to train prediction models. This paradigm poses considerable privacy risks as it necessitates broad data availability across organizational boundaries. In the present study, we introduce the chemical knowledge-informed framework (CKIF), a privacy-preserving approach for learning retrosynthesis models.
arXiv Detail & Related papers (2025-02-26T13:13:24Z)
- Learning Chemical Reaction Representation with Reactant-Product Alignment [50.28123475356234]
RAlign is a novel chemical reaction representation learning model for various organic reaction-related tasks. By integrating atomic correspondence between reactants and products, our model discerns the molecular transformations that occur during the reaction. We introduce a reaction-center-aware attention mechanism that enables the model to concentrate on key functional groups.
arXiv Detail & Related papers (2024-11-26T17:41:44Z)
- log-RRIM: Yield Prediction via Local-to-global Reaction Representation Learning and Interaction Modeling [6.310759215182946]
log-RRIM is an innovative graph transformer-based framework designed for predicting chemical reaction yields. A key feature of log-RRIM is its integration of a cross-attention mechanism that focuses on the interplay between reagents and reaction centers. Log-RRIM shows superior performance in our experiments, especially for medium to high-yielding reactions.
arXiv Detail & Related papers (2024-10-20T18:35:56Z)
- Text-Augmented Multimodal LLMs for Chemical Reaction Condition Recommendation [38.76977853056086]
Chemma-RC is a text-augmented multimodal LLM to identify effective conditions through task-specific dialogue and condition generation. Chemma-RC learns a unified representation of chemical reactions by aligning multiple modalities, including text corpus, reaction SMILES, and reaction graphs, within a shared embedding module. Performance benchmarking on datasets showed high precision in identifying optimal conditions, with up to 17% improvement over the current state-of-the-art methods.
arXiv Detail & Related papers (2024-07-21T12:27:26Z)
- ReactAIvate: A Deep Learning Approach to Predicting Reaction Mechanisms and Unmasking Reactivity Hotspots [4.362338454684645]
We develop an interpretable attention-based GNN that achieved near-unity and 96% accuracy for reaction step classification.
Our model adeptly identifies key atom(s) even from out-of-distribution classes.
This generalizability allows for the inclusion of new reaction types in a modular fashion, and thus will be of value to experts for understanding the reactivity of new molecules.
arXiv Detail & Related papers (2024-07-14T05:53:18Z)
- A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions [24.80165173525286]
We introduce a data-curated self-feedback knowledge elicitation approach.
We employ adaptive prompt learning to infuse the prior knowledge into the large language model.
This research offers a novel paradigm for knowledge elicitation in scientific research.
arXiv Detail & Related papers (2024-04-15T09:26:33Z)
- Retrosynthesis prediction enhanced by in-silico reaction data augmentation [66.5643280109899]
We present RetroWISE, a framework that employs a base model inferred from real paired data to perform in-silico reaction generation and augmentation.
On three benchmark datasets, RetroWISE achieves the best overall performance against state-of-the-art models.
arXiv Detail & Related papers (2024-01-31T07:40:37Z)
- ReLM: Leveraging Language Models for Enhanced Chemical Reaction Prediction [26.342666819515774]
ReLM is a framework that leverages the chemical knowledge encoded in language models (LMs) to assist Graph Neural Networks (GNNs).
Our experimental results demonstrate that ReLM improves the performance of state-of-the-art GNN-based methods across various chemical reaction datasets.
arXiv Detail & Related papers (2023-10-20T15:33:23Z)
- Improving Molecular Representation Learning with Metric Learning-enhanced Optimal Transport [49.237577649802034]
We develop a novel optimal transport-based algorithm termed MROT to enhance their generalization capability for molecular regression problems.
MROT significantly outperforms state-of-the-art models, showing promising potential in accelerating the discovery of new substances.
arXiv Detail & Related papers (2022-02-13T04:56:18Z)
- Unassisted Noise Reduction of Chemical Reaction Data Sets [59.127921057012564]
We propose a machine learning-based, unassisted approach to remove chemically wrong entries from data sets.
Our results show an improved prediction quality for models trained on the cleaned and balanced data sets.
arXiv Detail & Related papers (2021-02-02T09:34:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.