Related papers: Chemically-Informed Machine Learning Approach for Prediction of Reactivity Ratios in Radical Copolymerization

URL: http://arxiv.org/abs/2512.19715v1
Date: Mon, 15 Dec 2025 17:32:06 GMT
Title: Chemically-Informed Machine Learning Approach for Prediction of Reactivity Ratios in Radical Copolymerization
Authors: Habibollah Safari, Mona Bavarian
Abstract summary: We present a method that combines unsupervised learning with artificial neural networks to predict reactivity ratios in radical copolymerization. This work demonstrates that unsupervised learning offers rapid chemical insight for exploratory analysis, while supervised learning provides the accuracy necessary for final design predictions.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Predicting monomer reactivity ratios is crucial for controlling monomer sequence distribution in copolymers and their properties. Traditional experimental methods of determining reactivity ratios are time-consuming and resource-intensive, while existing computational methods often struggle with accuracy or scalability. Here, we present a method that combines unsupervised learning with artificial neural networks to predict reactivity ratios in radical copolymerization. By applying spectral clustering to physicochemical features of monomers, we identified three distinct monomer groups with characteristic reactivity patterns. This computationally efficient clustering approach revealed specific monomer group interactions leading to different sequence arrangements, including alternating, random, block, and gradient copolymers, providing chemical insights for initial exploration. Building upon these insights, we trained artificial neural networks to achieve quantitative reactivity ratio predictions. We explored two integration strategies including direct feature concatenation, and cluster-specific training, which demonstrated performance enhancements for targeted chemical domains compared to general training with equivalent sample sizes. However, models utilizing complete datasets outperformed specialized models trained on focused subsets, revealing a fundamental trade-off between chemical specificity and data availability. This work demonstrates that unsupervised learning offers rapid chemical insight for exploratory analysis, while supervised learning provides the accuracy necessary for final design predictions, with optimal strategies depending on data availability and application requirements.
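To make concrete how a pair of reactivity ratios maps onto the sequence arrangements named in the abstract, the following is a minimal sketch of the standard Mayo-Lewis (terminal model) copolymer composition equation. It is not taken from the paper; the function name `copolymer_composition` and its argument names are hypothetical, chosen only for illustration.

```python
def copolymer_composition(f1: float, r1: float, r2: float) -> float:
    """Instantaneous mole fraction F1 of monomer 1 in the copolymer.

    Mayo-Lewis equation (terminal model):
        F1 = (r1*f1^2 + f1*f2) / (r1*f1^2 + 2*f1*f2 + r2*f2^2)
    where f1 is the mole fraction of monomer 1 in the feed, f2 = 1 - f1,
    and r1, r2 are the reactivity ratios.
    """
    if not 0.0 <= f1 <= 1.0:
        raise ValueError("f1 must lie in [0, 1]")
    if r1 < 0.0 or r2 < 0.0:
        raise ValueError("reactivity ratios must be non-negative")

    f2 = 1.0 - f1
    numerator = r1 * f1 * f1 + f1 * f2
    denominator = r1 * f1 * f1 + 2.0 * f1 * f2 + r2 * f2 * f2
    if denominator == 0.0:
        # Degenerate corner, e.g. a pure feed of a monomer whose ratio is 0.
        raise ValueError("composition is undefined for these inputs")
    return numerator / denominator


if __name__ == "__main__":
    # Illustrative feed composition, not data from the paper.
    feed = 0.3
    print("ideal (r1 = r2 = 1):      ", copolymer_composition(feed, 1.0, 1.0))
    print("alternating (r1 = r2 = 0):", copolymer_composition(feed, 0.0, 0.0))
```

With r1 = r2 = 1 the copolymer composition equals the feed composition (ideal, random incorporation), whereas with both ratios at 0 the composition is 0.5 for any mixed feed (strictly alternating). Predicted reactivity ratios, such as those the paper's models output, could be passed to a function like this to inspect the expected composition.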

Related papers

Amortized Sampling with Transferable Normalizing Flows [65.48838168417564]
Prose is a transferable normalizing flow trained on a corpus of peptide molecular dynamics trajectories up to 8 residues in length. We show that Prose is a proposal for a variety of sampling algorithms, finding a simple importance sampling-based finetuning procedure to achieve superior performance. We open-source the Prose dataset to further stimulate research into amortized sampling methods and finetuning objectives.
arXiv Detail & Related papers (2025-08-25T16:28:18Z)
Detecting and Pruning Prominent but Detrimental Neurons in Large Language Models [68.57424628540907]
Large language models (LLMs) often develop learned mechanisms specialized to specific datasets. We introduce a fine-tuning approach designed to enhance generalization by identifying and pruning neurons associated with dataset-specific mechanisms. Our method employs Integrated Gradients to quantify each neuron's influence on high-confidence predictions, pinpointing those that disproportionately contribute to dataset-specific performance.
arXiv Detail & Related papers (2025-07-12T08:10:10Z)
Generating new coordination compounds via multireference simulations, genetic algorithms and machine learning: the case of Co(II) molecular magnets [41.94295877935867]
We propose a computational strategy able to accelerate the discovery of new coordination compounds with desired electronic and magnetic properties. Our approach is based on a combination of high-throughput ab initio methods, genetic algorithms and machine learning. We showcase the power of this approach by automatically generating new Co(II) mononuclear coordination compounds with record magnetic properties in a fraction of the time required by either experiments or brute-force ab initio approaches.
arXiv Detail & Related papers (2025-04-18T15:33:48Z)
Chemical knowledge-informed framework for privacy-aware retrosynthesis learning [72.39098405805318]
Current machine learning-based retrosynthesis gathers reaction data from multiple sources into one single edge to train prediction models. This paradigm poses considerable privacy risks as it necessitates broad data availability across organizational boundaries. In the present study, we introduce the chemical knowledge-informed framework (CKIF), a privacy-preserving approach for learning retrosynthesis models.
arXiv Detail & Related papers (2025-02-26T13:13:24Z)
Hierarchical Matrix Completion for the Prediction of Properties of Binary Mixtures [3.0478550046333965]
We introduce a novel generic approach for improving data-driven models. We lump components that behave similarly into chemical classes and model them jointly. Using clustering leads to significantly improved predictions compared to an MCM without clustering.
arXiv Detail & Related papers (2024-10-08T14:04:30Z)
Regularized Neural Ensemblers [55.15643209328513]
In this study, we explore employing regularized neural networks as ensemble methods. Motivated by the risk of learning low-diversity ensembles, we propose regularizing the ensembling model by randomly dropping base model predictions. We demonstrate this approach provides lower bounds for the diversity within the ensemble, reducing overfitting and improving generalization capabilities.
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
Targeting the partition function of chemically disordered materials with a generative approach based on inverse variational autoencoders [0.0]
We propose a novel approach where generative machine learning is used to yield a representative set of configurations for accurate property evaluation. Our method employs a specific type of variational autoencoder with inverse roles for the encoder and decoder. We illustrate our approach by computing point-defect formation energies and concentrations in (U, Pu)O2 mixed-oxide fuels.
arXiv Detail & Related papers (2024-08-27T10:05:37Z)
ReactAIvate: A Deep Learning Approach to Predicting Reaction Mechanisms and Unmasking Reactivity Hotspots [4.362338454684645]
We develop an interpretable attention-based GNN that achieved near-unity and 96% accuracy for reaction step classification. Our model adeptly identifies key atom(s) even from out-of-distribution classes. This generalizability allows for the inclusion of new reaction types in a modular fashion, and will thus be of value to experts for understanding the reactivity of new molecules.
arXiv Detail & Related papers (2024-07-14T05:53:18Z)
Balancing Molecular Information and Empirical Data in the Prediction of Physico-Chemical Properties [8.649679686652648]
We propose a general method for combining molecular descriptors with representation learning. The proposed hybrid model exploits chemical structure information using graph neural networks. It automatically detects cases where structure-based predictions are unreliable, in which case it corrects them by representation-learning based predictions.
arXiv Detail & Related papers (2024-06-12T10:51:00Z)
Revealing the Relationship Between Publication Bias and Chemical Reactivity with Contrastive Learning [13.299207805882755]
Training on 20,798 aryl halides in the CAS Content Collection™, spanning thousands of publications from 2010-2015, we demonstrate that the learned embeddings exhibit a correlation with physical organic reactivity descriptors. This work not only presents a chemistry-specific machine learning training strategy to learn from data literature in a new way, but also represents a unique approach to uncover trends in chemical reactivity reflected by trends in substrate selection in publications.
arXiv Detail & Related papers (2024-02-19T02:21:20Z)
Improving Molecular Representation Learning with Metric Learning-enhanced Optimal Transport [49.237577649802034]
We develop a novel optimal transport-based algorithm termed MROT to enhance their generalization capability for molecular regression problems. MROT significantly outperforms state-of-the-art models, showing promising potential in accelerating the discovery of new substances.
arXiv Detail & Related papers (2022-02-13T04:56:18Z)
Explainable Deep Relational Networks for Predicting Compound-Protein Affinities and Contacts [80.69440684790925]
DeepRelations is a physics-inspired deep relational network with intrinsically explainable architecture. It shows superior interpretability to the state-of-the-art. It boosts the AUPRC of contact prediction 9.5, 16.9, 19.3 and 5.7-fold for the test, compound-unique, protein-unique, and both-unique sets.
arXiv Detail & Related papers (2019-12-29T00:14:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.