Predicting the Temperature-Dependent CMC of Surfactant Mixtures with Graph Neural Networks
- URL: http://arxiv.org/abs/2411.02224v2
- Date: Tue, 05 Nov 2024 08:12:42 GMT
- Title: Predicting the Temperature-Dependent CMC of Surfactant Mixtures with Graph Neural Networks
- Authors: Christoforos Brozos, Jan G. Rittig, Elie Akanny, Sandip Bhattacharya, Christina Kohlmann, Alexander Mitsos
- Abstract summary: Surfactants are key ingredients in foaming and cleansing products across various industries.
In practice, surfactant mixtures are typically used due to performance, environmental, and cost reasons.
We develop a graph neural network framework for surfactant mixtures to predict the temperature-dependent CMC.
- Score: 36.814181034608666
- License:
- Abstract: Surfactants are key ingredients in foaming and cleansing products across various industries such as personal and home care, industrial cleaning, and more, with the critical micelle concentration (CMC) being of major interest. Predictive models for CMC of pure surfactants have been developed based on recent ML methods; however, in practice surfactant mixtures are typically used due to performance, environmental, and cost reasons. This requires accounting for synergistic/antagonistic interactions between surfactants; however, predictive ML models for a wide spectrum of mixtures are missing so far. Herein, we develop a graph neural network (GNN) framework for surfactant mixtures to predict the temperature-dependent CMC. We collect data for 108 surfactant binary mixtures, to which we add data for pure species from our previous work [Brozos et al. (2024), J. Chem. Theory Comput.]. We then develop and train GNNs and evaluate their accuracy across different prediction test scenarios for binary mixtures relevant to practical applications. The final GNN models demonstrate very high predictive performance when interpolating between different mixture compositions and for new binary mixtures with known species. Extrapolation to binary surfactant mixtures where either one or both surfactant species are not seen before yields accurate results for the majority of surfactant systems. We further find superior accuracy of the GNN over a semi-empirical model based on activity coefficients, which has been widely used to date. We then explore if GNN models trained solely on binary mixture and pure species data can also accurately predict the CMCs of ternary mixtures. Finally, we experimentally measure the CMC of 4 commercial surfactants that contain up to four species and industrially relevant mixtures and find a very good agreement between measured and predicted CMC values.
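For orientation, the kind of activity-coefficient relation the abstract contrasts with the GNN can be sketched as follows. The abstract does not say which semi-empirical model was used, so this is only the textbook pseudo-phase mixing rule, 1/CMC_mix = sum_i alpha_i / (f_i * CMC_i), with alpha_i the mole fraction of surfactant i in the total surfactant and f_i its activity coefficient in the mixed micelle. The function name `mixture_cmc` and all numbers are illustrative assumptions, not values or code from the paper.

```python
from typing import Optional, Sequence


def mixture_cmc(
    alphas: Sequence[float],
    cmcs: Sequence[float],
    activity_coeffs: Optional[Sequence[float]] = None,
) -> float:
    """Mixture CMC from the pseudo-phase mixing rule (hypothetical helper).

    alphas: mole fraction of each surfactant in the total surfactant (sums to 1).
    cmcs: pure-component CMCs, all in the same unit.
    activity_coeffs: micellar activity coefficients f_i; None means ideal
        mixing (all f_i = 1). f_i < 1 corresponds to synergism, f_i > 1 to
        antagonism.
    """
    if activity_coeffs is None:
        activity_coeffs = [1.0] * len(alphas)
    if not (len(alphas) == len(cmcs) == len(activity_coeffs)):
        raise ValueError("all inputs must have the same length")
    if abs(sum(alphas) - 1.0) > 1e-9:
        raise ValueError("mole fractions must sum to 1")
    if any(c <= 0 for c in cmcs) or any(f <= 0 for f in activity_coeffs):
        raise ValueError("CMCs and activity coefficients must be positive")

    # 1 / CMC_mix = sum_i alpha_i / (f_i * CMC_i)
    inverse_cmc = sum(a / (f * c) for a, f, c in zip(alphas, activity_coeffs, cmcs))
    return 1.0 / inverse_cmc


# Illustrative numbers only: an equimolar binary mixture.
ideal = mixture_cmc([0.5, 0.5], [2.0, 4.0])
synergistic = mixture_cmc([0.5, 0.5], [2.0, 4.0], [0.5, 0.5])
print(ideal, synergistic)
```

In this sketch the activity coefficients are inputs. In practice they have to be fitted per surfactant pair, which is the kind of pair-specific step a learned model aims to avoid.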
Related papers
- Hierarchical Matrix Completion for the Prediction of Properties of Binary Mixtures [3.0478550046333965]
We introduce a novel generic approach for improving data-driven models.
We lump components that behave similarly into chemical classes and model them jointly.
Using clustering leads to significantly improved predictions compared to an MCM without clustering.
arXiv Detail & Related papers (2024-10-08T14:04:30Z) - Unifying Mixed Gas Adsorption in Molecular Sieve Membranes and MOFs using Machine Learning [0.0]
Recent machine learning models focus on polymers or metal-organic frameworks (MOFs) separately.
Creating a unified model that can predict the trends in both types of adsorbents is challenging.
In this work, we address these problems using feature vectors comprising only the physical properties of the gas mixtures and adsorbents.
arXiv Detail & Related papers (2024-06-19T09:30:11Z) - Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance [55.872926690722714]
We study the predictability of model performance regarding the mixture proportions in function forms.
We propose nested use of the scaling laws of training steps, model sizes, and our data mixing law.
Our method effectively optimizes the training mixture of a 1B model trained for 100B tokens in RedPajama.
arXiv Detail & Related papers (2024-03-25T17:14:00Z) - Predicting the Temperature Dependence of Surfactant CMCs Using Graph Neural Networks [38.39977540117143]
Classical QSPR and Graph Neural Networks (GNNs) have been successfully applied to predict the CMC of surfactants at room temperature.
We herein develop a GNN model for temperature-dependent CMC prediction of surfactants.
arXiv Detail & Related papers (2024-03-06T15:03:04Z) - Graph Neural Networks for Surfactant Multi-Property Prediction [38.39977540117143]
Graph Neural Networks (GNNs) have exhibited great predictive performance for property prediction of ionic liquids, polymers and drugs in general.
We create the largest available CMC database with 429 molecules and the first large data collection for surface excess concentration.
GNN yields highly accurate predictions for CMC, showing great potential for future industrial applications.
arXiv Detail & Related papers (2024-01-03T18:32:25Z) - Graph Neural Networks for Temperature-Dependent Activity Coefficient Prediction of Solutes in Ionic Liquids [58.720142291102135]
We present a GNN to predict temperature-dependent infinite dilution ACs of solutes in ILs.
We train the GNN on a database including more than 40,000 AC values and compare it to a state-of-the-art MCM.
The GNN and MCM achieve similar high prediction performance, with the GNN additionally enabling high-quality predictions for ACs of solutions that contain ILs and solutes not considered during training.
arXiv Detail & Related papers (2022-06-23T15:27:29Z) - On the Calibration of Pre-trained Language Models using Mixup Guided by Area Under the Margin and Saliency [47.90235939359225]
We propose a novel mixup strategy for pre-trained language models that improves model calibration further.
Our method achieves the lowest expected calibration error compared to strong baselines on both in-domain and out-of-domain test samples.
arXiv Detail & Related papers (2022-03-14T23:45:08Z) - Learning Gaussian Mixtures with Generalised Linear Models: Precise Asymptotics in High-dimensions [79.35722941720734]
Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks.
We prove exact asymptotics characterising the estimator in high dimensions via empirical risk minimisation.
We discuss how our theory can be applied beyond the scope of synthetic data.
arXiv Detail & Related papers (2021-06-07T16:53:56Z) - Assessing Graph-based Deep Learning Models for Predicting Flash Point [52.931492216239995]
Graph-based deep learning (GBDL) models were implemented in predicting flash point for the first time.
Average R2 and Mean Absolute Error (MAE) scores of MPNN are, respectively, 2.3% lower and 2.0 K higher than previous comparable studies.
arXiv Detail & Related papers (2020-02-26T06:10:12Z) - Machine Learning in Thermodynamics: Prediction of Activity Coefficients by Matrix Completion [34.7384528263504]
We propose a probabilistic matrix factorization model for predicting the activity coefficients in arbitrary binary mixtures.
Our method outperforms the state-of-the-art method that has been refined over three decades.
This opens perspectives to novel methods for predicting physico-chemical properties of binary mixtures.
arXiv Detail & Related papers (2020-01-29T03:16:23Z)