MMPolymer: A Multimodal Multitask Pretraining Framework for Polymer Property Prediction
- URL: http://arxiv.org/abs/2406.04727v2
- Date: Fri, 26 Jul 2024 13:24:41 GMT
- Title: MMPolymer: A Multimodal Multitask Pretraining Framework for Polymer Property Prediction
- Authors: Fanmeng Wang, Wentao Guo, Minjie Cheng, Shen Yuan, Hongteng Xu, Zhifeng Gao,
- Abstract summary: MMPolymer is a novel multitask pretraining framework incorporating polymer 1D sequential and 3D structural information.
MMPolymer achieves state-of-the-art performance in downstream property prediction tasks.
- Score: 24.975491375575224
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Polymers are high-molecular-weight compounds constructed by the covalent bonding of numerous identical or similar monomers so that their 3D structures are complex yet exhibit unignorable regularity. Typically, the properties of a polymer, such as plasticity, conductivity, bio-compatibility, and so on, are highly correlated with its 3D structure. However, existing polymer property prediction methods heavily rely on the information learned from polymer SMILES sequences (P-SMILES strings) while ignoring crucial 3D structural information, resulting in sub-optimal performance. In this work, we propose MMPolymer, a novel multimodal multitask pretraining framework incorporating polymer 1D sequential and 3D structural information to encourage downstream polymer property prediction tasks. Besides, considering the scarcity of polymer 3D data, we further introduce the "Star Substitution" strategy to extract 3D structural information effectively. During pretraining, in addition to predicting masked tokens and recovering clear 3D coordinates, MMPolymer achieves the cross-modal alignment of latent representations. Then we further fine-tune the pretrained MMPolymer for downstream polymer property prediction tasks in the supervised learning paradigm. Experiments show that MMPolymer achieves state-of-the-art performance in downstream property prediction tasks. Moreover, given the pretrained MMPolymer, utilizing merely a single modality in the fine-tuning phase can also outperform existing methods, showcasing the exceptional capability of MMPolymer in polymer feature extraction and utilization.
Related papers
- Molecular topological deep learning for polymer property prediction [18.602659324026934]
We develop molecular topological deep learning (Mol-TDL) for polymer property analysis.
Mol-TDL incorporates both high-order interactions and multiscale properties into topological deep learning architecture.
arXiv Detail & Related papers (2024-10-07T05:44:02Z) - Automated 3D Pre-Training for Molecular Property Prediction [54.15788181794094]
We propose a novel 3D pre-training framework (dubbed 3D PGT)
It pre-trains a model on 3D molecular graphs, and then fine-tunes it on molecular graphs without 3D structures.
Extensive experiments on 2D molecular graphs are conducted to demonstrate the accuracy, efficiency and generalization ability of the proposed 3D PGT.
arXiv Detail & Related papers (2023-06-13T14:43:13Z) - Photonic Quantum Computing For Polymer Classification [62.997667081978825]
Two polymer classes visual (VIS) and near-infrared (NIR) are defined based on the size of the polymer gaps.
We present a hybrid classical-quantum approach to the binary classification of polymer structures.
arXiv Detail & Related papers (2022-11-22T11:59:52Z) - TransPolymer: a Transformer-based language model for polymer property
predictions [9.04563945965023]
TransPolymer is a Transformer-based language model for polymer property prediction.
Our proposed polymer tokenizer with chemical awareness enables learning representations from polymer sequences.
arXiv Detail & Related papers (2022-09-03T01:29:59Z) - Representing Polymers as Periodic Graphs with Learned Descriptors for
Accurate Polymer Property Predictions [16.468017785818198]
We develop a periodic polymer graph representation that consistently outperforms hand-designed representations.
We also demonstrate how combining polymer graph representations with message-passing neural network architectures can automatically extract meaningful polymer features.
arXiv Detail & Related papers (2022-05-27T04:14:12Z) - A graph representation of molecular ensembles for polymer property
prediction [3.032184156362992]
In contrast to organic molecules, polymers are often not well-defined single structures but an ensemble of similar molecules.
We introduce a graph representation of molecular ensembles and an associated graph neural network architecture that is tailored to polymer property prediction.
arXiv Detail & Related papers (2022-05-17T20:31:43Z) - Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers [124.01928050651466]
We propose a new type of polyp segmentation method, named Polyp-PVT.
The proposed model, named Polyp-PVT, effectively suppresses noises in the features and significantly improves their expressive capabilities.
arXiv Detail & Related papers (2021-08-16T07:09:06Z) - GeoMol: Torsional Geometric Generation of Molecular 3D Conformer
Ensembles [60.12186997181117]
Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery.
Existing generative models have several drawbacks including lack of modeling important molecular geometry elements.
We propose GeoMol, an end-to-end, non-autoregressive and SE(3)-invariant machine learning approach to generate 3D conformers.
arXiv Detail & Related papers (2021-06-08T14:17:59Z) - BIGDML: Towards Exact Machine Learning Force Fields for Materials [55.944221055171276]
Machine-learning force fields (MLFF) should be accurate, computationally and data efficient, and applicable to molecules, materials, and interfaces thereof.
Here, we introduce the Bravais-Inspired Gradient-Domain Machine Learning approach and demonstrate its ability to construct reliable force fields using a training set with just 10-200 atoms.
arXiv Detail & Related papers (2021-06-08T10:14:57Z) - Copolymer Informatics with Multi-Task Deep Neural Networks [0.0]
We address the property prediction challenge for copolymers, extending the polymer informatics framework beyond homopolymers.
A large data set containing over 18,000 data points of glass transition, melting, and degradation temperature of homopolymers and copolymers of up to two monomers is used.
The developed models are accurate, fast, flexible, and scalable to more copolymer properties when suitable data become available.
arXiv Detail & Related papers (2021-03-25T23:28:20Z) - Polymers for Extreme Conditions Designed Using Syntax-Directed
Variational Autoencoders [53.34780987686359]
Machine learning tools are now commonly employed to virtually screen material candidates with desired properties.
This approach is inefficient, and severely constrained by the candidates that human imagination can conceive.
We utilize syntax-directed variational autoencoders (VAE) in tandem with Gaussian process regression (GPR) models to discover polymers expected to be robust under three extreme conditions.
arXiv Detail & Related papers (2020-11-04T21:36:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.