Unifying Polymer Modeling and Design via a Conformation-Centric Generative Foundation Model
- URL: http://arxiv.org/abs/2510.16023v1
- Date: Wed, 15 Oct 2025 17:11:44 GMT
- Title: Unifying Polymer Modeling and Design via a Conformation-Centric Generative Foundation Model
- Authors: Fanmeng Wang, Shan Mei, Wentao Guo, Hongshuai Wang, Qi Ou, Zhifeng Gao, Hongteng Xu
- Abstract summary: PolyConFM is a polymer foundation model that unifies polymer modeling and design through conformation-centric pretraining. We construct the first high-quality polymer conformation dataset via molecular dynamics simulations to mitigate data sparsity. Experiments demonstrate that PolyConFM consistently outperforms representative task-specific methods on diverse downstream tasks.
- Score: 29.414571977709098
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Polymers, macromolecules formed from covalently bonded monomers, underpin countless technologies and are indispensable to modern life. While deep learning is advancing polymer science, existing methods typically represent the whole polymer solely through monomer-level descriptors, overlooking the global structural information inherent in polymer conformations, which ultimately limits their practical performance. Moreover, this field still lacks a universal foundation model that can effectively support diverse downstream tasks, thereby severely constraining progress. To address these challenges, we introduce PolyConFM, the first polymer foundation model that unifies polymer modeling and design through conformation-centric generative pretraining. Recognizing that each polymer conformation can be decomposed into a sequence of local conformations (i.e., those of its repeating units), we pretrain PolyConFM under the conditional generation paradigm, reconstructing these local conformations via masked autoregressive (MAR) modeling and further generating their orientation transformations to recover the corresponding polymer conformation. Besides, we construct the first high-quality polymer conformation dataset via molecular dynamics simulations to mitigate data sparsity, thereby enabling conformation-centric pretraining. Experiments demonstrate that PolyConFM consistently outperforms representative task-specific methods on diverse downstream tasks, equipping polymer science with a universal and powerful tool.
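The decomposition described in the abstract (a polymer conformation as a sequence of local conformations, each placed by an orientation transformation) can be illustrated with a minimal sketch. This is not PolyConFM's implementation: the learned generative steps are replaced by fixed toy inputs, each orientation transformation is assumed to be a rigid rotation plus translation, and all names (`rotation_z`, `apply_transform`, `assemble_polymer`) are hypothetical.

```python
import math

def rotation_z(theta):
    """3x3 rotation matrix about the z-axis (hypothetical helper)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_transform(rotation, translation, point):
    """Map a local-frame point into the global frame: R @ p + t."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

def assemble_polymer(local_conformations, transforms):
    """Place each repeating unit's local coordinates into one global frame.

    local_conformations: one list of (x, y, z) atom positions per unit.
    transforms: one (rotation, translation) pair per unit; assumed here to
    be rigid transforms standing in for generated orientation transformations.
    """
    if len(local_conformations) != len(transforms):
        raise ValueError("need exactly one transform per repeating unit")
    polymer = []
    for coords, (rotation, translation) in zip(local_conformations, transforms):
        polymer.extend(apply_transform(rotation, translation, p) for p in coords)
    return polymer

# Toy input: a two-atom repeating unit used three times.
unit = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
transforms = [
    (rotation_z(0.0), (0.0, 0.0, 0.0)),
    (rotation_z(math.pi / 2), (2.0, 0.0, 0.0)),
    (rotation_z(math.pi), (2.0, 2.0, 0.0)),
]
polymer = assemble_polymer([unit] * 3, transforms)
```

In this toy setup, `polymer` holds six atom positions in a shared frame. In the model described by the abstract, both the local coordinates and the transformations would come from the generative model rather than being fixed by hand.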
Related papers
- PolySet: Restoring the Statistical Ensemble Nature of Polymers for Machine Learning [0.0]
We introduce PolySet, a framework that represents a polymer as a finite, weighted ensemble of chains sampled from an assumed molar-mass distribution. By explicitly acknowledging the statistical nature of polymer matter, PolySet establishes a physically grounded foundation for future polymer machine learning.
(2025-12-15T10:50:48Z)
- Foundation Models for Discovery and Exploration in Chemical Space [57.97784111110166]
MIST is a family of molecular foundation models trained on large unlabeled datasets. We demonstrate the ability of these models to solve real-world problems across chemical space.
(2025-10-20T17:56:01Z)
- polyGen: A Learning Framework for Atomic-level Polymer Structure Generation [4.6516580885528835]
We introduce polyGen, the first generative model designed specifically for polymer structures from minimal inputs such as repeat unit chemistry alone. polyGen overcomes the limitations of traditional crystal structure prediction methods for polymers, successfully generating realistic and diverse linear and branched conformations. As the first atomic-level proof-of-concept capturing intrinsic polymer flexibility, it marks a new capability in material structure generation.
(2025-04-24T15:26:00Z)
- PolyConf: Unlocking Polymer Conformation Generation through Hierarchical Generative Models [28.480039088875635]
PolyConf is a pioneering tailored polymer conformation generation method. We decompose the polymer conformation into a series of local conformations, generating these local conformations through an autoregressive model. We then generate their orientation transformations via a diffusion model to assemble them into the complete polymer conformation.
(2025-04-11T07:12:02Z)
- Multimodal machine learning with large language embedding model for polymer property prediction [2.525624865489335]
We propose a simple yet effective multimodal architecture, PolyLLMem, for polymer properties prediction tasks. PolyLLMem integrates text embeddings generated by Llama 3 with molecular structure embeddings derived from Uni-Mol. Its performance is comparable to, and in some cases exceeds, that of graph-based models, as well as transformer-based models.
(2025-03-29T03:48:11Z)
- GeoMFormer: A General Architecture for Geometric Molecular Representation Learning [84.02083170392764]
We introduce a novel Transformer-based molecular model called GeoMFormer to achieve this goal.
We show that GeoMFormer achieves strong performance on both invariant and equivariant tasks of different types and scales.
(2024-06-24T17:58:13Z)
- UniIF: Unified Molecule Inverse Folding [67.60267592514381]
We propose a unified model UniIF for inverse folding of all molecules.
Our proposed method surpasses state-of-the-art methods on all tasks.
(2024-05-29T10:26:16Z)
- E(3)-equivariant models cannot learn chirality: Field-based molecular generation [51.327048911864885]
Chirality plays a key role in determining drug safety and potency. We introduce a novel field-based representation, proposing reference rotations that replace rotational symmetry constraints. The proposed model captures all molecular geometries including chirality, while still achieving highly competitive performance with E(3)-based methods across standard benchmarking metrics.
(2024-02-24T17:13:58Z)
- TransPolymer: a Transformer-based language model for polymer property predictions [9.04563945965023]
TransPolymer is a Transformer-based language model for polymer property prediction.
Our proposed polymer tokenizer with chemical awareness enables learning representations from polymer sequences.
(2022-09-03T01:29:59Z)
- A graph representation of molecular ensembles for polymer property prediction [3.032184156362992]
In contrast to organic molecules, polymers are often not well-defined single structures but an ensemble of similar molecules.
We introduce a graph representation of molecular ensembles and an associated graph neural network architecture that is tailored to polymer property prediction.
(2022-05-17T20:31:43Z)
- Molecular Attributes Transfer from Non-Parallel Data [57.010952598634944]
We formulate molecular optimization as a style transfer problem and present a novel generative model that could automatically learn internal differences between two groups of non-parallel data.
Experiments on two molecular optimization tasks, toxicity modification and synthesizability improvement, demonstrate that our model significantly outperforms several state-of-the-art methods.
(2021-11-30T06:10:22Z)
- Geometric Transformer for End-to-End Molecule Properties Prediction [92.28929858529679]
We introduce a Transformer-based architecture for molecule property prediction, which is able to capture the geometry of the molecule.
We modify the classical positional encoder by an initial encoding of the molecule geometry, as well as a learned gated self-attention mechanism.
(2021-10-26T14:14:40Z)
- Learning Neural Generative Dynamics for Molecular Conformation Generation [89.03173504444415]
We study how to generate molecule conformations (i.e., 3D structures) from a molecular graph.
We propose a novel probabilistic framework to generate valid and diverse conformations given a molecular graph.
(2021-02-20T03:17:58Z)