MatterTune: An Integrated, User-Friendly Platform for Fine-Tuning Atomistic Foundation Models to Accelerate Materials Simulation and Discovery
- URL: http://arxiv.org/abs/2504.10655v1
- Date: Mon, 14 Apr 2025 19:12:43 GMT
- Title: MatterTune: An Integrated, User-Friendly Platform for Fine-Tuning Atomistic Foundation Models to Accelerate Materials Simulation and Discovery
- Authors: Lingyu Kong, Nima Shoghi, Guoxiang Hu, Pan Li, Victor Fung,
- Abstract summary: We introduce MatterTune, a framework that provides advanced fine-tuning capabilities and seamless integration of atomistic foundation models into downstream materials informatics and simulation workflows. MatterTune supports a number of state-of-the-art foundation models such as ORB, MatterSim, JMP, and EquiformerV2.
- Score: 7.1240120153291535
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Geometric machine learning models such as graph neural networks have achieved remarkable success in recent years in chemical and materials science research for applications such as high-throughput virtual screening and atomistic simulations. The success of these models can be attributed to their ability to effectively learn latent representations of atomic structures directly from the training data. Conversely, this also results in high data requirements for these models, hindering their application to data-sparse problems, which are common in this domain. To address this limitation, there has been growing development of pre-trained machine learning models which have learned general, fundamental, geometric relationships in atomistic data, and which can then be fine-tuned on much smaller application-specific datasets. In particular, models which are pre-trained on diverse, large-scale atomistic datasets have shown impressive generalizability and flexibility to downstream applications, and are increasingly referred to as atomistic foundation models. To leverage the untapped potential of these foundation models, we introduce MatterTune, a modular and extensible framework that provides advanced fine-tuning capabilities and seamless integration of atomistic foundation models into downstream materials informatics and simulation workflows, thereby lowering the barriers to adoption and facilitating diverse applications in materials science. In its current state, MatterTune supports a number of state-of-the-art foundation models such as ORB, MatterSim, JMP, and EquiformerV2, and hosts a wide range of features including a modular and flexible design, distributed and customizable fine-tuning, broad support for downstream informatics tasks, and more.
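The fine-tuning pattern the abstract describes, adapting a backbone pre-trained on large, diverse data to a small application-specific dataset, can be illustrated with a minimal sketch. This is not MatterTune's actual API; the backbone here is a hypothetical stand-in (a fixed random projection), and only a lightweight head is fit on the small dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen pretrained backbone: a fixed random projection.
# (Hypothetical; a real foundation model would supply learned features.)
W_backbone = rng.normal(size=(8, 32))

def featurize(x):
    # Frozen representation -- the backbone weights are never updated.
    return np.tanh(x @ W_backbone)

# Small application-specific dataset (synthetic): 20 samples with 8
# structure descriptors each, and a scalar target property.
X = rng.normal(size=(20, 8))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=20)

# "Fine-tuning" here means fitting only a lightweight head on top of the
# frozen features -- ridge regression, solved in closed form.
Phi = featurize(X)
lam = 1e-2
head = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y)

pred = Phi @ head
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
print(f"train RMSE: {rmse:.4f}")
```

Frameworks like MatterTune generalize this idea: the frozen (or partially unfrozen) backbone supplies transferable representations, and only a small number of task-specific parameters need to be learned from the sparse downstream data.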
Related papers
- Scaling Laws of Graph Neural Networks for Atomistic Materials Modeling [11.61369154220932]
Atomistic materials modeling is a critical task with wide-ranging applications, from drug discovery to materials science. Graph Neural Networks (GNNs) represent the state-of-the-art approach for modeling atomistic material data. GNNs for atomistic materials modeling remain relatively small compared to large language models (LLMs), which leverage billions of parameters and terabyte-scale datasets.
arXiv Detail & Related papers (2025-04-10T20:19:20Z) - A preliminary data fusion study to assess the feasibility of Foundation Process-Property Models in Laser Powder Bed Fusion [0.0]
A major challenge that impedes the construction of foundation process-property models is data scarcity. We generate experimental datasets from 17-4 PH and 316L stainless steels (SSs) in Laser Powder Bed Fusion (LPBF). We then leverage Gaussian processes (GPs) for process-property modeling in various configurations to test if knowledge about one material system or property can be leveraged to build more accurate machine learning models for other material systems or properties.
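The data-fusion question this summary raises, whether data from one material system can improve a model for another, can be sketched with a toy GP comparison. The setup below is entirely synthetic (hypothetical "alloys" with a shared trend, not the paper's 17-4 PH/316L data), and the fused model simply pools the two datasets under the assumption of a shared trend, a much simpler scheme than the paper's configurations:

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf_kernel(A, B, length=0.3, var=1.0):
    # Squared-exponential covariance between the rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / length ** 2)

def gp_posterior_mean(X_train, y_train, X_test, noise=1e-2):
    # Standard zero-mean GP regression posterior mean.
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    return rbf_kernel(X_test, X_train) @ np.linalg.solve(K, y_train)

# Hypothetical process-property data: a normalized process parameter vs. a
# property for two alloys sharing a common trend (purely synthetic numbers).
x_a = rng.uniform(0, 1, size=(15, 1)); y_a = np.sin(3 * x_a[:, 0])
x_b = rng.uniform(0, 1, size=(4, 1));  y_b = np.sin(3 * x_b[:, 0]) + 0.05

x_test = np.linspace(0, 1, 50)[:, None]
truth_b = np.sin(3 * x_test[:, 0]) + 0.05

# Single-source model: trained only on alloy B's four points.
pred_single = gp_posterior_mean(x_b, y_b, x_test)

# Fused model: naively pool both alloys' data, assuming a shared trend.
pred_fused = gp_posterior_mean(np.vstack([x_a, x_b]),
                               np.concatenate([y_a, y_b]), x_test)

err_single = float(np.sqrt(np.mean((pred_single - truth_b) ** 2)))
err_fused = float(np.sqrt(np.mean((pred_fused - truth_b) ** 2)))
print(f"RMSE on alloy B -- single: {err_single:.3f}, fused: {err_fused:.3f}")
```

Comparing the two errors shows how much (or how little) the data-rich system helps the data-poor one; when the shared-trend assumption fails, pooling can instead hurt, which is why the paper tests multiple fusion configurations.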
arXiv Detail & Related papers (2025-03-20T19:29:38Z) - SMPLest-X: Ultimate Scaling for Expressive Human Pose and Shape Estimation [81.36747103102459]
Expressive human pose and shape estimation (EHPS) unifies body, hands, and face motion capture with numerous applications. Current state-of-the-art methods focus on training innovative architectural designs on confined datasets. We investigate the impact of scaling up EHPS towards a family of generalist foundation models.
arXiv Detail & Related papers (2025-01-16T18:59:46Z) - On Foundation Models for Dynamical Systems from Purely Synthetic Data [5.004576576202551]
Foundation models have demonstrated remarkable generalization, data efficiency, and robustness properties across various domains. These models are available in fields like natural language processing and computer vision, but do not exist for dynamical systems. We address this challenge by pretraining a transformer-based foundation model exclusively on synthetic data. Our results demonstrate the feasibility of foundation models for dynamical systems that outperform specialist models in terms of generalization, data efficiency, and robustness.
arXiv Detail & Related papers (2024-11-30T08:34:10Z) - Scalable Diffusion for Materials Generation [99.71001883652211]
We develop a unified crystal representation that can represent any crystal structure (UniMat)
UniMat can generate high fidelity crystal structures from larger and more complex chemical systems.
We propose additional metrics for evaluating generative models of materials.
arXiv Detail & Related papers (2023-10-18T15:49:39Z) - StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized
Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z) - Calibrating constitutive models with full-field data via physics
informed neural networks [0.0]
We propose a physics-informed deep-learning framework for the discovery of model parameterizations given full-field displacement data.
We work with the weak form of the governing equations rather than the strong form to impose physical constraints upon the neural network predictions.
We demonstrate that informed machine learning is an enabling technology and may shift the paradigm of how full-field experimental data is utilized to calibrate models.
arXiv Detail & Related papers (2022-03-30T18:07:44Z) - SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z) - Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z) - Physics-Integrated Variational Autoencoders for Robust and Interpretable
Generative Modeling [86.9726984929758]
We focus on the integration of incomplete physics models into deep generative models.
We propose a VAE architecture in which a part of the latent space is grounded by physics.
We demonstrate generative performance improvements over a set of synthetic and real-world datasets.
arXiv Detail & Related papers (2021-02-25T20:28:52Z) - Gradient-Based Training and Pruning of Radial Basis Function Networks
with an Application in Materials Physics [0.24792948967354234]
We propose a gradient-based technique for training radial basis function networks with an efficient and scalable open-source implementation.
We derive novel closed-form optimization criteria for pruning the models for continuous as well as binary data.
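The two ingredients this summary names, gradient-based training of an RBF network and subsequent pruning, can be sketched in a few lines. The pruning step below is a simple magnitude-based stand-in, not the paper's closed-form criteria, and the centers/width are fixed for brevity:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy 1D regression target.
x = np.linspace(-2, 2, 64)[:, None]
y = np.sin(2 * x[:, 0])

# RBF network: y_hat(x) = sum_j w_j * exp(-||x - c_j||^2 / (2 s^2)),
# with fixed centers/width and gradient-trained output weights.
centers = np.linspace(-2, 2, 10)[:, None]
s = 0.5
w = rng.normal(scale=0.1, size=10)

d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
Phi = np.exp(-0.5 * d2 / s ** 2)           # (64, 10) design matrix

lr = 0.2
for _ in range(3000):
    grad = Phi.T @ (Phi @ w - y) / len(x)  # gradient of 0.5 * MSE w.r.t. w
    w -= lr * grad

rmse = float(np.sqrt(np.mean((Phi @ w - y) ** 2)))

# Magnitude-based pruning (a simple stand-in for the paper's closed-form
# criteria): keep the 6 largest-|w| basis functions and refit them.
keep = np.argsort(-np.abs(w))[:6]
w_pruned, *_ = np.linalg.lstsq(Phi[:, keep], y, rcond=None)
rmse_pruned = float(np.sqrt(np.mean((Phi[:, keep] @ w_pruned - y) ** 2)))
print(f"RMSE full: {rmse:.3f}, pruned (6 of 10 units): {rmse_pruned:.3f}")
```

Because the network is linear in its output weights, the gradient step above is plain least-squares gradient descent; the paper's contribution is making this training efficient and scalable and deriving principled, closed-form pruning rules in place of the heuristic used here.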
arXiv Detail & Related papers (2020-04-06T11:32:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.