Multi-Task Multi-Fidelity Learning of Properties for Energetic Materials
- URL: http://arxiv.org/abs/2408.14488v1
- Date: Wed, 21 Aug 2024 12:54:26 GMT
- Title: Multi-Task Multi-Fidelity Learning of Properties for Energetic Materials
- Authors: Robert J. Appleton, Daniel Klinger, Brian H. Lee, Michael Taylor, Sohee Kim, Samuel Blankenship, Brian C. Barnes, Steven F. Son, Alejandro Strachan
- Abstract summary: We find that multi-task neural networks can learn from multi-modal data and outperform single-task models trained for specific properties.
As expected, the improvement is more significant for data-scarce properties.
This approach is widely applicable to fields outside energetic materials.
- Score: 34.8008617873679
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data science and artificial intelligence are playing an increasingly important role in the physical sciences. Unfortunately, in the field of energetic materials, data scarcity limits the accuracy and even applicability of ML tools. To address data limitations, we compiled multi-modal data: both experimental and computational results for several properties. We find that multi-task neural networks can learn from multi-modal data and outperform single-task models trained for specific properties. As expected, the improvement is more significant for data-scarce properties. These models are trained using descriptors built from simple molecular information and can be readily applied for large-scale materials screening to explore multiple properties simultaneously. This approach is widely applicable to fields outside energetic materials.
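The abstract describes multi-task neural networks that learn several properties at once from mixed experimental and computational data, using descriptors built from simple molecular information. No implementation details are given here, so the following is only a minimal sketch of that idea: a shared trunk over molecular descriptors with one regression head per property, trained with a masked loss so molecules lacking labels for a data-scarce property still contribute through the other tasks. The framework (PyTorch), layer sizes, property names, and the masked-MSE loss are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of a multi-task property network (illustrative only, not the paper's code).
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, n_descriptors: int, property_names: list[str], hidden: int = 128):
        super().__init__()
        # Shared trunk: a common representation learned from molecular descriptors.
        self.trunk = nn.Sequential(
            nn.Linear(n_descriptors, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One scalar regression head per property (property names are hypothetical).
        self.heads = nn.ModuleDict({name: nn.Linear(hidden, 1) for name in property_names})

    def forward(self, x: torch.Tensor) -> dict[str, torch.Tensor]:
        z = self.trunk(x)
        return {name: head(z).squeeze(-1) for name, head in self.heads.items()}

def masked_mse(preds, targets, masks):
    """Mean-squared error averaged over properties, ignoring missing labels (mask == 0)."""
    losses = []
    for name, pred in preds.items():
        m = masks[name].float()
        if m.sum() > 0:
            losses.append(((pred - targets[name]) ** 2 * m).sum() / m.sum())
    return torch.stack(losses).mean()

# Usage sketch: 8 molecules, 64 descriptors each; one property has labels for
# only some molecules, mimicking a data-scarce experimental measurement.
model = MultiTaskNet(n_descriptors=64, property_names=["density", "detonation_velocity"])
x = torch.randn(8, 64)
y = {"density": torch.randn(8), "detonation_velocity": torch.randn(8)}
mask = {"density": torch.ones(8), "detonation_velocity": (torch.rand(8) > 0.5)}
loss = masked_mse(model(x), y, mask)
loss.backward()
```

The shared trunk is what lets abundant labels for one property inform predictions of scarcer ones, consistent with the abstract's observation that data-scarce properties benefit most from the multi-task setup.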
Related papers
- Physical Consistency Bridges Heterogeneous Data in Molecular Multi-Task Learning [79.75718786477638]
We exploit a distinctive feature of molecular tasks, namely that physical laws connect them, and design consistency training approaches around those laws (a minimal illustration of this idea appears after this list).
We demonstrate that the more accurate energy data can improve the accuracy of structure prediction.
We also find that consistency training can directly leverage force and off-equilibrium structure data to improve structure prediction.
arXiv Detail & Related papers (2024-10-14T03:11:33Z)
- MMSci: A Dataset for Graduate-Level Multi-Discipline Multimodal Scientific Understanding [59.41495657570397]
This dataset includes figures such as schematic diagrams, simulated images, macroscopic/microscopic photos, and experimental visualizations.
We developed benchmarks for scientific figure captioning and multiple-choice questions, evaluating six proprietary and over ten open-source models.
The dataset and benchmarks will be released to support further research.
arXiv Detail & Related papers (2024-07-06T00:40:53Z)
- Multimodal Learning for Materials [7.167520424757711]
We introduce Multimodal Learning for Materials (MultiMat), which enables self-supervised multi-modality training of foundation models for materials.
We demonstrate our framework's potential using data from the Materials Project database on multiple axes.
arXiv Detail & Related papers (2023-11-30T18:35:29Z)
- MatSciML: A Broad, Multi-Task Benchmark for Solid-State Materials Modeling [7.142619575624596]
MatSci ML is a benchmark for modeling MATerials SCIence using Machine Learning (MatSci ML) methods.
MatSci ML provides a diverse set of materials systems and properties data for model training and evaluation.
In the multi-dataset learning setting, MatSci ML enables researchers to combine observations from multiple datasets to perform joint prediction of common properties.
arXiv Detail & Related papers (2023-09-12T03:08:37Z)
- Multimodal machine learning for materials science: composition-structure bimodal learning for experimentally measured properties [4.495968252019426]
This paper introduces a novel approach to multimodal machine learning in materials science via composition-structure bimodal learning.
The proposed COmposition-Structure Bimodal Network (COSNet) is designed to enhance learning and predictions of experimentally measured materials properties that have incomplete structure information.
arXiv Detail & Related papers (2023-08-04T02:04:52Z)
- Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Tyger: Task-Type-Generic Active Learning for Molecular Property Prediction [121.97742787439546]
How to accurately predict the properties of molecules is an essential problem in AI-driven drug discovery.
To reduce annotation cost, deep active learning methods are developed to select only the most representative and informative data for annotation.
We propose a Task-type-generic active learning framework (termed Tyger) that is able to handle different types of learning tasks in a unified manner.
arXiv Detail & Related papers (2022-05-23T12:56:12Z)
- Audacity of huge: overcoming challenges of data scarcity and data quality for machine learning in computational materials discovery [1.0036312061637764]
Machine learning (ML)-accelerated discovery requires large amounts of high-fidelity data to reveal predictive structure-property relationships.
For many properties of interest in materials discovery, the challenging nature and high cost of data generation has resulted in a data landscape that is scarcely populated and of dubious quality.
In the absence of manual curation, increasingly sophisticated natural language processing and automated image analysis are making it possible to learn structure-property relationships from the literature.
arXiv Detail & Related papers (2021-11-02T21:43:58Z)
- Polymer Informatics with Multi-Task Learning [0.06524460254566902]
We show the potency of multi-task learning approaches that exploit inherent correlations effectively.
Data pertaining to 36 different properties of over 13,000 polymers are coalesced and supplied to deep-learning multi-task architectures.
The multi-task approach is accurate, efficient, scalable, and amenable to transfer learning as more data on the same or different properties become available.
arXiv Detail & Related papers (2020-10-28T18:28:12Z)
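As a rough illustration of the physical-law coupling described in the "Physical Consistency Bridges Heterogeneous Data in Molecular Multi-Task Learning" entry above, the sketch below penalizes disagreement between a model's predicted forces and the negative gradient of its predicted energy, F = -dE/dR. It is not that paper's actual method; the toy energy and force models, the PyTorch autograd usage, and the squared-error penalty are assumptions made only for illustration.

```python
# Illustrative energy-force consistency penalty (not the cited paper's method).
import torch

def consistency_loss(energy_model, force_model, positions: torch.Tensor) -> torch.Tensor:
    """Penalize mismatch between predicted forces and -dE/dR of the predicted energy."""
    positions = positions.clone().requires_grad_(True)
    energy = energy_model(positions).sum()            # total predicted energy over the batch
    # Differentiate the predicted energy with respect to atomic positions.
    grad_e, = torch.autograd.grad(energy, positions, create_graph=True)
    predicted_forces = force_model(positions)
    return ((predicted_forces + grad_e) ** 2).mean()  # zero when F = -dE/dR exactly

# Toy stand-in models (hypothetical; real models would be neural network potentials).
energy_model = lambda r: (r ** 2).sum(dim=(-2, -1))   # E(R): one scalar per structure
force_model = lambda r: -2.0 * r                      # F(R): analytically consistent with E
r = torch.randn(4, 10, 3)                             # 4 structures, 10 atoms, xyz coordinates
print(consistency_loss(energy_model, force_model, r)) # ~0 because the toy pair is consistent
```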