Semi-supervised teacher-student deep neural network for materials
discovery
- URL: http://arxiv.org/abs/2112.06142v1
- Date: Sun, 12 Dec 2021 04:00:21 GMT
- Title: Semi-supervised teacher-student deep neural network for materials
discovery
- Authors: Daniel Gleaves, Edirisuriya M. Dilanga Siriwardane, Yong Zhao, Nihang
Fu, Jianjun Hu
- Abstract summary: We propose a semi-supervised deep neural network (TSDNN) model for high-performance formation energy and synthesizability prediction.
For formation energy based stability screening, our model achieves an absolute 10.3% accuracy improvement compared to the baseline CGCNN regression model.
For synthesizability prediction, our model significantly increases the baseline PU learning's true positive rate from 87.9% to 97.9% while using 1/49 of the model parameters.
- Score: 6.333015476935593
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Data-driven generative machine learning models have recently emerged as one
of the most promising approaches for new materials discovery. While the
generator models can generate millions of candidates, it is critical to train
fast and accurate machine learning models to filter out stable, synthesizable
materials with desired properties. However, such efforts to build supervised
regression or classification screening models have been severely hindered by
the lack of unstable or unsynthesizable samples, which usually are not
collected and deposited in materials databases such as ICSD and Materials
Project (MP). At the same time, there is a significant amount of unlabeled
data available in these databases. Here we propose a semi-supervised deep
neural network (TSDNN) model for high-performance formation energy and
synthesizability prediction, which is achieved via its unique teacher-student
dual network architecture and its effective exploitation of the large amount of
unlabeled data. For formation energy based stability screening, our
semi-supervised classifier achieves an absolute 10.3% accuracy improvement
compared to the baseline CGCNN regression model. For synthesizability
prediction, our model significantly increases the baseline PU learning's true
positive rate from 87.9% to 97.9% while using 1/49 of the model parameters.
To further prove the effectiveness of our models, we combined our
TSDNN-energy and TSDNN-synthesizability models with our CubicGAN generator to
discover novel stable cubic structures. Of the 1000 candidate samples
recommended by our models, 512 have negative formation energies as
validated by our DFT formation energy calculations. Our experimental results
show that our semi-supervised deep neural networks can significantly improve
the screening accuracy in large-scale generative materials design.
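As a rough illustration of the teacher-student pattern the abstract describes, here is a minimal, hypothetical PyTorch sketch: a teacher network pseudo-labels unlabeled samples, a student trains on labeled plus confidently pseudo-labeled data, and the teacher slowly tracks the student. The `MLP` stand-in, the confidence threshold, and the EMA update are illustrative assumptions, not the authors' TSDNN architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative stand-in for the paper's crystal-graph networks (hypothetical).
class MLP(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, x):
        return self.net(x)

teacher, student = MLP(), MLP()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

def train_step(x_lab, y_lab, x_unlab, threshold=0.95, ema=0.99):
    # 1) Teacher pseudo-labels the unlabeled batch.
    with torch.no_grad():
        probs = F.softmax(teacher(x_unlab), dim=1)
        conf, pseudo = probs.max(dim=1)
        keep = conf > threshold  # keep only confident pseudo-labels

    # 2) Student trains on labeled data plus confident pseudo-labels.
    loss = F.cross_entropy(student(x_lab), y_lab)
    if keep.any():
        loss = loss + F.cross_entropy(student(x_unlab[keep]), pseudo[keep])
    opt.zero_grad()
    loss.backward()
    opt.step()

    # 3) Teacher tracks the student via an exponential moving average
    #    (a common choice; the paper's exact update may differ).
    with torch.no_grad():
        for pt, ps in zip(teacher.parameters(), student.parameters()):
            pt.mul_(ema).add_(ps, alpha=1.0 - ema)
    return loss.item()
```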
Related papers
- An Investigation on Machine Learning Predictive Accuracy Improvement and Uncertainty Reduction using VAE-based Data Augmentation [2.517043342442487]
Deep generative learning uses certain ML models to learn the underlying distribution of existing data and generate synthetic samples that resemble the real data.
In this study, our objective is to evaluate the effectiveness of data augmentation using variational autoencoder (VAE)-based deep generative models.
We investigated whether the data augmentation leads to improved accuracy in the predictions of a deep neural network (DNN) model trained using the augmented data.
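A compact sketch of the VAE-based augmentation idea follows, under assumed dimensions and a plain MSE reconstruction term; the paper's exact architecture and loss weighting are not reproduced here.

```python
import torch
import torch.nn as nn

# Minimal VAE over tabular features; all dimensions are made up.
class VAE(nn.Module):
    def __init__(self, d_in=16, d_z=4):
        super().__init__()
        self.enc = nn.Linear(d_in, 32)
        self.mu = nn.Linear(32, d_z)
        self.logvar = nn.Linear(32, d_z)
        self.dec = nn.Sequential(nn.Linear(d_z, 32), nn.ReLU(), nn.Linear(32, d_in))

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    recon = ((x - x_hat) ** 2).sum()                           # reconstruction
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum()  # KL to prior
    return recon + kl

# After training, synthetic samples for augmenting the DNN's training
# set come from decoding draws from the prior, e.g.:
#   z = torch.randn(n_new, 4); x_synth = vae.dec(z)
```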
arXiv Detail & Related papers (2024-10-24T18:15:48Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Downstream Task-Oriented Generative Model Selections on Synthetic Data Training for Fraud Detection Models [9.754400681589845]
In this paper, we approach the downstream task-oriented generative model selections problem in the case of training fraud detection models.
Our investigation supports that, while both Neural Network (NN)-based and Bayesian Network (BN)-based generative models perform well on the synthetic training task under a loose model-interpretability constraint, BN-based generative models outperform NN-based ones when training fraud detection models on synthetic data under a strict model-interpretability constraint.
arXiv Detail & Related papers (2024-01-01T23:33:56Z)
- On the Stability of Iterative Retraining of Generative Models on their own Data [56.153542044045224]
We study the impact of training generative models on mixed datasets.
We first prove the stability of iterative training under the condition that the initial generative models approximate the data distribution well enough.
We empirically validate our theory on both synthetic and natural images by iteratively training normalizing flows and state-of-the-art diffusion models.
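The stabilizing effect of keeping real data in the mix can be seen even in a toy, hypothetical setting where the "generative model" is just a fitted Gaussian, a deliberate simplification of the paper's normalizing-flow and diffusion experiments.

```python
import numpy as np

# Toy model: "training" a generator = fitting a Gaussian to the data.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=5000)   # stand-in for the real dataset

mu, sigma = real.mean(), real.std()      # initial generative model
for step in range(10):
    synth = rng.normal(mu, sigma, size=5000)    # sample from current model
    mixed = np.concatenate([real, synth])       # retrain on real + synthetic
    mu, sigma = mixed.mean(), mixed.std()
    print(step, round(mu, 3), round(sigma, 3))  # stays near (0, 1)
```

Because half of every retraining set is anchored to the original data, the fitted parameters contract toward the true distribution instead of drifting, which is the intuition behind the paper's stability condition.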
arXiv Detail & Related papers (2023-09-30T16:41:04Z)
- An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design [55.62660894625669]
Lithography modeling is a crucial problem in chip design to ensure a chip design mask is manufacturable.
Recent developments in machine learning have provided alternative solutions in replacing the time-consuming lithography simulations with deep neural networks.
We propose a litho-aware data augmentation framework to resolve the dilemma of limited data and improve the machine learning model performance.
arXiv Detail & Related papers (2022-10-27T20:53:39Z)
- Atrial Fibrillation Detection Using Weight-Pruned, Log-Quantised Convolutional Neural Networks [25.160063477248904]
A convolutional neural network model is developed for detecting atrial fibrillation from electrocardiogram signals.
The model demonstrates high performance despite being trained on limited, variable-length input data.
The final model achieved a 91.1% model compression ratio while maintaining a high model accuracy of 91.7%, with less than 1% accuracy loss.
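A hypothetical PyTorch sketch of the two compression steps named in the title, magnitude pruning followed by log (power-of-two) quantization; the pruning ratio and quantization details are assumptions, not the paper's reported configuration.

```python
import torch

def prune_and_log_quantize(w, sparsity=0.9):
    """Magnitude-prune a weight tensor, then snap the surviving weights
    to the nearest power of two (log quantization). Illustrative only."""
    # Magnitude pruning: zero the smallest |w| values.
    k = int(sparsity * w.numel())
    thresh = w.abs().flatten().kthvalue(k).values
    mask = w.abs() > thresh
    # Log quantization: sign * 2^round(log2(|w|)) for the survivors.
    exp = torch.round(torch.log2(w.abs().clamp(min=1e-8)))
    quant = w.sign() * torch.pow(2.0, exp)
    return torch.where(mask, quant, torch.zeros_like(w))

w = torch.randn(64, 64)
wq = prune_and_log_quantize(w)
print(float((wq == 0).float().mean()))  # realized sparsity, ~0.9
```

Power-of-two weights are attractive for this kind of edge deployment because multiplications reduce to bit shifts in hardware.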
arXiv Detail & Related papers (2022-06-14T11:47:04Z)
- EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce EINNs, a new class of physics-informed neural networks crafted for epidemic forecasting.
We investigate how to leverage both the theoretical flexibility provided by mechanistic models and the data-driven expressivity afforded by AI models.
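In spirit, the mechanistic coupling resembles the generic physics-informed-network recipe sketched below, where an assumed SIR-style residual regularizes a network that maps time to infections. This is a simplification for illustration, not the EINN formulation itself; `beta`, `gamma`, and the susceptible approximation are all assumptions.

```python
import torch
import torch.nn as nn

# Network maps time t to predicted infections I(t); beta/gamma assumed known.
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
beta, gamma, N = 0.3, 0.1, 1e6

def loss(t_obs, i_obs, t_col):
    data = ((net(t_obs) - i_obs) ** 2).mean()   # fit observed case counts
    t = t_col.clone().requires_grad_(True)      # collocation times
    i = net(t)
    di_dt = torch.autograd.grad(i.sum(), t, create_graph=True)[0]
    s = N - i                                   # crude susceptible estimate
    resid = di_dt - (beta * s * i / N - gamma * i)  # SIR residual for dI/dt
    return data + (resid ** 2).mean()           # data term + physics term
```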
arXiv Detail & Related papers (2022-02-21T18:59:03Z)
- LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time [57.52251547365967]
We propose a method for training a "compressible subspace" of neural networks that contains a fine-grained spectrum of models.
We present results for achieving arbitrarily fine-grained accuracy-efficiency trade-offs at inference time for structured and unstructured sparsity.
Our algorithm extends to quantization at variable bit widths, achieving accuracy on par with individually trained networks.
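The "compressible subspace" idea can be caricatured as training a line segment in weight space, as in the hypothetical sketch below; the paper's actual method layers structured/unstructured sparsity and variable-bit quantization on top of this pattern.

```python
import torch
import torch.nn.functional as F

# Two endpoint weight matrices define a 1-D subspace (a line) of models.
d_in, n_cls = 16, 2
w1 = torch.randn(d_in, n_cls, requires_grad=True)
w2 = torch.randn(d_in, n_cls, requires_grad=True)
opt = torch.optim.SGD([w1, w2], lr=0.1)

def train_step(x, y):
    alpha = torch.rand(1).item()        # random point in the subspace
    w = (1 - alpha) * w1 + alpha * w2   # interpolated model
    loss = F.cross_entropy(x @ w, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# At inference, any fixed alpha in [0, 1] yields a usable model; the paper
# ties the chosen point to a compression level (sparsity or bit width).
```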
arXiv Detail & Related papers (2021-10-08T17:03:34Z)
- Mitigating severe over-parameterization in deep convolutional neural networks through forced feature abstraction and compression with an entropy-based heuristic [7.503338065129185]
We propose an Entropy-Based Convolutional Layer Estimation (EBCLE) heuristic, which is robust and simple.
We present empirical evidence to emphasize the relative effectiveness of broader, yet shallower models trained using the EBCLE.
arXiv Detail & Related papers (2021-06-27T10:34:39Z)
- On Energy-Based Models with Overparametrized Shallow Neural Networks [44.74000986284978]
Energy-based models (EBMs) are a powerful framework for generative modeling.
In this work we focus on shallow neural networks.
We show that models trained in the so-called "active" regime provide a statistical advantage over their associated "lazy" or kernel regime.
arXiv Detail & Related papers (2021-04-15T15:34:58Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
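As a loose illustration of using a Bayesian surrogate to estimate metrics from few labels, the sketch below uses MC dropout as a cheap stand-in for the BNN; everything here (architecture, 30 posterior samples, the accuracy estimator in the comment) is an assumption rather than the ALT-MAS procedure.

```python
import torch
import torch.nn as nn

# MC dropout surrogate: dropout stays active at test time so repeated
# forward passes approximate samples from a posterior over labels.
surrogate = nn.Sequential(nn.Linear(16, 64), nn.ReLU(),
                          nn.Dropout(0.2), nn.Linear(64, 2))

def predictive(x, samples=30):
    surrogate.train()  # keep dropout on
    with torch.no_grad():
        probs = torch.stack([surrogate(x).softmax(-1) for _ in range(samples)])
    return probs.mean(0)  # posterior-averaged class probabilities

# Given predictions m_pred of a model-under-test on unlabeled x, its
# accuracy could be estimated without true labels as, e.g.:
#   acc_est = predictive(x).gather(1, m_pred.unsqueeze(1)).mean()
```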
arXiv Detail & Related papers (2021-04-11T12:14:04Z)