Semi-supervised teacher-student deep neural network for materials
discovery
- URL: http://arxiv.org/abs/2112.06142v1
- Date: Sun, 12 Dec 2021 04:00:21 GMT
- Title: Semi-supervised teacher-student deep neural network for materials
discovery
- Authors: Daniel Gleaves, Edirisuriya M. Dilanga Siriwardane, Yong Zhao, Nihang
Fu, Jianjun Hu
- Abstract summary: We propose a semi-supervised deep neural network (TSDNN) model for high-performance formation energy and synthesizability prediction.
For formation energy based stability screening, our model achieves an absolute 10.3% accuracy improvement compared to the baseline CGCNN regression model.
For synthesizability prediction, our model significantly increases the baseline PU learning's true positive rate from 87.9% to 97.9% while using 1/49 of the model parameters.
- Score: 6.333015476935593
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Data-driven generative machine learning models have recently emerged as one
of the most promising approaches for new materials discovery. While the
generator models can generate millions of candidates, it is critical to train
fast and accurate machine learning models to filter out stable, synthesizable
materials with desired properties. However, such efforts to build supervised
regression or classification screening models have been severely hindered by
the lack of unstable or unsynthesizable samples, which usually are not
collected and deposited in materials databases such as ICSD and Materials
Project (MP). At the same time, there is a significant amount of unlabeled
data available in these databases. Here we propose a semi-supervised deep
neural network (TSDNN) model for high-performance formation energy and
synthesizability prediction, which is achieved via its unique teacher-student
dual network architecture and its effective exploitation of the large amount of
unlabeled data. For formation energy based stability screening, our
semi-supervised classifier achieves an absolute 10.3% accuracy improvement
compared to the baseline CGCNN regression model. For synthesizability
prediction, our model significantly increases the baseline PU learning's true
positive rate from 87.9% to 97.9% while using 1/49 of the model parameters.
To further prove the effectiveness of our models, we combined our
TSDNN-energy and TSDNN-synthesizability models with our CubicGAN generator to
discover novel stable cubic structures. Of the 1000 candidate samples
recommended by our models, 512 have negative formation energies as
validated by our DFT formation energy calculations. Our experimental results
show that our semi-supervised deep neural networks can significantly improve
the screening accuracy in large-scale generative materials design.
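As a rough illustration of the teacher-student pattern the abstract describes, here is a minimal, hypothetical PyTorch sketch: a teacher network pseudo-labels unlabeled samples, a student trains on labeled plus confidently pseudo-labeled data, and the teacher slowly tracks the student. The `MLP` stand-in, the confidence threshold, and the EMA update are illustrative assumptions, not the authors' TSDNN architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative stand-in for the paper's crystal-graph networks (hypothetical).
class MLP(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, x):
        return self.net(x)

teacher, student = MLP(), MLP()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

def train_step(x_lab, y_lab, x_unlab, threshold=0.95, ema=0.99):
    # 1) Teacher pseudo-labels the unlabeled batch.
    with torch.no_grad():
        probs = F.softmax(teacher(x_unlab), dim=1)
        conf, pseudo = probs.max(dim=1)
        keep = conf > threshold  # keep only confident pseudo-labels

    # 2) Student trains on labeled data plus confident pseudo-labels.
    loss = F.cross_entropy(student(x_lab), y_lab)
    if keep.any():
        loss = loss + F.cross_entropy(student(x_unlab[keep]), pseudo[keep])
    opt.zero_grad()
    loss.backward()
    opt.step()

    # 3) Teacher tracks the student via an exponential moving average
    #    (a common choice; the paper's exact update may differ).
    with torch.no_grad():
        for pt, ps in zip(teacher.parameters(), student.parameters()):
            pt.mul_(ema).add_(ps, alpha=1.0 - ema)
    return loss.item()
```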
Related papers
- An Investigation on Machine Learning Predictive Accuracy Improvement and Uncertainty Reduction using VAE-based Data Augmentation [2.517043342442487]
Deep generative learning uses certain ML models to learn the underlying distribution of existing data and generate synthetic samples that resemble the real data.
In this study, our objective is to evaluate the effectiveness of data augmentation using variational autoencoder (VAE)-based deep generative models.
We investigated whether the data augmentation leads to improved accuracy in the predictions of a deep neural network (DNN) model trained using the augmented data.
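A compact sketch of the VAE-based augmentation idea follows, under assumed dimensions and a plain MSE reconstruction term; the paper's exact architecture and loss weighting are not reproduced here.

```python
import torch
import torch.nn as nn

# Minimal VAE over tabular features; all dimensions are made up.
class VAE(nn.Module):
    def __init__(self, d_in=16, d_z=4):
        super().__init__()
        self.enc = nn.Linear(d_in, 32)
        self.mu = nn.Linear(32, d_z)
        self.logvar = nn.Linear(32, d_z)
        self.dec = nn.Sequential(nn.Linear(d_z, 32), nn.ReLU(), nn.Linear(32, d_in))

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    recon = ((x - x_hat) ** 2).sum()                           # reconstruction
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum()  # KL to prior
    return recon + kl

# After training, synthetic samples for augmenting the DNN's training
# set come from decoding draws from the prior, e.g.:
#   z = torch.randn(n_new, 4); x_synth = vae.dec(z)
```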
arXiv Detail & Related papers (2024-10-24T18:15:48Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Downstream Task-Oriented Generative Model Selections on Synthetic Data Training for Fraud Detection Models [9.754400681589845]
In this paper, we approach the downstream task-oriented generative model selections problem in the case of training fraud detection models.
Our investigation supports that, while both Neural Network (NN)-based and Bayesian Network (BN)-based generative models perform well on the synthetic training task under a loose model-interpretability constraint, BN-based generative models outperform NN-based ones when training fraud detection models on synthetic data under a strict model-interpretability constraint.
arXiv Detail & Related papers (2024-01-01T23:33:56Z)
- On the Stability of Iterative Retraining of Generative Models on their own Data [56.153542044045224]
We study the impact of training generative models on mixed datasets.
We first prove the stability of iterative training under the condition that the initial generative models approximate the data distribution well enough.
We empirically validate our theory on both synthetic and natural images by iteratively training normalizing flows and state-of-the-art diffusion models.
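The stabilizing effect of keeping real data in the mix can be seen even in a toy, hypothetical setting where the "generative model" is just a fitted Gaussian, a deliberate simplification of the paper's normalizing-flow and diffusion experiments.

```python
import numpy as np

# Toy model: "training" a generator = fitting a Gaussian to the data.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=5000)   # stand-in for the real dataset

mu, sigma = real.mean(), real.std()      # initial generative model
for step in range(10):
    synth = rng.normal(mu, sigma, size=5000)    # sample from current model
    mixed = np.concatenate([real, synth])       # retrain on real + synthetic
    mu, sigma = mixed.mean(), mixed.std()
    print(step, round(mu, 3), round(sigma, 3))  # stays near (0, 1)
```

Because half of every retraining set is anchored to the original data, the fitted parameters contract toward the true distribution instead of drifting, which is the intuition behind the paper's stability condition.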
arXiv Detail & Related papers (2023-09-30T16:41:04Z)
- An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design [55.62660894625669]
Lithography modeling is a crucial problem in chip design to ensure a chip design mask is manufacturable.
Recent developments in machine learning have provided alternative solutions in replacing the time-consuming lithography simulations with deep neural networks.
We propose a litho-aware data augmentation framework to resolve the dilemma of limited data and improve the machine learning model performance.
arXiv Detail & Related papers (2022-10-27T20:53:39Z)
- Atrial Fibrillation Detection Using Weight-Pruned, Log-Quantised Convolutional Neural Networks [25.160063477248904]
A convolutional neural network model is developed for detecting atrial fibrillation from electrocardiogram signals.
The model demonstrates high performance despite being trained on limited, variable-length input data.
The final model achieved a 91.1% model compression ratio while maintaining a high model accuracy of 91.7%, with less than 1% accuracy loss.
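A hypothetical PyTorch sketch of the two compression steps named in the title, magnitude pruning followed by log (power-of-two) quantization; the pruning ratio and quantization details are assumptions, not the paper's reported configuration.

```python
import torch

def prune_and_log_quantize(w, sparsity=0.9):
    """Magnitude-prune a weight tensor, then snap the surviving weights
    to the nearest power of two (log quantization). Illustrative only."""
    # Magnitude pruning: zero the smallest |w| values.
    k = int(sparsity * w.numel())
    thresh = w.abs().flatten().kthvalue(k).values
    mask = w.abs() > thresh
    # Log quantization: sign * 2^round(log2(|w|)) for the survivors.
    exp = torch.round(torch.log2(w.abs().clamp(min=1e-8)))
    quant = w.sign() * torch.pow(2.0, exp)
    return torch.where(mask, quant, torch.zeros_like(w))

w = torch.randn(64, 64)
wq = prune_and_log_quantize(w)
print(float((wq == 0).float().mean()))  # realized sparsity, ~0.9
```

Power-of-two weights are attractive for this kind of edge deployment because multiplications reduce to bit shifts in hardware.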
arXiv Detail & Related papers (2022-06-14T11:47:04Z)
- EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce EINNs, a new class of physics-informed neural networks crafted for epidemic forecasting.
We investigate how to leverage both the theoretical flexibility provided by mechanistic models and the data-driven expressivity afforded by AI models.
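In spirit, the mechanistic coupling resembles the generic physics-informed-network recipe sketched below, where an assumed SIR-style residual regularizes a network that maps time to infections. This is a simplification for illustration, not the EINN formulation itself; `beta`, `gamma`, and the susceptible approximation are all assumptions.

```python
import torch
import torch.nn as nn

# Network maps time t to predicted infections I(t); beta/gamma assumed known.
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
beta, gamma, N = 0.3, 0.1, 1e6

def loss(t_obs, i_obs, t_col):
    data = ((net(t_obs) - i_obs) ** 2).mean()   # fit observed case counts
    t = t_col.clone().requires_grad_(True)      # collocation times
    i = net(t)
    di_dt = torch.autograd.grad(i.sum(), t, create_graph=True)[0]
    s = N - i                                   # crude susceptible estimate
    resid = di_dt - (beta * s * i / N - gamma * i)  # SIR residual for dI/dt
    return data + (resid ** 2).mean()           # data term + physics term
```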
arXiv Detail & Related papers (2022-02-21T18:59:03Z)
- LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time [57.52251547365967]
We propose a method for training a "compressible subspace" of neural networks that contains a fine-grained spectrum of models.
We present results for achieving arbitrarily fine-grained accuracy-efficiency trade-offs at inference time for structured and unstructured sparsity.
Our algorithm extends to quantization at variable bit widths, achieving accuracy on par with individually trained networks.
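The "compressible subspace" idea can be caricatured as training a line segment in weight space, as in the hypothetical sketch below; the paper's actual method layers structured/unstructured sparsity and variable-bit quantization on top of this pattern.

```python
import torch
import torch.nn.functional as F

# Two endpoint weight matrices define a 1-D subspace (a line) of models.
d_in, n_cls = 16, 2
w1 = torch.randn(d_in, n_cls, requires_grad=True)
w2 = torch.randn(d_in, n_cls, requires_grad=True)
opt = torch.optim.SGD([w1, w2], lr=0.1)

def train_step(x, y):
    alpha = torch.rand(1).item()        # random point in the subspace
    w = (1 - alpha) * w1 + alpha * w2   # interpolated model
    loss = F.cross_entropy(x @ w, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# At inference, any fixed alpha in [0, 1] yields a usable model; the paper
# ties the chosen point to a compression level (sparsity or bit width).
```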
arXiv Detail & Related papers (2021-10-08T17:03:34Z)
- Mitigating severe over-parameterization in deep convolutional neural networks through forced feature abstraction and compression with an entropy-based heuristic [7.503338065129185]
We propose an Entropy-Based Convolutional Layer Estimation (EBCLE) heuristic, which is robust and simple.
We present empirical evidence to emphasize the relative effectiveness of broader, yet shallower models trained using the EBCLE.
arXiv Detail & Related papers (2021-06-27T10:34:39Z)
- On Energy-Based Models with Overparametrized Shallow Neural Networks [44.74000986284978]
Energy-based models (EBMs) are a powerful framework for generative modeling.
In this work we focus on shallow neural networks.
We show that models trained in the so-called "active" regime provide a statistical advantage over their associated "lazy" or kernel regime.
arXiv Detail & Related papers (2021-04-15T15:34:58Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
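As a loose illustration of using a Bayesian surrogate to estimate metrics from few labels, the sketch below uses MC dropout as a cheap stand-in for the BNN; everything here (architecture, 30 posterior samples, the accuracy estimator in the comment) is an assumption rather than the ALT-MAS procedure.

```python
import torch
import torch.nn as nn

# MC dropout surrogate: dropout stays active at test time so repeated
# forward passes approximate samples from a posterior over labels.
surrogate = nn.Sequential(nn.Linear(16, 64), nn.ReLU(),
                          nn.Dropout(0.2), nn.Linear(64, 2))

def predictive(x, samples=30):
    surrogate.train()  # keep dropout on
    with torch.no_grad():
        probs = torch.stack([surrogate(x).softmax(-1) for _ in range(samples)])
    return probs.mean(0)  # posterior-averaged class probabilities

# Given predictions m_pred of a model-under-test on unlabeled x, its
# accuracy could be estimated without true labels as, e.g.:
#   acc_est = predictive(x).gather(1, m_pred.unsqueeze(1)).mean()
```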
arXiv Detail & Related papers (2021-04-11T12:14:04Z)