Interpretable discovery of new semiconductors with machine learning
- URL: http://arxiv.org/abs/2101.04383v1
- Date: Tue, 12 Jan 2021 10:23:16 GMT
- Title: Interpretable discovery of new semiconductors with machine learning
- Authors: Hitarth Choubisa (1), Petar Todorović (1), Joao M. Pina (1), Darshan
H. Parmar (1), Ziliang Li (1), Oleksandr Voznyy (4), Isaac Tamblyn (2,3),
Edward Sargent (1) ((1) Department of Electrical and Computer Engineering,
University of Toronto, Toronto, ON, Canada, (2) National Research Council of
Canada, Ottawa, ON, Canada, (3) Vector Institute for Artificial Intelligence,
Toronto, ON, Canada, (4) Department of Physical and Environmental Sciences,
University of Toronto, Scarborough, ON, Canada)
- Abstract summary: We report an evolutionary algorithm powered search which uses machine-learned surrogate models trained on hybrid functional DFT data benchmarked against experimental bandgaps: Deep Adaptive Regressive Weighted Intelligent Network (DARWIN).
The strategy enables efficient search over the materials space of ~10$^8$ ternaries and 10$^{11}$ quaternaries$^{7}$ for candidates with target properties.
- Score: 10.09604500193621
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Machine learning models of materials$^{1-5}$ accelerate discovery compared to
ab initio methods: deep learning models now reproduce density functional theory
(DFT)-calculated results at one hundred thousandths of the cost of DFT$^{6}$.
To provide guidance in experimental materials synthesis, these need to be
coupled with an accurate yet effective search algorithm and training data
consistent with experimental observations. Here we report an evolutionary
algorithm powered search which uses machine-learned surrogate models trained on
high-throughput hybrid functional DFT data benchmarked against experimental
bandgaps: Deep Adaptive Regressive Weighted Intelligent Network (DARWIN). The
strategy enables efficient search over the materials space of ~10$^8$ ternaries
and 10$^{11}$ quaternaries$^{7}$ for candidates with target properties. It
provides interpretable design rules, such as our finding that the difference in
electronegativity between the halide and the B-site cation is a strong
predictor of ternary structural stability. As an example, when we seek UV
emission, DARWIN predicts K$_2$CuX$_3$ (X = Cl, Br) as a promising materials
family, based on its electronegativity difference. We synthesized these
materials and found them to be stable, direct bandgap UV emitters. The approach also
allows knowledge distillation for use by humans.
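The search loop described in the abstract (an evolutionary algorithm that ranks candidates with a cheap surrogate model instead of running DFT on each one) can be pictured with a minimal sketch. This is not the DARWIN implementation: the element pools, the `toy_surrogate` function, the target value, and every parameter below are hypothetical stand-ins chosen only to make the loop runnable. In a real workflow the surrogate would be a trained regressor, and the number it returns here has no physical meaning.

```python
import random

# Hypothetical element pools for an A-B-X ternary (illustration only).
POOLS = [
    ["Li", "Na", "K", "Rb", "Cs"],  # A site
    ["Cu", "Ag", "Au"],             # B site
    ["F", "Cl", "Br", "I"],         # X site (halide)
]


def toy_surrogate(candidate):
    """Placeholder for a trained ML model.

    Returns an arbitrary deterministic number in [0.0, 4.9]; it is NOT a
    bandgap prediction and carries no physical meaning.
    """
    text = "".join(candidate)
    return (sum(ord(ch) for ch in text) % 50) / 10.0


def fitness(candidate, target):
    """Higher is better: negative distance between surrogate output and target."""
    return -abs(toy_surrogate(candidate) - target)


def random_candidate(rng):
    return tuple(rng.choice(pool) for pool in POOLS)


def crossover(parent_a, parent_b, rng):
    """Take each site from one of the two parents at random."""
    return tuple(rng.choice(pair) for pair in zip(parent_a, parent_b))


def mutate(candidate, rng, rate=0.3):
    """Resample each site from its pool with probability `rate`."""
    return tuple(
        rng.choice(pool) if rng.random() < rate else site
        for site, pool in zip(candidate, POOLS)
    )


def evolve(target, generations=20, pop_size=12, seed=0):
    """Run a simple elitist evolutionary search; return (best, history)."""
    rng = random.Random(seed)
    population = [random_candidate(rng) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, target), reverse=True)
        history.append(fitness(population[0], target))
        survivors = population[: pop_size // 2]  # elitism: keep the top half
        children = [
            mutate(crossover(rng.choice(survivors), rng.choice(survivors), rng), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    best = max(population, key=lambda c: fitness(c, target))
    history.append(fitness(best, target))
    return best, history


if __name__ == "__main__":
    best, history = evolve(target=3.5)
    print(best, toy_surrogate(best), history[-1])
```

Because the top half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. The search only ever calls the surrogate, which is what makes very large candidate spaces affordable to explore.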
Related papers
- Accelerating superconductor discovery through tempered deep learning of
the electron-phonon spectral function [0.0]
We train a deep learning model to predict the electron-phonon spectral function, $\alpha^2F(\omega)$.
We then incorporate domain knowledge of the site-projected phonon density states to impose inductive bias into the model's node attributes and enhance predictions.
This methodological innovation decreases the MAE to 0.18, 29 K, and 28 K, respectively, yielding an MAE of 2.1 K for $T_c$.
arXiv Detail & Related papers (2024-01-29T22:44:28Z)
- Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative
Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z)
- Robust Learning with Progressive Data Expansion Against Spurious
Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Machine Learning Force Fields with Data Cost Aware Training [94.78998399180519]
Machine learning force fields (MLFF) have been proposed to accelerate molecular dynamics (MD) simulation.
Even for the most data-efficient MLFFs, reaching chemical accuracy can require hundreds of frames of force and energy labels.
We propose a multi-stage computational framework -- ASTEROID, which lowers the data cost of MLFFs by leveraging a combination of cheap inaccurate data and expensive accurate data.
arXiv Detail & Related papers (2023-06-05T04:34:54Z)
- ALMERIA: Boosting pairwise molecular contrasts with scalable methods [0.0]
ALMERIA is a tool for estimating compound similarities and activity prediction based on pairwise molecular contrasts.
It has been implemented using scalable software and methods to exploit large volumes of data.
Experiments show state-of-the-art performance for molecular activity prediction.
arXiv Detail & Related papers (2023-04-28T16:27:06Z)
- NeuralNEB -- Neural Networks can find Reaction Paths Fast [7.7365628406567675]
Quantum mechanical methods like Density Functional Theory (DFT) are used with great success alongside efficient search algorithms for studying kinetics of reactive systems.
Machine Learning (ML) models have turned out to be excellent emulators of small molecule DFT calculations and could possibly replace DFT in such tasks.
In this paper we train state-of-the-art equivariant Graph Neural Network (GNN)-based models on around 10,000 elementary reactions from the Transition1x dataset.
arXiv Detail & Related papers (2022-07-20T15:29:45Z)
- An Experimental Design Perspective on Model-Based Reinforcement Learning [73.37942845983417]
In practical applications of RL, it is expensive to observe state transitions from the environment.
We propose an acquisition function that quantifies how much information a state-action pair would provide about the optimal solution to a Markov decision process.
arXiv Detail & Related papers (2021-12-09T23:13:57Z)
- Deep metric learning improves lab of origin prediction of genetically
engineered plasmids [63.05016513788047]
Genetic engineering attribution (GEA) is the ability to make sequence-lab associations.
We propose a method, based on metric learning, that ranks the most likely labs-of-origin.
We are able to extract key signatures in plasmid sequences for particular labs, allowing for an interpretable examination of the model's outputs.
arXiv Detail & Related papers (2021-11-24T16:29:03Z)
- MuCoMiD: A Multitask Convolutional Learning Framework for miRNA-Disease
Association Prediction [0.4061135251278187]
We propose a novel multi-tasking convolution-based approach, which we refer to as MuCoMiD.
MuCoMiD allows automatic feature extraction while incorporating knowledge from 4 heterogeneous biological information sources.
We construct large-scale experiments on standard benchmark datasets as well as our proposed larger independent test sets and case studies.
MuCoMiD shows an improvement of at least 5% in 5-fold CV evaluation on HMDDv2.0 and HMDDv3.0 datasets and at least 49% on larger independent test sets with unseen diseases over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-08T10:01:46Z)
- Brain Image Synthesis with Unsupervised Multivariate Canonical
CSC$\ell_4$Net [122.8907826672382]
We propose to learn dedicated features that cross both inter- and intra-modal variations using a novel CSC$\ell_4$Net.
arXiv Detail & Related papers (2021-03-22T05:19:40Z)
- Active learning based generative design for the discovery of wide
bandgap materials [6.5175897155391755]
We present an active generative inverse design method that combines active learning with a deep variational autoencoder neural network and a generative adversarial deep neural network model.
The application of this method has allowed us to discover new thermodynamically stable materials with high band gap and semiconductors with specified band gap ranges.
Our experiments show that while active learning itself may sample chemically infeasible candidates, these samples help to train effective screening models for filtering out materials with desired properties from the hypothetical materials created by the generative model.
arXiv Detail & Related papers (2021-02-28T20:15:23Z)