Related papers: Interpretable discovery of new semiconductors with machine learning

URL: http://arxiv.org/abs/2101.04383v1
Date: Tue, 12 Jan 2021 10:23:16 GMT
Title: Interpretable discovery of new semiconductors with machine learning
Authors: Hitarth Choubisa (1), Petar Todorović (1), Joao M. Pina (1), Darshan H. Parmar (1), Ziliang Li (1), Oleksandr Voznyy (4), Isaac Tamblyn (2,3), Edward Sargent (1) ((1) Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada, (2) National Research Council of Canada, Ottawa, ON, Canada, (3) Vector Institute for Artificial Intelligence, Toronto, ON, Canada, (4) Department of Physical and Environmental Sciences, University of Toronto, Scarborough, ON, Canada)
Abstract summary: We report an evolutionary algorithm powered search which uses machine-learned surrogate models trained on hybrid functional DFT data benchmarked against experimental bandgaps: Deep Adaptive Regressive Weighted Intelligent Network (DARWIN). The strategy enables efficient search over the materials space of ~10$^8$ ternaries and 10$^{11}$ quaternaries$^{7}$ for candidates with target properties.
Score: 10.09604500193621
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Machine learning models of materials$^{1-5}$ accelerate discovery compared to ab initio methods: deep learning models now reproduce density functional theory (DFT)-calculated results at one hundred thousandths of the cost of DFT$^{6}$. To provide guidance in experimental materials synthesis, these need to be coupled with an accurate yet effective search algorithm and training data consistent with experimental observations. Here we report an evolutionary algorithm powered search which uses machine-learned surrogate models trained on high-throughput hybrid functional DFT data benchmarked against experimental bandgaps: Deep Adaptive Regressive Weighted Intelligent Network (DARWIN). The strategy enables efficient search over the materials space of ~10$^8$ ternaries and 10$^{11}$ quaternaries$^{7}$ for candidates with target properties. It provides interpretable design rules, such as our finding that the difference in electronegativity between the halide and the B-site cation is a strong predictor of ternary structural stability. As an example, when we seek UV emission, DARWIN predicts K$_2$CuX$_3$ (X = Cl, Br) as a promising materials family, based on its electronegativity difference. We synthesized these materials and found them to be stable, direct bandgap UV emitters. The approach also allows knowledge distillation for use by humans.
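
The general idea of an evolutionary search guided by a cheap surrogate model can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the DARWIN implementation: the candidate encoding (one A-site, one B-site, and one halide element), the function names, and in particular `toy_property` are hypothetical. `toy_property` is an arbitrary placeholder built from Pauling electronegativities and stands in for a trained surrogate; it does not predict any physical quantity.

```python
import random

# Pauling electronegativities (standard tabulated values).
ELECTRONEGATIVITY = {
    "Na": 0.93, "K": 0.82, "Rb": 0.82, "Cs": 0.79,   # A-site candidates
    "Cu": 1.90, "Ag": 1.93, "Sn": 1.96, "Pb": 2.33,  # B-site candidates
    "Cl": 3.16, "Br": 2.96, "I": 2.66,               # halide candidates
}
A_SITE = ["Na", "K", "Rb", "Cs"]
B_SITE = ["Cu", "Ag", "Sn", "Pb"]
X_SITE = ["Cl", "Br", "I"]
SITES = [A_SITE, B_SITE, X_SITE]


def toy_property(candidate):
    """Hypothetical stand-in for a trained surrogate model.

    Returns an arbitrary score based on the halide/B-site
    electronegativity difference. It is NOT a physical prediction.
    """
    _, b, x = candidate
    return 2.0 * (ELECTRONEGATIVITY[x] - ELECTRONEGATIVITY[b])


def fitness(candidate, target):
    """Higher is better: negative distance of the toy score to the target."""
    return -abs(toy_property(candidate) - target)


def mutate(candidate, rng):
    """Replace the element on one randomly chosen site."""
    site = rng.randrange(len(SITES))
    child = list(candidate)
    child[site] = rng.choice(SITES[site])
    return tuple(child)


def crossover(parent_a, parent_b, rng):
    """Pick each site's element from one of the two parents at random."""
    return tuple(rng.choice(pair) for pair in zip(parent_a, parent_b))


def evolve(target, generations=30, pop_size=12, seed=0):
    """Elitist evolutionary loop: keep the best half, refill with offspring."""
    rng = random.Random(seed)
    population = [tuple(rng.choice(s) for s in SITES) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, target), reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=lambda c: fitness(c, target))
```

Calling `evolve(target=2.5)` returns a tuple such as `(A, B, X)` whose toy score lies closest to the target among the candidates explored. Because the best half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. In a real workflow the placeholder would be replaced by a model trained on DFT data, and the search space would be far larger than this three-site toy.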

Related papers

Accelerating superconductor discovery through tempered deep learning of the electron-phonon spectral function [0.0]
We train a deep learning model to predict the electron-phonon spectral function, $\alpha^2F(\omega)$. We then incorporate domain knowledge of the site-projected phonon density of states to impose inductive bias into the model's node attributes and enhance predictions. This methodological innovation decreases the MAE to 0.18, 29 K, and 28 K, respectively, yielding an MAE of 2.1 K for $T_c$.
arXiv Detail & Related papers (2024-01-29T22:44:28Z)
A Multi-Grained Symmetric Differential Equation Model for Learning Protein-Ligand Binding Dynamics [73.35846234413611]
In drug discovery, molecular dynamics (MD) simulation provides a powerful tool for predicting binding affinities, estimating transport properties, and exploring pocket sites. We propose NeuralMD, the first machine learning (ML) surrogate that can facilitate numerical MD and provide accurate simulations in protein-ligand binding dynamics. We demonstrate the efficiency and effectiveness of NeuralMD, achieving over 1K$\times$ speedup compared to standard numerical MD simulations.
arXiv Detail & Related papers (2024-01-26T09:35:17Z)
Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models. In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z)
Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features. Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process. We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
Machine Learning Force Fields with Data Cost Aware Training [94.78998399180519]
Machine learning force fields (MLFF) have been proposed to accelerate molecular dynamics (MD) simulation. Even for the most data-efficient MLFFs, reaching chemical accuracy can require hundreds of frames of force and energy labels. We propose a multi-stage computational framework -- ASTEROID, which lowers the data cost of MLFFs by leveraging a combination of cheap inaccurate data and expensive accurate data.
arXiv Detail & Related papers (2023-06-05T04:34:54Z)
ALMERIA: Boosting pairwise molecular contrasts with scalable methods [0.0]
ALMERIA is a tool for estimating compound similarities and activity prediction based on pairwise molecular contrasts. It has been implemented using scalable software and methods to exploit large volumes of data. Experiments show state-of-the-art performance for molecular activity prediction.
arXiv Detail & Related papers (2023-04-28T16:27:06Z)
NeuralNEB -- Neural Networks can find Reaction Paths Fast [7.7365628406567675]
Quantum mechanical methods like Density Functional Theory (DFT) are used with great success alongside efficient search algorithms for studying kinetics of reactive systems. Machine Learning (ML) models have turned out to be excellent emulators of small molecule DFT calculations and could possibly replace DFT in such tasks. In this paper we train state-of-the-art equivariant Graph Neural Network (GNN)-based models on around 10,000 elementary reactions from the Transition1x dataset.
arXiv Detail & Related papers (2022-07-20T15:29:45Z)
Deep metric learning improves lab of origin prediction of genetically engineered plasmids [63.05016513788047]
Genetic engineering attribution (GEA) is the ability to make sequence-lab associations. We propose a method, based on metric learning, that ranks the most likely labs-of-origin. We are able to extract key signatures in plasmid sequences for particular labs, allowing for an interpretable examination of the model's outputs.
arXiv Detail & Related papers (2021-11-24T16:29:03Z)
MuCoMiD: A Multitask Convolutional Learning Framework for miRNA-Disease Association Prediction [0.4061135251278187]
We propose a novel multi-tasking convolution-based approach, which we refer to as MuCoMiD. MuCoMiD allows automatic feature extraction while incorporating knowledge from 4 heterogeneous biological information sources. We construct large-scale experiments on standard benchmark datasets as well as our proposed larger independent test sets and case studies. MuCoMiD shows an improvement of at least 5% in 5-fold CV evaluation on HMDDv2.0 and HMDDv3.0 datasets and at least 49% on larger independent test sets with unseen diseases over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-08T10:01:46Z)
Brain Image Synthesis with Unsupervised Multivariate Canonical CSC$\ell_4$Net [122.8907826672382]
We propose to learn dedicated features that cross both inter- and intra-modal variations using a novel CSC$\ell_4$Net.
arXiv Detail & Related papers (2021-03-22T05:19:40Z)
Active learning based generative design for the discovery of wide bandgap materials [6.5175897155391755]
We present an active generative inverse design method that combines active learning with a deep variational autoencoder neural network and a generative adversarial deep neural network model. The application of this method has allowed us to discover new thermodynamically stable materials with high band gap and semiconductors with specified band gap ranges. Our experiments show that while active learning itself may sample chemically infeasible candidates, these samples help to train effective screening models for filtering out materials with desired properties from the hypothetical materials created by the generative model.
arXiv Detail & Related papers (2021-02-28T20:15:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.