Self-Consistency Training for Density-Functional-Theory Hamiltonian Prediction
- URL: http://arxiv.org/abs/2403.09560v2
- Date: Wed, 5 Jun 2024 07:46:36 GMT
- Title: Self-Consistency Training for Density-Functional-Theory Hamiltonian Prediction
- Authors: He Zhang, Chang Liu, Zun Wang, Xinran Wei, Siyuan Liu, Nanning Zheng, Bin Shao, Tie-Yan Liu,
- Abstract summary: We show that Hamiltonian prediction possesses a self-consistency principle, based on which we propose self-consistency training.
It enables the model to be trained on a large amount of unlabeled data, hence addresses the data scarcity challenge.
It is more efficient than running DFT to generate labels for supervised training, since it amortizes DFT calculation over a set of queries.
- Score: 74.84850523400873
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Predicting the mean-field Hamiltonian matrix in density functional theory is a fundamental formulation to leverage machine learning for solving molecular science problems. Yet, its applicability is limited by insufficient labeled data for training. In this work, we highlight that Hamiltonian prediction possesses a self-consistency principle, based on which we propose self-consistency training, an exact training method that does not require labeled data. It distinguishes the task from predicting other molecular properties by the following benefits: (1) it enables the model to be trained on a large amount of unlabeled data, hence addresses the data scarcity challenge and enhances generalization; (2) it is more efficient than running DFT to generate labels for supervised training, since it amortizes DFT calculation over a set of queries. We empirically demonstrate the better generalization in data-scarce and out-of-distribution scenarios, and the better efficiency over DFT labeling. These benefits push forward the applicability of Hamiltonian prediction to an ever-larger scale.
Related papers
- What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? [83.83230167222852]
We find that a model's generalization behavior can be effectively characterized by a training metric we call pre-memorization train accuracy.
By connecting a model's learning behavior to its generalization, pre-memorization train accuracy can guide targeted improvements to training strategies.
arXiv Detail & Related papers (2024-11-12T09:52:40Z) - Towards a Theoretical Understanding of Memorization in Diffusion Models [76.85077961718875]
Diffusion probabilistic models (DPMs) are being employed as mainstream models for Generative Artificial Intelligence (GenAI)
We provide a theoretical understanding of memorization in both conditional and unconditional DPMs under the assumption of model convergence.
We propose a novel data extraction method named textbfSurrogate condItional Data Extraction (SIDE) that leverages a time-dependent classifier trained on the generated data as a surrogate condition to extract training data from unconditional DPMs.
arXiv Detail & Related papers (2024-10-03T13:17:06Z) - Extracting Training Data from Unconditional Diffusion Models [76.85077961718875]
diffusion probabilistic models (DPMs) are being employed as mainstream models for generative artificial intelligence (AI)
We aim to establish a theoretical understanding of memorization in DPMs with 1) a memorization metric for theoretical analysis, 2) an analysis of conditional memorization with informative and random labels, and 3) two better evaluation metrics for measuring memorization.
Based on the theoretical analysis, we propose a novel data extraction method called textbfSurrogate condItional Data Extraction (SIDE) that leverages a trained on generated data as a surrogate condition to extract training data directly from unconditional diffusion models.
arXiv Detail & Related papers (2024-06-18T16:20:12Z) - Efficient and Generalizable Certified Unlearning: A Hessian-free Recollection Approach [8.875278412741695]
Machine unlearning strives to uphold the data owners' right to be forgotten by enabling models to selectively forget specific data.
We develop an algorithm that achieves near-instantaneous unlearning as it only requires a vector addition operation.
arXiv Detail & Related papers (2024-04-02T07:54:18Z) - Machine Learning Force Fields with Data Cost Aware Training [94.78998399180519]
Machine learning force fields (MLFF) have been proposed to accelerate molecular dynamics (MD) simulation.
Even for the most data-efficient MLFFs, reaching chemical accuracy can require hundreds of frames of force and energy labels.
We propose a multi-stage computational framework -- ASTEROID, which lowers the data cost of MLFFs by leveraging a combination of cheap inaccurate data and expensive accurate data.
arXiv Detail & Related papers (2023-06-05T04:34:54Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples.
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Deep Active Learning for Biased Datasets via Fisher Kernel
Self-Supervision [5.352699766206807]
Active learning (AL) aims to minimize labeling efforts for data-demanding deep neural networks (DNNs)
We propose a low-complexity method for feature density matching using self-supervised Fisher kernel (FK)
Our method outperforms state-of-the-art methods on MNIST, SVHN, and ImageNet classification while requiring only 1/10th of processing.
arXiv Detail & Related papers (2020-03-01T03:56:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.