Characterizing out-of-distribution generalization of neural networks: application to the disordered Su-Schrieffer-Heeger model
- URL: http://arxiv.org/abs/2406.10012v1
- Date: Fri, 14 Jun 2024 13:24:32 GMT
- Title: Characterizing out-of-distribution generalization of neural networks: application to the disordered Su-Schrieffer-Heeger model
- Authors: Kacper Cybiński, Marcin Płodzień, Michał Tomza, Maciej Lewenstein, Alexandre Dauphin, Anna Dawid
- Abstract summary: We show how interpretability methods can increase trust in predictions of a neural network trained to classify quantum phases.
In particular, we show that we can ensure better out-of-distribution generalization in the complex classification problem.
This work is an example of how the systematic use of interpretability methods can improve the performance of NNs in scientific problems.
- Score: 38.79241114146971
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) is a promising tool for the detection of phases of matter. However, ML models are also known for their black-box construction, which hinders understanding of what they learn from the data and makes their application to novel data risky. Moreover, the central challenge of ML is to ensure its good generalization abilities, i.e., good performance on data outside the training set. Here, we show how the informed use of an interpretability method called class activation mapping (CAM), and the analysis of the latent representation of the data with the principal component analysis (PCA) can increase trust in predictions of a neural network (NN) trained to classify quantum phases. In particular, we show that we can ensure better out-of-distribution generalization in the complex classification problem by choosing such an NN that, in the simplified version of the problem, learns a known characteristic of the phase. We show this on an example of the topological Su-Schrieffer-Heeger (SSH) model with and without disorder, which turned out to be surprisingly challenging for NNs trained in a supervised way. This work is an example of how the systematic use of interpretability methods can improve the performance of NNs in scientific problems.
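As a concrete illustration of the two diagnostics named above, here is a minimal sketch of CAM and latent-space PCA for a toy 1D convolutional classifier. Everything below (model, layer sizes, data) is a placeholder for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn
from sklearn.decomposition import PCA

class PhaseClassifier(nn.Module):
    """Toy 1D CNN: conv features -> global average pooling -> linear head."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.fc = nn.Linear(32, n_classes)

    def forward(self, x):
        fmaps = self.features(x)               # (B, 32, L) feature maps
        latent = self.pool(fmaps).squeeze(-1)  # (B, 32) latent representation
        return self.fc(latent), fmaps, latent

model = PhaseClassifier()
x = torch.randn(8, 1, 64)                      # placeholder input batch
logits, fmaps, latent = model(x)

# CAM: weight each feature map by the linear-head weights of the predicted
# class; large values mark the input sites driving the prediction.
c = logits.argmax(dim=1)                       # (B,) predicted classes
w = model.fc.weight[c]                         # (B, 32) per-sample class weights
cam = torch.einsum('bk,bkl->bl', w, fmaps)     # (B, L) site-resolved relevance

# PCA of the latent representation: well-separated clusters per phase suggest
# the NN has learned a meaningful order parameter rather than a shortcut.
coords = PCA(n_components=2).fit_transform(latent.detach().numpy())
```

Inspecting whether the CAM highlights a known physical feature (here, e.g., edge localization in the SSH model) is the paper's criterion for trusting the network out of distribution.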
Related papers
- Problem-informed Graphical Quantum Generative Learning [0.3914676152740143]
We propose a problem-informed quantum circuit Born machine Ansatz for learning the joint probability distribution of random variables.
We compare our model's performance to previous designs, showing it outperforms problem-agnostic circuits.
arXiv Detail & Related papers (2024-05-23T00:29:35Z)
- ECATS: Explainable-by-design concept-based anomaly detection for time series [0.5956301166481089]
We propose ECATS, a concept-based neuro-symbolic architecture where concepts are represented as Signal Temporal Logic (STL) formulae.
We show that our model achieves strong classification performance while ensuring local interpretability.
arXiv Detail & Related papers (2024-05-17T08:12:53Z)
- Active Learning with Fully Bayesian Neural Networks for Discontinuous and Nonstationary Data [0.0]
We introduce fully Bayesian Neural Networks (FBNNs) for active learning tasks in the 'small data' regime.
FBNNs provide reliable predictive distributions, crucial for making informed decisions under uncertainty in the active learning setting.
Here, we assess the suitability and performance of FBNNs with the No-U-Turn Sampler for active learning tasks in the 'small data' regime.
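A minimal sketch of the ingredients (a fully Bayesian NN sampled with the No-U-Turn Sampler), assuming NumPyro; the data, priors, and layer sizes below are illustrative, not the paper's:

```python
import jax.numpy as jnp
from jax import random
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS

def bnn(x, y=None, hidden=8):
    """One-hidden-layer regression BNN with standard-normal weight priors."""
    n_in = x.shape[-1]
    w1 = numpyro.sample("w1", dist.Normal(0.0, 1.0).expand([n_in, hidden]).to_event(2))
    b1 = numpyro.sample("b1", dist.Normal(0.0, 1.0).expand([hidden]).to_event(1))
    w2 = numpyro.sample("w2", dist.Normal(0.0, 1.0).expand([hidden, 1]).to_event(2))
    sigma = numpyro.sample("sigma", dist.HalfNormal(1.0))
    mu = (jnp.tanh(x @ w1 + b1) @ w2).squeeze(-1)
    numpyro.sample("obs", dist.Normal(mu, sigma), obs=y)

x = random.normal(random.PRNGKey(0), (20, 2))   # toy 'small data' set
y = jnp.sin(x[:, 0]) - 0.5 * x[:, 1]
mcmc = MCMC(NUTS(bnn), num_warmup=500, num_samples=500)
mcmc.run(random.PRNGKey(1), x, y)
# The spread of the posterior predictive over mcmc.get_samples() is the
# uncertainty an active learner uses to choose the next point to label.
```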
arXiv Detail & Related papers (2024-05-16T05:20:47Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics.
They exploit higher-order statistics only later during training.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
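An illustrative way to probe this bias (not the paper's protocol): compare an early-training network against linear discriminant analysis, which uses only class means and covariances. High early agreement that decays with further training is the DSB signature.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))
y = (X[:, 0] + 0.5 * X[:, 1] ** 3 > 0).astype(int)   # label has higher-order structure

lda = LinearDiscriminantAnalysis().fit(X, y)          # low-order statistics only
early_net = MLPClassifier(hidden_layer_sizes=(64,), max_iter=5).fit(X, y)  # few epochs

# Agreement with the low-order classifier should be high early in training
# and drop once the network starts exploiting higher-order statistics.
agreement = np.mean(early_net.predict(X) == lda.predict(X))
print(f"early agreement with low-order classifier: {agreement:.2f}")
```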
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
- Look beyond labels: Incorporating functional summary information in Bayesian neural networks [11.874130244353253]
We present a simple approach to incorporate summary information about the predicted probability.
The available summary information is incorporated as augmented data and modeled with a Dirichlet process.
We show how the method can inform the model about task difficulty or class imbalance.
arXiv Detail & Related papers (2022-07-04T07:06:45Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
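One simple diagnostic for such emergent locality (an illustrative choice of this note, not necessarily the paper's measure) is the inverse participation ratio (IPR) of each hidden unit's weight vector:

```python
import numpy as np

def ipr(w):
    """Inverse participation ratio of a weight row: ~1/d for spread-out
    weights, ~1 for weights concentrated on a single input site."""
    p = w ** 2 / np.sum(w ** 2)
    return np.sum(p ** 2)

# W: (hidden, d) first-layer weight matrix of a trained fully connected net.
W = np.random.randn(32, 100)                 # placeholder untrained weights
print("mean IPR:", np.mean([ipr(row) for row in W]))
# A mean IPR that grows during training signals convolution-like local
# receptive fields emerging from non-Gaussian input structure.
```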
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework called the Feed-Forward Neural-Symbolic Learner (FF-NSL).
FF-NSL integrates state-of-the-art ILP systems based on Answer Set semantics with neural networks to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z)
- NSL: Hybrid Interpretable Learning From Noisy Raw Data [66.15862011405882]
This paper introduces a hybrid neural-symbolic learning framework, called NSL, that learns interpretable rules from labelled unstructured data.
NSL combines pre-trained neural networks for feature extraction with FastLAS, a state-of-the-art ILP system for rule learning under the answer set semantics.
We demonstrate that NSL is able to learn robust rules from MNIST data and achieve comparable or superior accuracy when compared to neural network and random forest baselines.
arXiv Detail & Related papers (2020-12-09T13:02:44Z)
- Enhancing Mixup-based Semi-Supervised Learning with Explicit Lipschitz Regularization [5.848916882288327]
Semi-supervised learning (SSL) mitigates the scarcity of labelled data by exploiting the behavior of the neural function on large amounts of unlabeled data.
A successful example is the adoption of the mixup strategy in SSL, which enforces the global smoothness of the neural function.
We propose that mixup improves the smoothness of the neural function by bounding the Lipschitz constant of the gradient function of the neural networks.
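For reference, mixup itself is only a few lines; a minimal PyTorch-style sketch with illustrative hyperparameters:

```python
import numpy as np
import torch

def mixup(x, y_onehot, alpha=0.2):
    """Convexly combine a batch with a shuffled copy of itself."""
    lam = np.random.beta(alpha, alpha)
    idx = torch.randperm(x.size(0))
    x_mix = lam * x + (1 - lam) * x[idx]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[idx]
    return x_mix, y_mix

# Training on (x_mix, y_mix) enforces near-linear behavior between samples;
# the paper's point is that this implicitly bounds the Lipschitz constant of
# the network's gradient, which explicit regularization can tighten further.
```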
arXiv Detail & Related papers (2020-09-23T23:19:19Z)
- Phase Detection with Neural Networks: Interpreting the Black Box [58.720142291102135]
Neural networks (NNs) usually hinder any insight into the reasoning behind their predictions.
We demonstrate how influence functions can unravel the black box of an NN trained to predict the phases of the one-dimensional extended spinless Fermi-Hubbard model at half-filling.
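The quantity behind this analysis is the classical influence function: the effect of up-weighting a training point z on the test loss is -grad_test^T H^{-1} grad_z, with H the Hessian of the training loss. A minimal sketch for models small enough to invert H explicitly (illustrative, not the paper's implementation):

```python
import torch

def flat_grad(loss, params, create_graph=False):
    grads = torch.autograd.grad(loss, params, create_graph=create_graph)
    return torch.cat([g.reshape(-1) for g in grads])

def influence(model, loss_fn, x_train, y_train, x_z, y_z, x_test, y_test, damping=1e-3):
    """Influence of training point (x_z, y_z) on the loss at (x_test, y_test):
    -grad_test^T H^{-1} grad_z, with H the (damped) training-loss Hessian."""
    params = [p for p in model.parameters() if p.requires_grad]
    # Explicit Hessian via row-by-row differentiation of the gradient;
    # feasible only for tiny models (use LiSSA/CG approximations otherwise).
    g = flat_grad(loss_fn(model(x_train), y_train), params, create_graph=True)
    H = torch.stack([flat_grad(gi, params, create_graph=True) for gi in g]).detach()
    H += damping * torch.eye(H.size(0))
    g_test = flat_grad(loss_fn(model(x_test), y_test), params).detach()
    g_z = flat_grad(loss_fn(model(x_z), y_z), params).detach()
    return -(g_test @ torch.linalg.solve(H, g_z))
```

Ranking training samples by this score reveals which physical configurations a phase classifier actually relies on.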
arXiv Detail & Related papers (2020-04-09T17:45:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.