Towards Novel Insights in Lattice Field Theory with Explainable Machine
Learning
- URL: http://arxiv.org/abs/2003.01504v2
- Date: Mon, 18 May 2020 14:04:18 GMT
- Title: Towards Novel Insights in Lattice Field Theory with Explainable Machine
Learning
- Authors: Stefan Bluecher, Lukas Kades, Jan M. Pawlowski, Nils Strodthoff,
Julian M. Urban
- Abstract summary: We propose representation learning in combination with interpretability methods as a framework for the identification of observables.
The approach is put to work in the context of a scalar Yukawa model in (2+1)d.
Based on our results, we argue that, due to their broad applicability, attribution methods such as LRP could prove useful and versatile tools in our search for new physical insights.
- Score: 1.5854412882298003
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning has the potential to aid our understanding of phase
structures in lattice quantum field theories through the statistical analysis
of Monte Carlo samples. Available algorithms, in particular those based on deep
learning, often demonstrate remarkable performance in the search for previously
unidentified features, but tend to lack transparency if applied naively. To
address these shortcomings, we propose representation learning in combination
with interpretability methods as a framework for the identification of
observables. More specifically, we investigate action parameter regression as a
pretext task while using layer-wise relevance propagation (LRP) to identify the
most important observables depending on the location in the phase diagram. The
approach is put to work in the context of a scalar Yukawa model in (2+1)d.
First, we investigate a multilayer perceptron to determine an importance
hierarchy of several predefined, standard observables. The method is then
applied directly to the raw field configurations using a convolutional network,
demonstrating the ability to reconstruct all order parameters from the learned
filter weights. Based on our results, we argue that, due to their broad
applicability, attribution methods such as LRP could prove useful and
versatile tools in our search for new physical insights. In the case of the
Yukawa model, LRP facilitates the construction of an observable that
characterises the symmetric phase.
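The abstract's first study pairs a multilayer perceptron, trained to regress an action parameter from predefined observables, with LRP relevance scores over those inputs. Below is a minimal sketch of the attribution step using the standard LRP epsilon rule; the observable names, network shape, target parameter, and epsilon value are illustrative assumptions, not the paper's actual configuration.

```python
# Hedged sketch: LRP (epsilon rule) for an MLP that regresses an action
# parameter from a vector of lattice observables. All names and sizes below
# are illustrative assumptions, not taken from the paper.
import torch
import torch.nn as nn

OBSERVABLES = ["|phi|", "phi^2", "phi^4", "condensate"]  # hypothetical inputs

model = nn.Sequential(  # small regressor: observables -> action parameter
    nn.Linear(len(OBSERVABLES), 16), nn.ReLU(),
    nn.Linear(16, 16), nn.ReLU(),
    nn.Linear(16, 1),
)

def lrp_epsilon(model, x, eps=1e-6):
    """Redistribute the scalar regression output onto the input observables."""
    with torch.no_grad():
        layers = list(model)
        activations = [x]
        for layer in layers:  # forward pass, caching each layer's input
            activations.append(layer(activations[-1]))
        relevance = activations[-1]  # network output = total relevance
        for layer, a in zip(reversed(layers), reversed(activations[:-1])):
            if isinstance(layer, nn.Linear):
                z = layer(a)  # pre-activations
                z = z + eps * torch.where(z >= 0, torch.ones_like(z),
                                          -torch.ones_like(z))  # stabiliser
                s = relevance / z  # relevance per unit pre-activation
                relevance = a * (s @ layer.weight)  # redistribute to inputs
            # ReLU layers pass relevance through unchanged under this rule
    return relevance

# Toy usage: per-sample relevance scores, one entry per observable.
x = torch.randn(8, len(OBSERVABLES))  # stand-in for measured observables
print(lrp_epsilon(model, x))
```

Averaging these per-observable relevances over Monte Carlo samples drawn from a given region of the phase diagram would then yield an importance hierarchy of the kind the abstract describes.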
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Cross-Entropy Is All You Need To Invert the Data Generating Process [29.94396019742267]
Empirical phenomena suggest that supervised models can learn interpretable factors of variation in a linear fashion.
Recent advances in self-supervised learning have shown that these methods can recover latent structures by inverting the data generating process.
We prove that even in standard classification tasks, models learn representations of ground-truth factors of variation up to a linear transformation.
arXiv Detail & Related papers (2024-10-29T09:03:57Z)
- Characterizing out-of-distribution generalization of neural networks: application to the disordered Su-Schrieffer-Heeger model [38.79241114146971]
We show how interpretability methods can increase trust in predictions of a neural network trained to classify quantum phases.
In particular, we show that we can ensure better out-of-distribution generalization in this complex classification problem.
This work is an example of how the systematic use of interpretability methods can improve the performance of NNs in scientific problems.
arXiv Detail & Related papers (2024-06-14T13:24:32Z)
- Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
- Beyond Cuts in Small Signal Scenarios -- Enhanced Sneutrino Detectability Using Machine Learning [0.0]
We use two different models, XGBoost and a deep neural network, to exploit correlations between observables.
We consider different methods to analyze the models' output, finding that a template fit generally performs better than a simple cut.
arXiv Detail & Related papers (2021-08-06T13:48:19Z)
- Prequential MDL for Causal Structure Learning with Neural Networks [9.669269791955012]
We show that the prequential minimum description length principle can be used to derive a practical scoring function for Bayesian networks.
We obtain plausible and parsimonious graph structures without relying on sparsity inducing priors or other regularizers which must be tuned.
We discuss how the prequential score relates to recent work that infers causal structure from the speed of adaptation when the observations come from a source undergoing distributional shift.
arXiv Detail & Related papers (2021-07-02T22:35:21Z)
- Transforming Feature Space to Interpret Machine Learning Models [91.62936410696409]
This contribution proposes a novel approach that interprets machine-learning models through the lens of feature space transformations.
It can be used to enhance unconditional as well as conditional post-hoc diagnostic tools.
A case study on remote-sensing landcover classification with 46 features is used to demonstrate the potential of the proposed approach.
arXiv Detail & Related papers (2021-04-09T10:48:11Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
- Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms.
We also demonstrate this transition empirically for more complex matrix factorization models and multilayer non-linear networks.
arXiv Detail & Related papers (2020-02-20T15:43:02Z)