Comparing Generative Models with the New Physics Learning Machine
- URL: http://arxiv.org/abs/2508.02275v1
- Date: Mon, 04 Aug 2025 10:42:52 GMT
- Title: Comparing Generative Models with the New Physics Learning Machine
- Authors: Samuele Grossi, Marco Letizia, Riccardo Torre
- Abstract summary: In large-scale and high-dimensional regimes, machine learning offers a set of tools to push beyond the limitations of standard statistical techniques. We put this claim to the test by comparing the New Physics Learning Machine, a proposal from the high-energy physics literature for performing a classification-based two-sample test, against a number of alternative approaches. We highlight the efficiency tradeoffs of the method and the computational costs that come from adopting learning-based approaches.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rise of generative models for scientific research calls for the development of new methods to evaluate their fidelity. A natural framework for addressing this problem is two-sample hypothesis testing, namely the task of determining whether two data sets are drawn from the same distribution. In large-scale and high-dimensional regimes, machine learning offers a set of tools to push beyond the limitations of standard statistical techniques. In this work, we put this claim to the test by comparing a recent proposal from the high-energy physics literature, the New Physics Learning Machine, to perform a classification-based two-sample test against a number of alternative approaches, following the framework presented in Grossi et al. (2025). We highlight the efficiency tradeoffs of the method and the computational costs that come from adopting learning-based approaches. Finally, we discuss the advantages of the different methods for different use cases.
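
To make the idea concrete, the sketch below shows a generic classifier-based two-sample test (C2ST): a classifier is trained to distinguish a reference sample from a generated one, and the resulting accuracy is calibrated with a permutation test. This is a minimal illustration of the family of approaches discussed in the abstract, not the NPLM algorithm or the code evaluated in the paper; the use of scikit-learn's LogisticRegression, cross-validated accuracy as the test statistic, and the toy Gaussian data are all illustrative assumptions.

```python
# Minimal classifier-based two-sample test (C2ST) sketch.
# NOT the NPLM implementation: a generic illustration of recasting
# "are these two samples drawn from the same distribution?" as a
# classification problem, with a permutation test for calibration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def c2st_statistic(x, y):
    """Cross-validated accuracy of a classifier separating samples x and y."""
    data = np.vstack([x, y])
    labels = np.concatenate([np.zeros(len(x)), np.ones(len(y))])
    clf = LogisticRegression(max_iter=1000)
    # Accuracy stays near 0.5 if the two samples are indistinguishable.
    return cross_val_score(clf, data, labels, cv=5, scoring="accuracy").mean()


def c2st_pvalue(x, y, n_perm=200, seed=0):
    """Permutation p-value for the observed classification accuracy."""
    rng = np.random.default_rng(seed)
    observed = c2st_statistic(x, y)
    pooled = np.vstack([x, y])
    null_stats = []
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        x_p, y_p = pooled[perm[: len(x)]], pooled[perm[len(x):]]
        null_stats.append(c2st_statistic(x_p, y_p))
    return float(np.mean(np.array(null_stats) >= observed))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.normal(0.0, 1.0, size=(500, 5))  # stand-in for "true" data
    generated = rng.normal(0.1, 1.0, size=(500, 5))  # slightly shifted "generated" data
    print("accuracy:", c2st_statistic(reference, generated))
    print("p-value :", c2st_pvalue(reference, generated, n_perm=100))
```

Under the null hypothesis that both samples come from the same distribution, the classifier should do no better than chance, so accuracies well above 0.5 (equivalently, small permutation p-values) flag a discrepancy between the generative model and the reference data.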
Related papers
- Enhancing binary classification: A new stacking method via leveraging computational geometry [5.906199156511947]
This paper introduces a novel approach that integrates computational geometry techniques, specifically solving the maximum weighted rectangle problem, to develop a new meta-model for binary classification.
Our method is evaluated on multiple open datasets, with statistical analysis showing its stability and demonstrating improvements in accuracy.
Our method is highly applicable not only in stacking ensemble learning but also in various real-world applications, such as hospital health evaluation scoring and bank credit scoring systems.
arXiv Detail & Related papers (2024-10-30T06:11:08Z)
- Globally-Optimal Greedy Experiment Selection for Active Sequential Estimation [1.1530723302736279]
We study the problem of active sequential estimation, which involves adaptively selecting experiments for sequentially collected data.
The goal is to design experiment selection rules for more accurate model estimation.
We propose a class of greedy experiment selection methods and provide statistical analysis for the maximum likelihood estimator.
arXiv Detail & Related papers (2024-02-13T17:09:29Z)
- A step towards the integration of machine learning and classic model-based survey methods [0.0]
The usage of machine learning methods in traditional surveys is still very limited.
We propose a predictor supported by these algorithms, which can be used to predict any population or subpopulation characteristics.
arXiv Detail & Related papers (2024-02-12T09:43:17Z)
- Dual Student Networks for Data-Free Model Stealing [79.67498803845059]
Two main challenges are estimating gradients of the target model without access to its parameters, and generating a diverse set of training samples.
We propose a Dual Student method where two students are symmetrically trained in order to provide the generator a criterion to generate samples that the two students disagree on.
We show that our new optimization framework provides more accurate gradient estimation of the target model and better accuracies on benchmark classification datasets.
arXiv Detail & Related papers (2023-09-18T18:11:31Z)
- Learning new physics efficiently with nonparametric methods [11.970219534238444]
We present a machine learning approach for model-independent new physics searches.
The corresponding algorithm is powered by recent large-scale implementations of kernel methods.
We show that our approach has dramatic advantages compared to neural network implementations in terms of training times and computational resources.
arXiv Detail & Related papers (2022-04-05T16:17:59Z)
- A Typology for Exploring the Mitigation of Shortcut Behavior [29.38025128165229]
We provide a unification of various XIL methods into a single typology by establishing a common set of basic modules.
In our evaluations, all methods prove to revise a model successfully.
However, we found remarkable differences in individual benchmark tasks, revealing valuable application-relevant aspects.
arXiv Detail & Related papers (2022-03-04T14:16:50Z)
- Team Cogitat at NeurIPS 2021: Benchmarks for EEG Transfer Learning Competition [55.34407717373643]
Building subject-independent deep learning models for EEG decoding faces the challenge of strong covariate shift.
Our approach is to explicitly align feature distributions at various layers of the deep learning model.
The methodology won first place in the 2021 Benchmarks in EEG Transfer Learning competition, hosted at the NeurIPS conference.
arXiv Detail & Related papers (2022-02-01T11:11:08Z)
- Predictive machine learning for prescriptive applications: a coupled training-validating approach [77.34726150561087]
We propose a new method for training predictive machine learning models for prescriptive applications.
This approach is based on tweaking the validation step in the standard training-validating-testing scheme.
Several experiments with synthetic data demonstrate promising results in reducing the prescription costs in both deterministic and real models.
arXiv Detail & Related papers (2021-10-22T15:03:20Z)
- MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
- Learning the Truth From Only One Side of the Story [58.65439277460011]
We focus on generalized linear models and show that without adjusting for this sampling bias, the model may converge suboptimally or even fail to converge to the optimal solution.
We propose an adaptive approach that comes with theoretical guarantees and show that it outperforms several existing methods empirically.
arXiv Detail & Related papers (2020-06-08T18:20:28Z)
- Real-Time Model Calibration with Deep Reinforcement Learning [4.707841918805165]
We propose a novel framework for inference of model parameters based on reinforcement learning.
The proposed methodology is demonstrated and evaluated on two model-based diagnostics test cases.
arXiv Detail & Related papers (2020-06-07T00:11:42Z)
- Marginal likelihood computation for model selection and hypothesis testing: an extensive review [66.37504201165159]
This article provides a comprehensive study of the state-of-the-art of the topic.
We highlight limitations, benefits, connections and differences among the different techniques.
Problems and possible solutions with the use of improper priors are also described.
arXiv Detail & Related papers (2020-05-17T18:31:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.