Data-driven Approach for Interpolation of Sparse Data
- URL: http://arxiv.org/abs/2505.01473v1
- Date: Fri, 02 May 2025 13:17:45 GMT
- Title: Data-driven Approach for Interpolation of Sparse Data
- Authors: R. F. Ferguson, D. G. Ireland, B. McKinnon
- Abstract summary: Studies of hadron resonances and their properties are limited by the accuracy and consistency of measured datasets. We have used Gaussian Processes (GPs) to build interpolated datasets, including quantification of uncertainties. GPs provide a robust, model-independent method for interpolating typical datasets used in hadron resonance studies.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Studies of hadron resonances and their properties are limited by the accuracy and consistency of measured datasets, which can originate from many different experiments. We have used Gaussian Processes (GPs) to build interpolated datasets, including quantification of uncertainties, so that data from different sources can be used in model fitting without the need for arbitrary weighting. GPs predict values and uncertainties of observables at any kinematic point. Bayesian inference is used to optimise the hyperparameters of the GP model. We demonstrate that the GP successfully interpolates data with quantified uncertainties by comparison with generated pseudodata. We also show that this methodology can be used to investigate the consistency of data from different sources. GPs provide a robust, model-independent method for interpolating typical datasets used in hadron resonance studies, removing the limitations of arbitrary weighting in sparse datasets.
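The workflow the abstract describes (fit a GP to sparse, noisy measurements, optimise the kernel hyperparameters, then predict both the value and the uncertainty of an observable at arbitrary points) can be sketched with off-the-shelf tools. The following is an illustrative sketch using scikit-learn, not the authors' code; note that scikit-learn tunes the hyperparameters by maximising the marginal likelihood rather than by the full Bayesian inference used in the paper, and the sine-function pseudodata is an invented stand-in for real measurements.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Sparse, noisy pseudodata standing in for measurements of a smooth observable.
x_train = rng.uniform(0.0, 10.0, size=12).reshape(-1, 1)
y_train = np.sin(x_train).ravel() + rng.normal(0.0, 0.1, size=12)

# RBF kernel for smoothness plus a white-noise term for measurement error;
# hyperparameters are optimised by maximising the marginal likelihood.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5,
                              random_state=0)
gp.fit(x_train, y_train)

# Predict the observable and its uncertainty at any point in the range.
x_test = np.linspace(0.0, 10.0, 50).reshape(-1, 1)
mean, std = gp.predict(x_test, return_std=True)
```

The predictive standard deviation grows away from the training points, which is what allows datasets interpolated this way to enter a fit with honest, point-by-point uncertainties instead of ad hoc weights.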
Related papers
- Compactly-supported nonstationary kernels for computing exact Gaussian processes on big data [2.8377382540923004]
The Gaussian process (GP) is a widely used machine learning method with implicit uncertainty characterization. Traditional implementations of GPs involve stationary kernels that limit their flexibility. We derive an alternative kernel that can discover and encode both sparsity and nonstationarity.
arXiv Detail & Related papers (2024-11-07T20:07:21Z)
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Biases in Inverse Ising Estimates of Near-Critical Behaviour [0.0]
Inverse inference allows pairwise interactions to be reconstructed from empirical correlations.
We show that estimators used for this inference, such as Pseudo-likelihood (PLM), are biased.
Data-driven methods are explored and applied to a functional magnetic resonance imaging (fMRI) dataset from neuroscience.
arXiv Detail & Related papers (2023-01-13T14:01:43Z)
- Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)
- Aggregated Multi-output Gaussian Processes with Knowledge Transfer Across Domains [39.25639417233822]
This article offers a multi-output Gaussian process (MoGP) model that infers functions for attributes using multiple aggregate datasets of respective granularities.
Experiments demonstrate that the proposed model outperforms in the task of refining coarse-grained aggregate data on real-world datasets.
arXiv Detail & Related papers (2022-06-24T08:07:20Z)
- TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
Estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z)
- A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data [71.9573352891936]
This paper tackles the problem of missing data imputation for noisy and non-Gaussian data.
A new EM algorithm is investigated for mixtures of elliptical distributions with the property of handling potential missing data.
Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data.
arXiv Detail & Related papers (2022-01-28T10:01:37Z)
- Multimodal Data Fusion in High-Dimensional Heterogeneous Datasets via Generative Models [16.436293069942312]
We are interested in learning probabilistic generative models from high-dimensional heterogeneous data in an unsupervised fashion.
We propose a general framework that combines disparate data types through the exponential family of distributions.
The proposed algorithm is presented in detail for the commonly encountered heterogeneous datasets with real-valued (Gaussian) and categorical (multinomial) features.
arXiv Detail & Related papers (2021-08-27T18:10:31Z)
- Learning to discover: expressive Gaussian mixture models for multi-dimensional simulation and parameter inference in the physical sciences [0.0]
We show that density models describing multiple observables may be created using an auto-regressive Gaussian mixture model.
The model is designed to capture how observable spectra are deformed by hypothesis variations.
It may be used as a statistical model for scientific discovery in interpreting experimental observations.
arXiv Detail & Related papers (2021-08-25T21:27:29Z)
- Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics [61.49826776409194]
We analyze a corpus of models made publicly available for a contest to predict the generalization accuracy of neural network (NN) models.
We identify what amounts to a Simpson's paradox: "scale" metrics perform well overall but perform poorly on sub-partitions of the data.
We present two novel shape metrics, one data-independent, and the other data-dependent, which can predict trends in the test accuracy of a series of NNs.
arXiv Detail & Related papers (2021-06-01T19:19:49Z)
- Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.