Learning from Summarized Data: Gaussian Process Regression with Sample Quasi-Likelihood
- URL: http://arxiv.org/abs/2412.17455v2
- Date: Mon, 10 Feb 2025 23:30:12 GMT
- Title: Learning from Summarized Data: Gaussian Process Regression with Sample Quasi-Likelihood
- Authors: Yuta Shikuri,
- Abstract summary: This study tackles learning and inference using only summarized data within the framework of Gaussian process regression.
We introduce the concept of sample quasi-likelihood, which facilitates learning and inference using only summarized data.
- Score: 0.0
- License:
- Abstract: Gaussian process regression is a powerful Bayesian nonlinear regression method. Recent research has enabled the capture of many types of observations using non-Gaussian likelihoods. To deal with various tasks in spatial modeling, we benefit from this development. Difficulties still arise when we can only access summarized data consisting of representative features, summary statistics, and data point counts. Such situations frequently occur primarily due to concerns about confidentiality and management costs associated with spatial data. This study tackles learning and inference using only summarized data within the framework of Gaussian process regression. To address this challenge, we analyze the approximation errors in the marginal likelihood and posterior distribution that arise from utilizing representative features. We also introduce the concept of sample quasi-likelihood, which facilitates learning and inference using only summarized data. Non-Gaussian likelihoods satisfying certain assumptions can be captured by specifying a variance function that characterizes a sample quasi-likelihood function. Theoretical and experimental results demonstrate that the approximation performance is influenced by the granularity of summarized data relative to the length scale of covariance functions. Experiments on a real-world dataset highlight the practicality of our method for spatial modeling.
Related papers
- Task Shift: From Classification to Regression in Overparameterized Linear Models [5.030445392527011]
We investigate a phenomenon where latent knowledge is transferred to a more difficult task under a similar data distribution.
We show that while minimum-norm interpolators for classification cannot transfer to regression a priori, they experience surprisingly structured attenuation which enables successful task shift with limited additional data.
Our results show that while minimum-norm interpolators for classification cannot transfer to regression a priori, they experience surprisingly structured attenuation which enables successful task shift with limited additional data.
arXiv Detail & Related papers (2025-02-18T21:16:01Z) - Conformal inference for regression on Riemannian Manifolds [49.7719149179179]
We investigate prediction sets for regression scenarios when the response variable, denoted by $Y$, resides in a manifold, and the covariable, denoted by X, lies in Euclidean space.
We prove the almost sure convergence of the empirical version of these regions on the manifold to their population counterparts.
arXiv Detail & Related papers (2023-10-12T10:56:25Z) - Approximating Counterfactual Bounds while Fusing Observational, Biased
and Randomised Data Sources [64.96984404868411]
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies.
We show that the likelihood of the available data has no local maxima.
We then show how the same approach can address the general case of multiple datasets.
arXiv Detail & Related papers (2023-07-31T11:28:24Z) - Engression: Extrapolation through the Lens of Distributional Regression [2.519266955671697]
We propose a neural network-based distributional regression methodology called engression'
An engression model is generative in the sense that we can sample from the fitted conditional distribution and is also suitable for high-dimensional outcomes.
We show that engression can successfully perform extrapolation under some assumptions such as monotonicity, whereas traditional regression approaches such as least-squares or quantile regression fall short under the same assumptions.
arXiv Detail & Related papers (2023-07-03T08:19:00Z) - Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z) - Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z) - Equivariance Allows Handling Multiple Nuisance Variables When Analyzing
Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how bringing recent results on equivariant representation learning instantiated on structured spaces together with simple use of classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z) - Efficient Multidimensional Functional Data Analysis Using Marginal
Product Basis Systems [2.4554686192257424]
We propose a framework for learning continuous representations from a sample of multidimensional functional data.
We show that the resulting estimation problem can be solved efficiently by the tensor decomposition.
We conclude with a real data application in neuroimaging.
arXiv Detail & Related papers (2021-07-30T16:02:15Z) - Generalized Multi-Output Gaussian Process Censored Regression [7.111443975103331]
We introduce a heteroscedastic multi-output Gaussian process model which combines the non-parametric flexibility of GPs with the ability to leverage information from correlated outputs under input-dependent noise conditions.
Results show how the added flexibility allows our model to better estimate the underlying non-censored (i.e. true) process under potentially complex censoring dynamics.
arXiv Detail & Related papers (2020-09-10T12:46:29Z) - Efficiently Sampling Functions from Gaussian Process Posteriors [76.94808614373609]
We propose an easy-to-use and general-purpose approach for fast posterior sampling.
We demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
arXiv Detail & Related papers (2020-02-21T14:03:16Z) - Linear predictor on linearly-generated data with missing values: non
consistency and solutions [0.0]
We study the seemingly-simple case where the target to predict is a linear function of the fully-observed data.
We show that, in the presence of missing values, the optimal predictor may not be linear.
arXiv Detail & Related papers (2020-02-03T11:49:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.