Estimation of multiple mean vectors in high dimension
- URL: http://arxiv.org/abs/2403.15038v1
- Date: Fri, 22 Mar 2024 08:42:41 GMT
- Title: Estimation of multiple mean vectors in high dimension
- Authors: Gilles Blanchard, Jean-Baptiste Fermanian, Hannah Marienwald
- Abstract summary: We endeavour to estimate numerous multi-dimensional means of various probability distributions on a common space based on independent samples.
Our approach involves forming estimators through convex combinations of empirical means derived from these samples.
- Score: 4.2466572124753
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We endeavour to estimate numerous multi-dimensional means of various probability distributions on a common space based on independent samples. Our approach involves forming estimators through convex combinations of empirical means derived from these samples. We introduce two strategies to find appropriate data-dependent convex combination weights: a first one employing a testing procedure to identify neighbouring means with low variance, which results in a closed-form plug-in formula for the weights, and a second one determining weights via minimization of an upper confidence bound on the quadratic risk. Through theoretical analysis, we evaluate the improvement in quadratic risk offered by our methods compared to the empirical means. Our analysis focuses on a dimensional asymptotics perspective, showing that our methods asymptotically approach an oracle (minimax) improvement as the effective dimension of the data increases. We demonstrate the efficacy of our methods in estimating multiple kernel mean embeddings through experiments on both simulated and real-world datasets.
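The abstract's core idea can be sketched in a few lines. The snippet below is a simplified illustration, not the authors' exact procedure: each bag's mean is re-estimated as a convex combination of all empirical means, pooling a neighbouring bag only when a crude distance test (threshold `tau`, a hypothetical parameter chosen here for illustration) suggests its mean is close relative to the noise level, with inverse-variance convex weights over the selected neighbours.

```python
import numpy as np

def combined_means(samples, tau=3.0):
    """Toy sketch of mean estimation via convex combinations of
    empirical means (simplified stand-in for the paper's test-based
    weighting; `tau` is an illustrative threshold)."""
    means = [s.mean(axis=0) for s in samples]
    # estimated squared-norm variance of each empirical mean: tr(Cov)/n
    vars_ = [s.var(axis=0, ddof=1).sum() / len(s) for s in samples]
    out = []
    for i, mi in enumerate(means):
        # pool bag j only if its mean is close to bag i's, relative to noise
        idx = [j for j, mj in enumerate(means)
               if np.sum((mi - mj) ** 2) <= tau * (vars_[i] + vars_[j])]
        # inverse-variance convex weights over the selected neighbours
        w = np.array([1.0 / vars_[j] for j in idx])
        w /= w.sum()
        out.append(sum(wj * means[j] for wj, j in zip(w, idx)))
    return out
```

When a bag's mean is far from all others (relative to the noise), the test selects only the bag itself and the estimator falls back to the plain empirical mean, which is the qualitative behaviour the paper's closed-form plug-in weights are designed to achieve.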
Related papers
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Mixed Matrix Completion in Complex Survey Sampling under Heterogeneous Missingness [6.278498348219109]
We propose a fast and scalable estimation algorithm that achieves sublinear convergence.
The proposed method is applied to analyze the National Health and Nutrition Examination Survey data.
arXiv Detail & Related papers (2024-02-06T12:26:58Z)
- Empirical fits to inclusive electron-carbon scattering data obtained by deep-learning methods [0.0]
We obtain empirical fits to the electron-scattering cross sections for carbon over a broad kinematic region.
We consider two different methods of obtaining such model-independent parametrizations and the corresponding uncertainties.
arXiv Detail & Related papers (2023-12-28T17:03:17Z)
- A Multivariate Unimodality Test Harnessing the Dip Statistic of Mahalanobis Distances Over Random Projections [0.18416014644193066]
We extend one-dimensional unimodality principles to multi-dimensional spaces through linear random projections and point-to-point distancing.
Our method, rooted in $\alpha$-unimodality assumptions, presents a novel unimodality test named mud-pod.
Both theoretical and empirical studies confirm the efficacy of our method in unimodality assessment of multidimensional datasets.
arXiv Detail & Related papers (2023-11-28T09:11:02Z)
- Inference and Sampling for Archimax Copulas [29.864637081333097]
Archimax copulas are a family of distributions endowed with a precise representation that allows simultaneous modeling of the bulk and the tails of a distribution.
We build on the representation of Archimax copulas and develop a non-parametric inference method and sampling algorithm.
We experimentally compare to state-of-the-art density modeling techniques, and the results suggest that the proposed method effectively extrapolates to the tails while scaling to higher dimensional data.
arXiv Detail & Related papers (2022-05-27T14:55:40Z)
- Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic Approach to Manifold Dimension Estimation [92.81218653234669]
We present a new approach to manifold hypothesis checking and underlying manifold dimension estimation.
Our geometrical method is a modification for sparse data of a well-known box-counting algorithm for Minkowski dimension calculation.
Experiments on real datasets show that the suggested approach, based on the combination of the two methods, is powerful and effective.
arXiv Detail & Related papers (2021-07-08T15:35:54Z)
- MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
- Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization [94.18714844247766]
Wasserstein barycenters provide a geometric notion of the weighted average of probability measures based on optimal transport.
We present a scalable algorithm to compute Wasserstein-2 barycenters given sample access to the input measures.
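For intuition about what a Wasserstein-2 barycenter is, the one-dimensional Gaussian case has a well-known closed form: the barycenter of $N(m_i, s_i^2)$ with weights $w_i$ is itself Gaussian, with mean $\sum_i w_i m_i$ and standard deviation $\sum_i w_i s_i$. The toy function below computes this closed form; the paper's contribution is computing such barycenters at scale from sample access alone, which this sketch does not attempt.

```python
import numpy as np

def w2_barycenter_gaussian_1d(means, stds, weights):
    """Closed-form Wasserstein-2 barycenter of 1D Gaussians N(m_i, s_i^2):
    a Gaussian with mean sum(w_i * m_i) and std sum(w_i * s_i)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize to convex weights
    m = float(np.dot(w, means))
    s = float(np.dot(w, stds))
    return m, s
```

Note how the barycenter averages standard deviations rather than variances; this "geometric" averaging of shapes is what distinguishes the Wasserstein barycenter from a plain mixture of the input measures.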
arXiv Detail & Related papers (2021-02-02T21:01:13Z)
- A similarity-based Bayesian mixture-of-experts model [0.5156484100374058]
We present a new non-parametric mixture-of-experts model for multivariate regression problems.
Using a conditionally specified model, predictions for out-of-sample inputs are based on similarities to each observed data point.
Posterior inference is performed on the parameters of the mixture as well as the distance metric.
arXiv Detail & Related papers (2020-12-03T18:08:30Z)
- A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution.
We propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization.
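For context, the classical baseline this paper improves on is a two-step procedure: first estimate density-ratio importance weights $w(x) = p_{\text{test}}(x)/p_{\text{train}}(x)$, then fit a weighted model. The sketch below illustrates that baseline (not the paper's one-step method) on a toy 1D linear regression; the Gaussian densities are estimated from sample moments, a simplification chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x_tr = rng.normal(0.0, 1.0, 200)               # training inputs
x_te = rng.normal(1.0, 1.0, 200)               # covariate-shifted test inputs
y_tr = 2.0 * x_tr + rng.normal(0.0, 0.1, 200)  # labels: y = 2x + noise

def gauss_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

# step 1: importance weights w(x) = p_test(x) / p_train(x),
# with both densities approximated by Gaussians fit to the samples
w = (gauss_pdf(x_tr, x_te.mean(), x_te.std())
     / gauss_pdf(x_tr, x_tr.mean(), x_tr.std()))

# step 2: weighted least squares for the slope of y = a * x
a = np.sum(w * x_tr * y_tr) / np.sum(w * x_tr ** 2)
```

The paper's one-step approach folds the two stages into a single optimization, avoiding the error accumulation that the separate weight-estimation step introduces.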
arXiv Detail & Related papers (2020-07-08T11:35:47Z)
- Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.