Multiple Support Recovery Using Very Few Measurements Per Sample
- URL: http://arxiv.org/abs/2105.09855v1
- Date: Thu, 20 May 2021 16:02:27 GMT
- Title: Multiple Support Recovery Using Very Few Measurements Per Sample
- Authors: Lekshmi Ramesh, Chandra R. Murthy, Himanshu Tyagi
- Abstract summary: We are given access to linear measurements of multiple sparse samples in $\mathbb{R}^{d}$.
These samples can be partitioned into $\ell$ groups, with samples having the same support belonging to the same group.
For a given budget of $m$ measurements per sample, the goal is to recover the $\ell$ underlying supports.
- Score: 33.20967869233738
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the problem of multiple support recovery, we are given access to linear
measurements of multiple sparse samples in $\mathbb{R}^{d}$. These samples can
be partitioned into $\ell$ groups, with samples having the same support
belonging to the same group. For a given budget of $m$ measurements per sample,
the goal is to recover the $\ell$ underlying supports, in the absence of the
knowledge of group labels. We study this problem with a focus on the
measurement-constrained regime where $m$ is smaller than the support size $k$
of each sample. We design a two-step procedure that estimates the union of the
underlying supports first, and then uses a spectral algorithm to estimate the
individual supports. Our proposed estimator can recover the supports with $m<k$
measurements per sample, from $\tilde{O}(k^{4}\ell^{4}/m^{4})$ samples. Our
guarantees hold for a general, generative model assumption on the samples and
measurement matrices. We also provide results from experiments conducted on
synthetic data and on the MNIST dataset.
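Read as pseudocode, the two-step procedure has a compact skeleton. The sketch below is a hypothetical Python rendering under strong simplifying assumptions, not the paper's exact estimator: it uses averaged back-projection energies as the union statistic, a generic Gram-matrix spectral embedding for the clustering step, and i.i.d. Gaussian measurement matrices in the demo; all function and variable names are ours.

```python
import numpy as np
from sklearn.cluster import KMeans

def recover_supports(Y, Phis, k, ell):
    """Recover ell k-sparse supports from per-sample measurements.

    Y: list of n measurement vectors y_i = Phi_i @ x_i (length m each).
    Phis: list of n measurement matrices of shape (m, d).
    """
    d = Phis[0].shape[1]

    # Step 1: union estimation. Back-project every measurement and treat the
    # summed squared entries as a per-coordinate energy proxy; keep the
    # k * ell coordinates with the largest energy.
    energy = np.zeros(d)
    for y, Phi in zip(Y, Phis):
        energy += (Phi.T @ y) ** 2
    union = np.argsort(energy)[-k * ell:]

    # Step 2: spectral step. Restrict back-projections to the estimated
    # union, build a sample-affinity matrix, embed with the top-ell
    # eigenvectors, and cluster the samples into ell groups.
    V = np.stack([(Phi.T @ y)[union] for y, Phi in zip(Y, Phis)])
    G = np.abs(V @ V.T)
    np.fill_diagonal(G, 0.0)
    _, U = np.linalg.eigh(G)            # eigenvalues in ascending order
    labels = KMeans(n_clusters=ell, n_init=10).fit_predict(U[:, -ell:])

    # Re-estimate each group's support from its cluster's average energy.
    supports = []
    for g in range(ell):
        e_g = (V[labels == g] ** 2).mean(axis=0)
        supports.append(np.sort(union[np.argsort(e_g)[-k:]]))
    return supports

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k, ell, m, n = 100, 10, 2, 5, 2000    # note m < k
    true_supports = [rng.choice(d, size=k, replace=False) for _ in range(ell)]
    Y, Phis = [], []
    for _ in range(n):
        x = np.zeros(d)
        x[true_supports[rng.integers(ell)]] = rng.normal(size=k)
        Phi = rng.normal(size=(m, d)) / np.sqrt(m)
        Phis.append(Phi)
        Y.append(Phi @ x)
    print(recover_supports(Y, Phis, k, ell))
```

The point the sketch preserves is structural: samples are clustered in the back-projected domain and supports are estimated from pooled cluster energies, so no individual sample ever needs $m \ge k$ measurements.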
Related papers
- Thinning a Wishart Random Matrix [3.734088413551237]
We show that it is possible to generate two independent data matrices with independent $N_p(\mu, \Sigma)$ rows.
These independent data matrices can either be used directly within a train-test paradigm, or can be used to derive independent summary statistics.
arXiv Detail & Related papers (2025-02-14T07:34:38Z)
- Large Language Monkeys: Scaling Inference Compute with Repeated Sampling [81.34900892130929]
We explore inference compute as another axis for scaling, using the simple technique of repeatedly sampling candidate solutions from a model.
Across multiple tasks and models, we observe that coverage scales with the number of samples over four orders of magnitude.
In domains like coding and formal proofs, where answers can be automatically verified, these increases in coverage directly translate into improved performance (a minimal sketch of this sampling loop appears after this list).
arXiv Detail & Related papers (2024-07-31T17:57:25Z)
- GROS: A General Robust Aggregation Strategy [49.1574468325115]
A new, very general, robust procedure for combining estimators in metric spaces is introduced.
We show that the same (up to a constant) sub-Gaussianity is obtained if the minimization is taken over the sample.
The performance of GROS is evaluated through five simulation studies.
arXiv Detail & Related papers (2024-02-23T17:00:32Z) - Testing with Non-identically Distributed Samples [20.74768558932617]
We examine the extent to which sublinear-sample property testing and estimation apply to settings where samples are independently but not identically distributed.
Even with $\Theta(k/\varepsilon^{2})$ samples from each distribution, it is sufficient to learn $\mathbf{p}_{\mathrm{avg}}$ to within error $\varepsilon$ in TV distance.
arXiv Detail & Related papers (2023-11-19T01:25:50Z) - Improved Active Learning via Dependent Leverage Score Sampling [8.400581768343804]
We show how to obtain improved active learning methods in the agnostic (adversarial noise) setting.
We propose an easily implemented method based on the pivotal sampling algorithm (a minimal sketch appears after this list).
In comparison to independent sampling, our method reduces the number of samples needed to reach a given target accuracy by up to $50\%$.
arXiv Detail & Related papers (2023-10-08T01:51:30Z) - Stochastic Approximation Approaches to Group Distributionally Robust Optimization and Beyond [89.72693227960274]
This paper investigates group distributionally robust optimization (GDRO) with the goal of learning a model that performs well over $m$ different distributions.
To reduce the number of samples in each round from $m$ to 1, we cast GDRO as a two-player game, where one player performs mirror descent and the other executes an online algorithm for non-oblivious multi-armed bandits.
In the second scenario, we propose to optimize the average top-$k$ risk instead of the maximum risk, thereby mitigating the impact of outlier distributions.
arXiv Detail & Related papers (2023-02-18T09:24:15Z) - Bias Mimicking: A Simple Sampling Approach for Bias Mitigation [57.17709477668213]
We introduce a new class-conditioned sampling method: Bias Mimicking.
Bias Mimicking improves the accuracy of sampling methods on underrepresented groups by 3% across four benchmarks.
arXiv Detail & Related papers (2022-09-30T17:33:00Z) - Meta Sparse Principal Component Analysis [31.403997435274604]
We study meta-learning for support (i.e., the set of non-zero entries) recovery in high-dimensional Principal Component Analysis.
We reduce the sample complexity required in a novel task by using information learned from auxiliary tasks.
arXiv Detail & Related papers (2022-08-18T16:28:31Z) - Meta Learning for Support Recovery in High-dimensional Precision Matrix
Estimation [31.41834200409171]
We study meta learning for support (i.e., the set of non-zero entries) recovery in high-dimensional precision matrix estimation.
In our setup, each task has a different random true precision matrix, each with a possibly different support.
We prove a matching information-theoretic lower bound of $n \in \Omega((\log N)/K)$ on the necessary number of samples; thus, our algorithm is minimax optimal.
arXiv Detail & Related papers (2020-06-22T20:24:52Z) - A Randomized Algorithm to Reduce the Support of Discrete Measures [79.55586575988292]
Given a discrete probability measure supported on $N$ atoms and a set of $n$ real-valued functions, there exists a probability measure that is supported on a subset of $n+1$ of the original $N$ atoms and has the same mean when integrated against each of the $n$ functions.
We give a simple geometric characterization of barycenters via negative cones and derive a randomized algorithm that computes this new measure by "greedy geometric sampling".
We then study its properties, and benchmark it on synthetic and real-world data to show that it can be very beneficial in the $N \gg n$ regime.
arXiv Detail & Related papers (2020-06-02T16:38:36Z)
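As referenced in the Large Language Monkeys entry above, the repeated-sampling loop it studies reduces to a few lines. In this sketch, `generate` and `verify` are hypothetical stand-ins for a model call and an automatic checker (e.g., unit tests or a proof checker); the paper's experiments are far more elaborate.

```python
from typing import Callable, Optional

def solve_by_repeated_sampling(
    generate: Callable[[str], str],   # hypothetical: draws one candidate
    verify: Callable[[str], bool],    # hypothetical: automatic checker
    prompt: str,
    budget: int,
) -> Optional[str]:
    """Sample up to `budget` candidates; return the first that verifies.

    Coverage -- the chance that at least one sample is correct -- grows
    with `budget`, which is the scaling axis the paper measures.
    """
    for _ in range(budget):
        candidate = generate(prompt)
        if verify(candidate):
            return candidate
    return None
```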
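As referenced in the dependent leverage score entry above, here is a minimal sketch of the classical pivotal sampling method of Deville and Tillé that the paper builds on. How the paper orders units and sets inclusion probabilities from leverage scores is not reproduced here; this only shows the core pairwise update, which preserves each unit's marginal inclusion probability.

```python
import numpy as np

def pivotal_sample(pi, rng=None):
    """Draw a sample whose marginal inclusion probabilities match pi.

    pi: inclusion probabilities in [0, 1]; their sum should be an integer,
    which is the (fixed) sample size. Returns indices of selected units.
    """
    rng = np.random.default_rng(rng)
    pi = np.asarray(pi, dtype=float).copy()
    active = [t for t in range(len(pi)) if 0.0 < pi[t] < 1.0]
    while len(active) >= 2:
        i, j = active[0], active[1]
        a, b = pi[i], pi[j]
        if a + b < 1.0:
            # One unit is eliminated; the survivor absorbs the mass a + b.
            if rng.random() < b / (a + b):
                pi[i], pi[j] = 0.0, a + b
            else:
                pi[i], pi[j] = a + b, 0.0
        else:
            # One unit is selected; the other keeps the excess a + b - 1.
            if rng.random() < (1.0 - a) / (2.0 - a - b):
                pi[i], pi[j] = a + b - 1.0, 1.0
            else:
                pi[i], pi[j] = 1.0, a + b - 1.0
        active = [t for t in active if 0.0 < pi[t] < 1.0]
    return np.flatnonzero(pi > 1.0 - 1e-9)

# Example: probabilities summing to 3 yield a sample of exactly 3 units.
# pivotal_sample([0.5] * 6, rng=0)
```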