Distributed Reconstruction of Noisy Pooled Data
- URL: http://arxiv.org/abs/2204.07491v1
- Date: Thu, 14 Apr 2022 06:48:40 GMT
- Title: Distributed Reconstruction of Noisy Pooled Data
- Authors: Max Hahn-Klimroth and Dominik Kaaser
- Abstract summary: We consider two noise models for the pooled data problem.
We present and analyze, for both error models, a simple and efficient distributed algorithm.
Our novel analysis pins down the range of error probabilities and distributions for which our algorithm reconstructs the exact initial states with high probability.
- Score: 2.0559497209595823
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the pooled data problem we are given a set of $n$ agents, each of which
holds a hidden state bit, either $0$ or $1$. A querying procedure returns for a
query set the sum of the states of the queried agents. The goal is to
reconstruct the states using as few queries as possible. In this paper we
consider two noise models for the pooled data problem. In the noisy channel
model, the result for each agent flips with a certain probability. In the noisy
query model, each query result is subject to random Gaussian noise. Our results
are twofold. First, we present and analyze for both error models a simple and
efficient distributed algorithm that reconstructs the initial states in a
greedy fashion. Our novel analysis pins down the range of error probabilities
and distributions for which our algorithm reconstructs the exact initial states
with high probability. Second, we present simulation results of our algorithm
and compare its performance with approximate message passing (AMP) algorithms
that are conjectured to be optimal in a number of related problems.
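The noisy query model described above is easy to simulate. The sketch below is illustrative only: the names `noisy_query` and `estimate_states`, the noise level `sigma`, and the naive per-agent averaging estimator are our own assumptions for the sketch, not the authors' greedy distributed algorithm. It shows the basic point that independent Gaussian query noise can be averaged out by repeated queries:

```python
import random

def noisy_query(states, pool, sigma=0.5, rng=None):
    """Noisy query model: return the sum of the hidden bits of the
    agents in `pool`, perturbed by additive Gaussian noise."""
    rng = rng or random
    return sum(states[i] for i in pool) + rng.gauss(0.0, sigma)

def estimate_states(n, query, m=200, rng=None):
    """Naive baseline estimator (not the paper's algorithm): for each
    agent, average the results of m noisy singleton queries and round
    the average to the nearest bit."""
    est = []
    for i in range(n):
        avg = sum(query([i]) for _ in range(m)) / m
        est.append(1 if avg >= 0.5 else 0)
    return est

rng = random.Random(42)
n = 20
truth = [rng.randint(0, 1) for _ in range(n)]
recovered = estimate_states(
    n, lambda pool: noisy_query(truth, pool, sigma=0.5, rng=rng),
    m=200, rng=rng)
print(recovered == truth)
```

Averaging m queries shrinks the noise standard deviation by a factor of $\sqrt{m}$, so for fixed noise the error probability per agent decays rapidly in m; the paper's contribution is to achieve reconstruction with far fewer, pooled queries.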
Related papers
- Perturb-and-Project: Differentially Private Similarities and Marginals [73.98880839337873]
We revisit the input perturbations framework for differential privacy where noise is added to the input $A \in \mathcal{S}$.
We first design novel efficient algorithms to privately release pair-wise cosine similarities.
We derive a novel algorithm to compute $k$-way marginal queries over $n$ features.
arXiv Detail & Related papers (2024-06-07T12:07:16Z) - Information-Computation Tradeoffs for Learning Margin Halfspaces with Random Classification Noise [50.64137465792738]
We study the problem of PAC learning $\gamma$-margin halfspaces with Random Classification Noise.
We establish an information-computation tradeoff suggesting an inherent gap between the sample complexity of the problem and the sample complexity of computationally efficient algorithms.
arXiv Detail & Related papers (2023-06-28T16:33:39Z) - Efficient Approximate Recovery from Pooled Data Using Doubly Regular Pooling Schemes [1.7403133838762448]
We analyze an approximate reconstruction algorithm that estimates the hidden bits in a greedy fashion.
Our analysis is uniform in the degree of noise and the sparsity of $\sigma$.
arXiv Detail & Related papers (2023-02-28T19:31:40Z) - Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - Higher degree sum-of-squares relaxations robust against oblivious outliers [14.58610686291355]
We consider estimation models of the form $Y = X^* + N$, where $X^*$ is some $m$-dimensional signal we wish to recover.
We introduce a family of algorithms that, under mild assumptions, recover the signal $X^*$ in all estimation problems for which there exists a sum-of-squares algorithm.
arXiv Detail & Related papers (2022-11-14T13:09:12Z) - Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z) - Clustering with Queries under Semi-Random Noise [13.817228853960655]
We develop robust learning methods that tolerate general semi-random noise.
We show that, information-theoretically, $O\!\left(\frac{nk \log n}{(1-2p)^2}\right)$ queries suffice to learn any cluster of sufficiently large size.
arXiv Detail & Related papers (2022-06-09T16:02:00Z) - ReLU Regression with Massart Noise [52.10842036932169]
We study the fundamental problem of ReLU regression, where the goal is to fit Rectified Linear Units (ReLUs) to data.
We focus on ReLU regression in the Massart noise model, a natural and well-studied semi-random noise model.
We develop an efficient algorithm that achieves exact parameter recovery in this model.
arXiv Detail & Related papers (2021-09-10T02:13:22Z) - Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z) - Efficient Algorithms for Multidimensional Segmented Regression [42.046881924063044]
We study the fundamental problem of fixed-design multidimensional segmented regression.
We provide the first sample and computationally efficient algorithm for this problem in any fixed dimension.
Our algorithm relies on a simple merging iterative approach, which is novel in the multidimensional setting.
arXiv Detail & Related papers (2020-03-24T19:39:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.