Shuffled linear regression through graduated convex relaxation
- URL: http://arxiv.org/abs/2209.15608v1
- Date: Fri, 30 Sep 2022 17:33:48 GMT
- Title: Shuffled linear regression through graduated convex relaxation
- Authors: Efe Onaran, Soledad Villar
- Abstract summary: The shuffled linear regression problem aims to recover linear relationships in datasets where the correspondence between input and output is unknown.
This problem arises in a wide range of applications including survey data.
We propose a novel optimization algorithm for shuffled linear regression based on a posterior-maximizing objective function.
- Score: 12.614901374282868
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The shuffled linear regression problem aims to recover linear relationships
in datasets where the correspondence between input and output is unknown. This
problem arises in a wide range of applications including survey data, in which
one needs to decide whether the anonymity of the responses can be preserved
while uncovering significant statistical connections. In this work, we propose
a novel optimization algorithm for shuffled linear regression based on a
posterior-maximizing objective function assuming a Gaussian noise prior. We
compare and contrast our approach with existing methods on synthetic and real
data. We show that our approach performs competitively while achieving
empirical running-time improvements. Furthermore, we demonstrate that our
algorithm is able to utilize side information in the form of seeds, which have
recently come to prominence in related problems.
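To make the approach concrete, here is a minimal Python sketch of the general idea behind graduated relaxation for shuffled regression, assuming the standard formulation y ≈ PXw with an unknown permutation P: alternate least squares on the coefficients with an entropically relaxed (Sinkhorn) matching whose temperature is gradually annealed, then round to a hard permutation. This illustrates the spirit of the method only; the objective, annealing schedule, and seed handling are simplified stand-ins, not the authors' exact algorithm.

```python
# Sketch only: graduated relaxation for shuffled linear regression.
import numpy as np
from scipy.optimize import linear_sum_assignment

def sinkhorn(cost, eps, n_iter=200):
    """Doubly stochastic D approximately minimizing <D, cost> - eps * H(D)."""
    K = np.exp(-(cost - cost.min()) / eps)  # a log-domain version would be more stable
    u = np.ones(cost.shape[0])
    for _ in range(n_iter):
        v = 1.0 / (K.T @ u)
        u = 1.0 / (K @ v)
    return u[:, None] * K * v[None, :]

def shuffled_regression(X, y, eps0=1.0, decay=0.7, rounds=20):
    """Soft matching hardens as eps -> 0; hyperparameters are illustrative."""
    w = np.linalg.lstsq(X, y, rcond=None)[0]          # warm start: ignore the shuffle
    eps = eps0
    for _ in range(rounds):
        pred = X @ w
        cost = (y[:, None] - pred[None, :]) ** 2      # cost[i, j] = (y_i - x_j^T w)^2
        D = sinkhorn(cost, eps)                       # relaxed correspondence
        w = np.linalg.lstsq(D @ X, y, rcond=None)[0]  # refit under the soft matching
        eps *= decay                                  # anneal toward a permutation
    _, col = linear_sum_assignment(cost)              # round to the best hard matching
    w = np.linalg.lstsq(X[col], y, rcond=None)[0]     # final refit on matched pairs
    return w, col
```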
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
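For context, the classic two-point zeroth-order gradient estimator underlying gradient-free methods of this kind looks like the sketch below; the smoothing radius and step size are illustrative choices, not the paper's accelerated scheme.

```python
# Sketch of a two-point zeroth-order gradient estimate plugged into plain SGD.
import numpy as np

def zo_sgd(f, x0, mu=1e-4, lr=1e-2, steps=1000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        u = rng.standard_normal(x.shape)                     # random search direction
        g = (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u   # directional estimate
        x -= lr * g                                          # SGD step on the estimate
    return x

# Usage: minimize a simple convex quadratic without ever touching its gradient.
x_star = zo_sgd(lambda x: np.sum((x - 3.0) ** 2), x0=np.zeros(5))
```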
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Anchor Data Augmentation [53.39044919864444]
We propose a novel algorithm for data augmentation in nonlinear over-parametrized regression.
Our data augmentation algorithm borrows from the literature on causality and extends the recently proposed Anchor regression (AR) method for data augmentation.
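As background, plain Anchor regression (the estimator the augmentation method builds on) can be written as ordinary least squares on partially projected data. The sketch below assumes an anchor matrix A and an illustrative regularization gamma; it does not reproduce the augmentation procedure itself.

```python
# Sketch of plain Anchor regression, not the paper's augmentation scheme.
import numpy as np

def anchor_regression(X, y, A, gamma=5.0):
    # Pi: orthogonal projection onto the column span of the anchors A.
    Pi = A @ np.linalg.pinv(A)
    # Anchor regression is OLS on data transformed by W = I + (sqrt(gamma)-1) Pi.
    W = np.eye(len(y)) + (np.sqrt(gamma) - 1.0) * Pi
    return np.linalg.lstsq(W @ X, W @ y, rcond=None)[0]
```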
arXiv Detail & Related papers (2023-11-12T21:08:43Z) - Refining Amortized Posterior Approximations using Gradient-Based Summary Statistics [0.9176056742068814]
We present an iterative framework to improve the amortized approximations of posterior distributions in the context of inverse problems.
We validate our method in a controlled setting by applying it to a stylized problem, and observe improved posterior approximations with each iteration.
arXiv Detail & Related papers (2023-05-15T15:47:19Z) - A Bayesian Robust Regression Method for Corrupted Data Reconstruction [5.298637115178182]
We develop an effective robust regression method that can resist adaptive adversarial attacks.
First, we propose the novel TRIP (hard Thresholding approach to Robust regression with sImple Prior) algorithm.
We then use the idea of Bayesian reweighting to construct the more robust BRHT (robust Bayesian Reweighting regression via Hard Thresholding) algorithm.
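The hard-thresholding template such methods build on is simple to state: alternately fit least squares on the presumed inliers and re-select the samples with the smallest residuals. The sketch below is that generic template under an assumed known inlier count; it is not the TRIP or BRHT updates themselves (BRHT additionally reweights samples in a Bayesian fashion).

```python
# Sketch of generic hard-thresholding robust regression, not TRIP/BRHT exactly.
import numpy as np

def ht_robust_regression(X, y, n_inliers, iters=50):
    keep = np.arange(len(y))                      # start by trusting every sample
    for _ in range(iters):
        w = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        resid = np.abs(y - X @ w)
        keep = np.argsort(resid)[:n_inliers]      # hard threshold: smallest residuals
    return np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
```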
arXiv Detail & Related papers (2022-12-24T17:25:53Z) - On Optimal Interpolation In Linear Regression [22.310861786709538]
We show that the optimal way to interpolate in linear regression is to use functions that are linear in the response variable.
We identify a regime where the minimum-norm interpolator provably generalizes arbitrarily worse than the optimal response-linear achievable interpolator.
We extend the notion of optimal response-linear interpolation to random features regression under a linear data-generating model.
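For reference, the minimum-norm interpolator that the comparison is made against is just the pseudoinverse solution; a tiny self-contained example with illustrative dimensions:

```python
# The minimum-norm interpolator in the overparametrized regime (d > n).
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 200                       # more features than samples
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

w = np.linalg.pinv(X) @ y            # smallest-norm solution of X w = y
assert np.allclose(X @ w, y)         # exact interpolation of the training data
```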
arXiv Detail & Related papers (2021-10-21T16:37:10Z) - Linear regression with partially mismatched data: local search with theoretical guarantees [9.398989897176953]
We study an important variant of linear regression in which the predictor-response pairs are partially mismatched.
We use an optimization formulation to simultaneously learn the underlying regression coefficients and the permutation corresponding to the mismatches.
We prove that our local search algorithm converges to a nearly-optimal solution at a linear rate.
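A minimal sketch of the alternating local-search idea (a simplified illustration, not the paper's analyzed variant or its guarantees): fix the permutation and solve least squares, then fix the coefficients and re-match responses to predictions. With a scalar response, the optimal re-matching pairs sorted responses with sorted predictions.

```python
# Sketch of alternating local search for regression with mismatched pairs.
import numpy as np

def mismatch_local_search(X, y, iters=30):
    order = np.arange(len(y))                     # current guess: identity matching
    for _ in range(iters):
        w = np.linalg.lstsq(X, y[order], rcond=None)[0]
        pred = X @ w
        new_order = np.empty_like(order)
        new_order[np.argsort(pred)] = np.argsort(y)   # rank-match y to predictions
        if np.array_equal(new_order, order):
            break                                 # local optimum reached
        order = new_order
    return w, order
```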
arXiv Detail & Related papers (2021-06-03T23:32:12Z) - A Hypergradient Approach to Robust Regression without Correspondence [85.49775273716503]
We consider a variant of the regression problem, where the correspondence between input and output data is not available.
Most existing methods are only applicable when the sample size is small.
We propose a new computational framework -- ROBOT -- for the shuffled regression problem.
arXiv Detail & Related papers (2020-11-30T21:47:38Z) - A spectral algorithm for robust regression with subgaussian rates [0.0]
We study a new algorithm for linear regression, running in linear up to quadratic time, that requires no strong assumptions on the underlying distributions of the samples.
The goal is to design a procedure which attains the optimal sub-gaussian error bound even though the data have only finite moments.
arXiv Detail & Related papers (2020-07-12T19:33:50Z) - Differentiable Causal Discovery from Interventional Data [141.41931444927184]
We propose a theoretically-grounded method based on neural networks that can leverage interventional data.
We show that our approach compares favorably to the state of the art in a variety of settings.
arXiv Detail & Related papers (2020-07-03T15:19:17Z) - Bandits with Partially Observable Confounded Data [74.04376842070624]
We show that this problem is closely related to a variant of the bandit problem with side information.
We construct a linear bandit algorithm that takes advantage of the projected information, and prove regret bounds.
Our results indicate that confounded offline data can significantly improve online learning algorithms.
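For orientation, the generic ridge-based linear UCB rule that such algorithms extend looks like the sketch below; the projected-offline-data component that is the paper's contribution is not reproduced, and alpha is an illustrative exploration weight.

```python
# Sketch of a generic linear-UCB step, not the paper's confounded-data variant.
import numpy as np

def linucb_step(V, b, arms, alpha=1.0):
    """Choose an arm from statistics V = lam*I + sum x x^T and b = sum r x."""
    theta = np.linalg.solve(V, b)                 # ridge estimate of the reward model
    Vinv = np.linalg.inv(V)
    width = np.sqrt(np.einsum('ij,jk,ik->i', arms, Vinv, arms))  # x^T V^-1 x per arm
    return int(np.argmax(arms @ theta + alpha * width))

def linucb_update(V, b, x, r):
    """Rank-one update after playing feature vector x and observing reward r."""
    return V + np.outer(x, x), b + r * x
```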
arXiv Detail & Related papers (2020-06-11T18:48:03Z) - Optimizing for the Future in Non-Stationary MDPs [52.373873622008944]
We present a policy gradient algorithm that maximizes a forecast of future performance.
We show that our algorithm, called Prognosticator, is more robust to non-stationarity than two online adaptation techniques.
arXiv Detail & Related papers (2020-05-17T03:41:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.