Learning and Testing Variable Partitions
- URL: http://arxiv.org/abs/2003.12990v1
- Date: Sun, 29 Mar 2020 10:12:32 GMT
- Title: Learning and Testing Variable Partitions
- Authors: Andrej Bogdanov and Baoxiang Wang
- Abstract summary: We show that, given a function with a $k$-partition of cost $\delta$, a partition of cost $\mathcal{O}(k n^2)(\delta + \epsilon)$ can be learned in time $\tilde{\mathcal{O}}(n^2\,\mathrm{poly}(1/\epsilon))$ for any $\epsilon > 0$.
We also show that even two-sided testers require $\Omega(n)$ queries when $k = 2$.
- Score: 13.575794982844222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Let $F$ be a multivariate function from a product set $\Sigma^n$ to an
Abelian group $G$. A $k$-partition of $F$ with cost $\delta$ is a partition of
the set of variables $\mathbf{V}$ into $k$ non-empty subsets $(\mathbf{X}_1,
\dots, \mathbf{X}_k)$ such that $F(\mathbf{V})$ is $\delta$-close to
$F_1(\mathbf{X}_1)+\dots+F_k(\mathbf{X}_k)$ for some $F_1, \dots, F_k$ with
respect to a given error metric. We study algorithms for agnostically learning
$k$ partitions and testing $k$-partitionability over various groups and error
metrics given query access to $F$. In particular we show that
$1.$ Given a function that has a $k$-partition of cost $\delta$, a partition
of cost $\mathcal{O}(k n^2)(\delta + \epsilon)$ can be learned in time
$\tilde{\mathcal{O}}(n^2 \mathrm{poly} (1/\epsilon))$ for any $\epsilon > 0$.
In contrast, for $k = 2$ and $n = 3$ learning a partition of cost $\delta +
\epsilon$ is NP-hard.
$2.$ When $F$ is real-valued and the error metric is the 2-norm, a
2-partition of cost $\sqrt{\delta^2 + \epsilon}$ can be learned in time
$\tilde{\mathcal{O}}(n^5/\epsilon^2)$.
$3.$ When $F$ is $\mathbb{Z}_q$-valued and the error metric is Hamming
weight, $k$-partitionability is testable with one-sided error and
$\mathcal{O}(kn^3/\epsilon)$ non-adaptive queries. We also show that even
two-sided testers require $\Omega(n)$ queries when $k = 2$.
This work was motivated by reinforcement learning control tasks in which the
set of control variables can be partitioned. The partitioning reduces the task
into multiple lower-dimensional ones that are relatively easier to learn. Our
second algorithm empirically increases the scores attained over previous
heuristic partitioning methods applied in this context.
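To make result $1.$ concrete, the sketch below is a minimal toy illustration, not the paper's algorithm: assuming query access to a real-valued $F$ on $\{0,1\}^n$, it scores each pair of variables with the discrete second difference $F(x) - F(x^{(i)}) - F(x^{(j)}) + F(x^{(i,j)})$, which vanishes whenever $i$ and $j$ lie in different blocks of an exact additive partition, and returns the connected components of the resulting interaction graph. The helper names `interaction_strength` and `learn_partition` are hypothetical.
```python
import itertools
import random
from collections import defaultdict

def interaction_strength(F, n, i, j, samples=200, rng=random):
    """Rough interaction score between variables i and j (illustrative only).

    Averages |F(x) - F(x with bit i flipped) - F(x with bit j flipped)
              + F(x with bits i and j flipped)| over random x in {0,1}^n.
    This quantity is identically 0 when F splits additively into parts
    that separate i from j.
    """
    total = 0.0
    for _ in range(samples):
        x = [rng.randint(0, 1) for _ in range(n)]
        xi, xj, xij = x[:], x[:], x[:]
        xi[i] ^= 1
        xj[j] ^= 1
        xij[i] ^= 1
        xij[j] ^= 1
        total += abs(F(x) - F(xi) - F(xj) + F(xij))
    return total / samples

def learn_partition(F, n, threshold=1e-6, samples=200):
    """Group variables whose pairwise interaction exceeds the threshold."""
    # Build an interaction graph and return its connected components.
    adj = defaultdict(set)
    for i, j in itertools.combinations(range(n), 2):
        if interaction_strength(F, n, i, j, samples) > threshold:
            adj[i].add(j)
            adj[j].add(i)
    seen, parts = set(), []
    for v in range(n):
        if v in seen:
            continue
        stack, comp = [v], set()
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj[u] - comp)
        seen |= comp
        parts.append(sorted(comp))
    return parts

if __name__ == "__main__":
    # F(x) = (x0 XOR x1) + 3*x2*x3 decomposes as F1(x0, x1) + F2(x2, x3).
    F = lambda x: (x[0] ^ x[1]) + 3 * x[2] * x[3]
    print(learn_partition(F, 4))  # expected: [[0, 1], [2, 3]]
```
In this noiseless toy setting the components recover the partition exactly; the paper's contribution lies in the agnostic setting with cost $\delta$, general Abelian groups and error metrics, and the query/time guarantees stated above.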
Related papers
- The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$.
As an application, we obtain an $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z) - LevAttention: Time, Space, and Streaming Efficient Algorithm for Heavy Attentions [54.54897832889028]
We show that for any $K$, there is a "universal set" $U \subset [n]$ of size independent of $n$, such that for any $Q$ and any row $i$, the large attention scores $A_{i,j}$ in row $i$ of $A$ all have $j \in U$.
We empirically show the benefits of our scheme for vision transformers, showing how to train new models that use our universal set while training as well.
arXiv Detail & Related papers (2024-10-07T19:47:13Z) - Outlier Robust Multivariate Polynomial Regression [27.03423421704806]
We are given a set of random samples $(\mathbf{x}_i, y_i) \in [-1,1]^n \times \mathbb{R}$ that are noisy versions of $(\mathbf{x}_i, p(\mathbf{x}_i))$.
The goal is to output a $\hat{p}$, within an $\ell_\infty$-distance of at most $O(\sigma)$ from $p$.
arXiv Detail & Related papers (2024-03-14T15:04:45Z) - $\ell_p$-Regression in the Arbitrary Partition Model of Communication [59.89387020011663]
We consider the randomized communication complexity of the distributed $ell_p$-regression problem in the coordinator model.
For $p = 2$, i.e., least squares regression, we give the first optimal bound of $\tilde{\Theta}(sd^2 + sd/\epsilon)$ bits.
For $p \in (1,2)$, we obtain an $\tilde{O}(sd^2/\epsilon + sd/\mathrm{poly}(\epsilon))$ upper bound.
arXiv Detail & Related papers (2023-07-11T08:51:53Z) - Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix
Factorization [54.29685789885059]
We introduce efficient $(1+\varepsilon)$-approximation algorithms for the binary matrix factorization (BMF) problem.
The goal is to approximate $\mathbf{A}$ as a product of low-rank factors.
Our techniques generalize to other common variants of the BMF problem.
arXiv Detail & Related papers (2023-06-02T18:55:27Z) - Phase Transitions in the Detection of Correlated Databases [12.010807505655238]
We study the problem of detecting the correlation between two Gaussian databases $\mathsf{X}\in\mathbb{R}^{n\times d}$ and $\mathsf{Y}\in\mathbb{R}^{n\times d}$, each composed of $n$ users with $d$ features.
This problem is relevant in the analysis of social media, computational biology, etc.
arXiv Detail & Related papers (2023-02-07T10:39:44Z) - Low-Rank Approximation with $1/\epsilon^{1/3}$ Matrix-Vector Products [58.05771390012827]
We study iterative methods based on Krylov subspaces for low-rank approximation under any Schatten-$p$ norm.
Our main result is an algorithm that uses only $\tilde{O}(k/\sqrt{\epsilon})$ matrix-vector products.
arXiv Detail & Related papers (2022-02-10T16:10:41Z) - Fair Representation Clustering with Several Protected Classes [13.53362222844008]
We study the problem of fair $k$-median where each cluster is required to have a fair representation of individuals from different groups.
We present an $O(\log k)$-approximation algorithm that runs in time $n^{O(\ell)}$.
arXiv Detail & Related papers (2022-02-03T03:45:45Z) - Mediated Uncoupled Learning: Learning Functions without Direct
Input-output Correspondences [80.95776331769899]
We consider the task of predicting $Y$ from $X$ when we have no paired data of them.
A naive approach is to predict $U$ from $X$ using $S_X$ and then $Y$ from $U$ using $S_Y$.
We propose a new method that avoids predicting $U$ but directly learns $Y = f(X)$ by training $f(X)$ with $S_X$ to predict $h(U)$.
arXiv Detail & Related papers (2021-07-16T22:13:29Z) - Sharper bounds for online learning of smooth functions of a single
variable [0.0]
We show that $\mathrm{opt}_{1+\epsilon}(\mathcal{F}_q) = \Theta(\epsilon^{-\frac{1}{2}})$, where the constants in the bound do not depend on $q$.
arXiv Detail & Related papers (2021-05-30T23:06:21Z) - The planted matching problem: Sharp threshold and infinite-order phase
transition [25.41713098167692]
We study the problem of reconstructing a perfect matching $M^*$ hidden in a randomly weighted $n \times n$ bipartite graph.
We show that if $\sqrt{d}\, B(\mathcal{P},\mathcal{Q}) \ge 1+\epsilon$ for an arbitrarily small constant $\epsilon > 0$, the reconstruction error of any estimator is bounded away from $0$.
arXiv Detail & Related papers (2021-03-17T00:59:33Z)