Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix
Factorization
- URL: http://arxiv.org/abs/2306.01869v1
- Date: Fri, 2 Jun 2023 18:55:27 GMT
- Title: Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix
Factorization
- Authors: Ameya Velingker, Maximilian V\"otsch, David P. Woodruff, Samson Zhou
- Abstract summary: We introduce efficient $(1+varepsilon)$-approximation algorithms for the binary matrix factorization (BMF) problem.
The goal is to approximate $mathbfA$ as a product of low-rank factors.
Our techniques generalize to other common variants of the BMF problem.
- Score: 54.29685789885059
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce efficient $(1+\varepsilon)$-approximation algorithms for the
binary matrix factorization (BMF) problem, where the inputs are a matrix
$\mathbf{A}\in\{0,1\}^{n\times d}$, a rank parameter $k>0$, as well as an
accuracy parameter $\varepsilon>0$, and the goal is to approximate $\mathbf{A}$
as a product of low-rank factors $\mathbf{U}\in\{0,1\}^{n\times k}$ and
$\mathbf{V}\in\{0,1\}^{k\times d}$. Equivalently, we want to find $\mathbf{U}$
and $\mathbf{V}$ that minimize the Frobenius loss $\|\mathbf{U}\mathbf{V} -
\mathbf{A}\|_F^2$. Before this work, the state-of-the-art for this problem was
the approximation algorithm of Kumar et. al. [ICML 2019], which achieves a
$C$-approximation for some constant $C\ge 576$. We give the first
$(1+\varepsilon)$-approximation algorithm using running time singly exponential
in $k$, where $k$ is typically a small integer. Our techniques generalize to
other common variants of the BMF problem, admitting bicriteria
$(1+\varepsilon)$-approximation algorithms for $L_p$ loss functions and the
setting where matrix operations are performed in $\mathbb{F}_2$. Our approach
can be implemented in standard big data models, such as the streaming or
distributed models.
Related papers
- Efficient Continual Finite-Sum Minimization [52.5238287567572]
We propose a key twist into the finite-sum minimization, dubbed as continual finite-sum minimization.
Our approach significantly improves upon the $mathcalO(n/epsilon)$ FOs that $mathrmStochasticGradientDescent$ requires.
We also prove that there is no natural first-order method with $mathcalOleft(n/epsilonalpharight)$ complexity gradient for $alpha 1/4$, establishing that the first-order complexity of our method is nearly tight.
arXiv Detail & Related papers (2024-06-07T08:26:31Z) - Sample-Efficient Linear Regression with Self-Selection Bias [7.605563562103568]
We consider the problem of linear regression with self-selection bias in the unknown-index setting.
We provide a novel and near optimally sample-efficient (in terms of $k$) algorithm to recover $mathbfw_1,ldots,mathbfw_kin.
Our algorithm succeeds under significantly relaxed noise assumptions, and therefore also succeeds in the related setting of max-linear regression.
arXiv Detail & Related papers (2024-02-22T02:20:24Z) - Provably learning a multi-head attention layer [55.2904547651831]
Multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models.
In this work, we initiate the study of provably learning a multi-head attention layer from random examples.
We prove computational lower bounds showing that in the worst case, exponential dependence on $m$ is unavoidable.
arXiv Detail & Related papers (2024-02-06T15:39:09Z) - Optimal Estimator for Linear Regression with Shuffled Labels [17.99906229036223]
This paper considers the task of linear regression with shuffled labels.
$mathbf Y in mathbb Rntimes m, mathbf Pi in mathbb Rntimes p, mathbf B in mathbb Rptimes m$, and $mathbf Win mathbb Rntimes m$, respectively.
arXiv Detail & Related papers (2023-10-02T16:44:47Z) - Solving Regularized Exp, Cosh and Sinh Regression Problems [40.47799094316649]
attention computation is a fundamental task for large language models such as Transformer, GPT-4 and ChatGPT.
The straightforward method is to use the naive Newton's method.
arXiv Detail & Related papers (2023-03-28T04:26:51Z) - An Optimal Algorithm for Strongly Convex Min-min Optimization [79.11017157526815]
Existing optimal first-order methods require $mathcalO(sqrtmaxkappa_x,kappa_y log 1/epsilon)$ of computations of both $nabla_x f(x,y)$ and $nabla_y f(x,y)$.
We propose a new algorithm that only requires $mathcalO(sqrtkappa_x log 1/epsilon)$ of computations of $nabla_x f(x,
arXiv Detail & Related papers (2022-12-29T19:26:12Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $mathbfxmapstosigma(mathbfwcdotmathbfx)$ for monotone activations.
The goal of the learner is to output a hypothesis vector $mathbfw$ that $F(mathbbw)=C, epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Sketching Algorithms and Lower Bounds for Ridge Regression [65.0720777731368]
We give a sketching-based iterative algorithm that computes $1+varepsilon$ approximate solutions for the ridge regression problem.
We also show that this algorithm can be used to give faster algorithms for kernel ridge regression.
arXiv Detail & Related papers (2022-04-13T22:18:47Z) - Locally Private $k$-Means Clustering with Constant Multiplicative
Approximation and Near-Optimal Additive Error [10.632986841188]
We bridge the gap between the exponents of $n$ in the upper and lower bounds on the additive error with two new algorithms.
It is possible to solve the locally private $k$-means problem in a constant number of rounds with constant factor multiplicative approximation.
arXiv Detail & Related papers (2021-05-31T14:41:40Z) - Algorithms and Hardness for Linear Algebra on Geometric Graphs [14.822517769254352]
We show that the exponential dependence on the dimension dimension $d in the celebrated fast multipole method of Greengard and Rokhlin cannot be improved.
This is the first formal limitation proven about fast multipole methods.
arXiv Detail & Related papers (2020-11-04T18:35:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.