Towards Designing Optimal Sensing Matrices for Generalized Linear
Inverse Problems
- URL: http://arxiv.org/abs/2111.03237v3
- Date: Sun, 20 Aug 2023 02:02:10 GMT
- Title: Towards Designing Optimal Sensing Matrices for Generalized Linear
Inverse Problems
- Authors: Junjie Ma, Ji Xu, Arian Maleki
- Abstract summary: We consider an inverse problem $\mathbf{y}= f(\mathbf{Ax})$, where $\mathbf{x}\in\mathbb{R}^n$ is the signal of interest.
We show that whether a spikier spectrum can hurt or help the recovery performance depends on $f$.
- Score: 26.251298081065304
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider an inverse problem $\mathbf{y}= f(\mathbf{Ax})$, where
$\mathbf{x}\in\mathbb{R}^n$ is the signal of interest, $\mathbf{A}$ is the
sensing matrix, $f$ is a nonlinear function and $\mathbf{y} \in \mathbb{R}^m$
is the measurement vector. In many applications, we have some level of freedom
to design the sensing matrix $\mathbf{A}$, and in such circumstances we could
optimize $\mathbf{A}$ to achieve better reconstruction performance. As a first
step towards optimal design, it is important to understand the impact of the
sensing matrix on the difficulty of recovering $\mathbf{x}$ from $\mathbf{y}$.
In this paper, we study the performance of one of the most successful
recovery methods, i.e., the expectation propagation (EP) algorithm. We define a
notion of spikiness for the spectrum of $\mathbf{A}$ and show the importance
of this measure for the performance of EP. We show that whether a spikier
spectrum can hurt or help the recovery performance depends on $f$. Based on our
framework, we are able to show that, in phase-retrieval problems, matrices with
spikier spectra are better for EP, while in 1-bit compressed sensing
problems, less spiky spectra lead to better performance. Our results unify
and substantially generalize existing results that compare Gaussian and
orthogonal matrices, and provide a platform towards designing optimal sensing
systems.
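As a concrete illustration of the measurement model $\mathbf{y}=f(\mathbf{Ax})$, the following is a minimal sketch that generates phase-retrieval ($f=|\cdot|$) and 1-bit ($f=\mathrm{sign}$) measurements from a Gaussian matrix and from a matrix with a spikier singular-value spectrum. The decay profile, the Frobenius-norm matching, and the max/mean spread ratio are our own illustrative assumptions, not the paper's spikiness definition, and no EP recovery is performed here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 64, 256                      # signal dimension and number of measurements
x = rng.standard_normal(n)          # signal of interest

# Baseline: i.i.d. Gaussian sensing matrix.
A_gauss = rng.standard_normal((m, n)) / np.sqrt(n)

# Illustrative "spiky" design: random singular vectors, a heavy-tailed
# singular-value profile, rescaled so both matrices share the same Frobenius norm.
U, _, Vt = np.linalg.svd(rng.standard_normal((m, n)), full_matrices=False)
s = 1.0 / np.arange(1, n + 1)                       # assumed rapidly decaying spectrum
s *= np.linalg.norm(A_gauss) / np.linalg.norm(s)    # match Frobenius norms
A_spiky = U @ np.diag(s) @ Vt

# The two nonlinearities f discussed in the abstract.
f_phase_retrieval = np.abs     # magnitude-only measurements
f_one_bit = np.sign            # 1-bit compressed sensing

for name, A in [("Gaussian", A_gauss), ("spiky", A_spiky)]:
    y_pr = f_phase_retrieval(A @ x)
    y_1b = f_one_bit(A @ x)
    sv = np.linalg.svd(A, compute_uv=False)
    # Crude spread measure of the spectrum (not the paper's spikiness notion).
    print(f"{name:8s}  max/mean singular value = {sv.max() / sv.mean():6.2f}  "
          f"|y_pr| = {y_pr.shape[0]}  |y_1b| = {y_1b.shape[0]}")
```

Matching the Frobenius norm isolates the shape of the spectrum, which is the quantity the abstract identifies as governing whether EP benefits from spikiness (helpful for phase retrieval, harmful for 1-bit compressed sensing).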
Related papers
- Provable Acceleration of Nesterov's Accelerated Gradient for Rectangular Matrix Factorization and Linear Neural Networks [46.04785603483612]
We prove that Nesterov's accelerated gradient attains a complexity of $O(\kappa\log\frac{1}{\epsilon})$.
In particular, we prove that NAG can also attain an accelerated linear convergence rate.
arXiv Detail & Related papers (2024-10-12T20:33:37Z) - Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization [27.907707931902547]
We investigate two phenomena related to the attention mechanism during the fine-tuning of Large Language Models.
The first phenomenon, termed "Unequal Importance of Attention Matrices," highlights the impact of fine-tuning different weight matrices.
The second phenomenon, "Attention Matrices with Customized Learning Rate Leads to Better Convergence," emphasizes the importance of assigning distinct learning rates.
arXiv Detail & Related papers (2024-10-03T06:37:37Z) - Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch.
We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work.
We also show an $\Omega(k^{2/p} n^{1-2/p})$ lower bound for the sparse recovery problem, which is tight up to a $\mathrm{poly}(\log n)$ factor.
arXiv Detail & Related papers (2024-08-16T02:33:07Z) - Efficient Matrix Factorization Via Householder Reflections [2.3326951882644553]
We show that the exact recovery of the factors $\mathbf{H}$ and $\mathbf{X}$ from $\mathbf{Y}$ is guaranteed with $\Omega$ columns in $\mathbf{Y}$.
We hope the techniques in this work help in developing alternate algorithms for dictionary learning.
arXiv Detail & Related papers (2024-05-13T11:13:49Z) - Provably learning a multi-head attention layer [55.2904547651831]
The multi-head attention layer is one of the key components of the transformer architecture that set it apart from traditional feed-forward models.
In this work, we initiate the study of provably learning a multi-head attention layer from random examples.
We prove computational lower bounds showing that in the worst case, exponential dependence on $m$ is unavoidable.
arXiv Detail & Related papers (2024-02-06T15:39:09Z) - Structured Semidefinite Programming for Recovering Structured
Preconditioners [41.28701750733703]
We give an algorithm which, given positive definite $\mathbf{K} \in \mathbb{R}^{d \times d}$ with $\mathrm{nnz}(\mathbf{K})$ nonzero entries, computes an $\epsilon$-optimal diagonal preconditioner in time.
We attain our results via new algorithms for a class of semidefinite programs we call matrix-dictionary approximation SDPs.
arXiv Detail & Related papers (2023-10-27T16:54:29Z) - A General Algorithm for Solving Rank-one Matrix Sensing [15.543065204102714]
The goal of matrix sensing is to recover a matrix $A_\star \in \mathbb{R}^{n \times n}$, based on a sequence of measurements.
In this paper, we relax that rank-$k$ assumption and solve a much more general matrix sensing problem.
arXiv Detail & Related papers (2023-03-22T04:07:26Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w})=C\,\epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix
Factorization [49.090785356633695]
We study the asymmetric low-rank factorization problem, minimizing over the factors $\mathbf{U} \in \mathbb{R}^{m \times d}$ and $\mathbf{V}$.
arXiv Detail & Related papers (2021-06-27T17:25:24Z) - Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}(\frac{\log N}{\sqrt{N}})$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound with order $\mathcal{O}(N^{-\sqrt{N}})$.
arXiv Detail & Related papers (2021-05-10T13:07:44Z) - Nonparametric Learning of Two-Layer ReLU Residual Units [22.870658194212744]
We describe an algorithm that learns two-layer residual units with rectified linear unit (ReLU) activation.
We design layer-wise objectives as functionals whose analytic minimizers express the exact ground-truth network in terms of its parameters and nonlinearities.
We prove the statistical strong consistency of our algorithm, and demonstrate the robustness and sample efficiency of our algorithm by experiments.
arXiv Detail & Related papers (2020-08-17T22:11:26Z) - Maximizing Determinants under Matroid Constraints [69.25768526213689]
We study the problem of finding a basis $S$ of $M$ such that $\det(\sum_{i \in S} v_i v_i^\top)$ is maximized (a toy greedy sketch of this objective follows this entry).
This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning.
arXiv Detail & Related papers (2020-04-16T19:16:38Z)
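For the matroid-constrained determinant maximization above, the following is a toy sketch of the objective $\det(\sum_{i\in S} v_i v_i^\top)$ under the simplest (uniform, i.e. cardinality) matroid. The naive greedy rule and all names are our own illustrative assumptions, not the paper's algorithm or its guarantees.

```python
import numpy as np

def greedy_logdet(V: np.ndarray, k: int, eps: float = 1e-9):
    """Greedily pick k columns of V (shape d x N) to grow
    log det(sum_{i in S} v_i v_i^T); uniform-matroid (cardinality) case only."""
    d, N = V.shape
    S: list[int] = []
    M = eps * np.eye(d)                          # small ridge keeps early log-dets finite
    for _ in range(k):
        best_i, best_gain = -1, -np.inf
        for i in range(N):
            if i in S:
                continue
            v = V[:, i:i + 1]
            sign, logdet = np.linalg.slogdet(M + v @ v.T)
            if sign > 0 and logdet > best_gain:  # keep the column with the largest gain
                best_i, best_gain = i, logdet
        S.append(best_i)
        M += V[:, best_i:best_i + 1] @ V[:, best_i:best_i + 1].T
    return S, np.linalg.slogdet(M)[1]

rng = np.random.default_rng(1)
V = rng.standard_normal((5, 30))                 # 30 candidate vectors in R^5
S, logdet = greedy_logdet(V, k=8)
print("selected indices:", S, " log-det:", round(float(logdet), 3))
```

A general matroid constraint would replace the simple cardinality check with an independence oracle over the candidate set.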
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.