Towards improving discriminative reconstruction via simultaneous dense
and sparse coding
- URL: http://arxiv.org/abs/2006.09534v2
- Date: Mon, 10 May 2021 05:05:01 GMT
- Title: Towards improving discriminative reconstruction via simultaneous dense
and sparse coding
- Authors: Abiy Tasissa, Emmanouil Theodosis, Bahareh Tolooshams, and Demba Ba
- Abstract summary: Discriminative features extracted from the sparse coding model have been shown to perform well for classification and reconstruction.
We propose a novel dense and sparse coding model that integrates both representation capability and discriminative features.
- Score: 9.87575928269854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Discriminative features extracted from the sparse coding model have been
shown to perform well for classification and reconstruction. Recent deep
learning architectures have further improved reconstruction in inverse problems
by considering new dense priors learned from data. We propose a novel dense and
sparse coding model that integrates both representation capability and
discriminative features. The model considers the problem of recovering a dense
vector $\mathbf{x}$ and a sparse vector $\mathbf{u}$ given measurements of the
form $\mathbf{y} = \mathbf{A}\mathbf{x}+\mathbf{B}\mathbf{u}$. Our first
analysis proposes a natural geometric condition based on the minimal angle
between spanning subspaces corresponding to the measurement matrices
$\mathbf{A}$ and $\mathbf{B}$ to establish the uniqueness of solutions to the
linear system. The second analysis shows that, under mild assumptions, a convex
program recovers the dense and sparse components. We validate the effectiveness
of the proposed model on simulated data and propose a dense and sparse
autoencoder (DenSaE) tailored to learning the dictionaries from the dense and
sparse model. We demonstrate that a) DenSaE denoises natural images better than
architectures derived from the sparse coding model ($\mathbf{B}\mathbf{u}$), b)
in the presence of noise, training the biases in the latter amounts to
implicitly learning the $\mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}$ model, c)
$\mathbf{A}$ and $\mathbf{B}$ capture low- and high-frequency contents,
respectively, and d) compared to the sparse coding model, DenSaE offers a
balance between discriminative power and representation.
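  The abstract describes recovering a dense vector $\mathbf{x}$ and a sparse vector $\mathbf{u}$ from $\mathbf{y} = \mathbf{A}\mathbf{x}+\mathbf{B}\mathbf{u}$ via a convex program. Below is a minimal illustrative sketch (not the authors' code) of one standard instance of that idea: $\ell_1$-penalized least squares solved with a proximal-gradient (ISTA-style) loop, where only $\mathbf{u}$ is soft-thresholded so it stays sparse while $\mathbf{x}$ remains dense. The dimensions, penalty weight, and iteration count are illustrative assumptions, not values from the paper.
```python
# Sketch of dense + sparse recovery for y = A x + B u (assumed formulation):
#   minimize_{x,u}  0.5 * ||y - A x - B u||_2^2 + lam * ||u||_1
import numpy as np

rng = np.random.default_rng(0)
m, n_dense, n_sparse, k = 100, 20, 200, 5   # measurements, dense dim, sparse dim, sparsity (illustrative)

A = rng.standard_normal((m, n_dense)) / np.sqrt(m)
B = rng.standard_normal((m, n_sparse)) / np.sqrt(m)
x_true = rng.standard_normal(n_dense)
u_true = np.zeros(n_sparse)
u_true[rng.choice(n_sparse, k, replace=False)] = rng.standard_normal(k)
y = A @ x_true + B @ u_true

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (induces sparsity)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def recover_dense_sparse(y, A, B, lam=0.05, n_iter=2000):
    """Proximal gradient on the joint variable (x, u); only u is l1-penalized."""
    C = np.hstack([A, B])                    # stacked operator [A B]
    step = 1.0 / np.linalg.norm(C, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    u = np.zeros(B.shape[1])
    for _ in range(n_iter):
        r = A @ x + B @ u - y                # residual
        x = x - step * (A.T @ r)             # plain gradient step on the dense part
        u = soft_threshold(u - step * (B.T @ r), step * lam)  # prox step on the sparse part
    return x, u

x_hat, u_hat = recover_dense_sparse(y, A, B)
print("relative dense error :", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
print("relative sparse error:", np.linalg.norm(u_hat - u_true) / np.linalg.norm(u_true))
```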
Related papers
- Injecting Measurement Information Yields a Fast and Noise-Robust Diffusion-Based Inverse Problem Solver [20.959606647379356]
We propose to estimate the conditional posterior mean $\mathbb{E}[\mathbf{x}_0 \mid \mathbf{x}_t, \mathbf{y}]$. The resulting prediction can be integrated into any standard sampler, yielding a fast and memory-efficient inverse solver.
arXiv Detail & Related papers (2025-08-05T00:01:41Z)
- Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models [65.71506381302815]
We propose to amortize the cost of sampling from a posterior distribution of the form $p(\mathbf{x}\mid\mathbf{y}) \propto p_\theta(\mathbf{x})$.
For many models and constraints of interest, the posterior in the noise space is smoother than the posterior in the data space, making it more amenable to such amortized inference.
arXiv Detail & Related papers (2025-02-10T19:49:54Z)
- Self-Ensembling Gaussian Splatting for Few-Shot Novel View Synthesis [55.561961365113554]
3D Gaussian Splatting (3DGS) has demonstrated remarkable effectiveness for novel view synthesis (NVS).
However, the 3DGS model tends to overfit when trained with sparse posed views, limiting its generalization ability to novel views.
We present a Self-Ensembling Gaussian Splatting (SE-GS) approach to alleviate the overfitting problem.
Our approach improves NVS quality with few-shot training views, outperforming existing state-of-the-art methods.
arXiv Detail & Related papers (2024-10-31T18:43:48Z)
- Compressing Large Language Models using Low Rank and Low Precision Decomposition [46.30918750022739]
This work introduces $\rm CALDERA$ -- a new post-training LLM compression algorithm.
It harnesses the inherent low-rank structure of a weight matrix $\mathbf{W}$ by approximating it via a low-rank, low-precision decomposition.
Results show that compressing LlaMa-2 7B/13B/70B and LlaMa-3 8B models using $\rm CALDERA$ outperforms existing post-training compression techniques.
arXiv Detail & Related papers (2024-05-29T08:42:30Z)
- Provably learning a multi-head attention layer [55.2904547651831]
The multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models.
In this work, we initiate the study of provably learning a multi-head attention layer from random examples.
We prove computational lower bounds showing that in the worst case, exponential dependence on $m$ is unavoidable.
arXiv Detail & Related papers (2024-02-06T15:39:09Z)
- Compressive Recovery of Sparse Precision Matrices [5.557600489035657]
We consider the problem of learning a graph modeling the statistical relations of the $d$ variables from a dataset with $n$ samples $X \in \mathbb{R}^{n \times d}$.
We show that it is possible to estimate it from a sketch of size $m = \Omega\left((d+2k)\log(d)\right)$, where $k$ is the maximal number of edges of the underlying graph.
We investigate the possibility of achieving practical recovery with an iterative algorithm based on the graphical lasso, viewed as a specific denoiser.
arXiv Detail & Related papers (2023-11-08T13:29:08Z)
- Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories [70.90012822736988]
Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to intrinsic data structures.
This paper introduces a relaxed assumption that input data are concentrated around a subset of $\mathbb{R}^d$ denoted by $\mathcal{S}$, and that the intrinsic dimension of $\mathcal{S}$ can be characterized by a new complexity notion -- effective Minkowski dimension.
arXiv Detail & Related papers (2023-06-26T17:13:31Z)
- Diffusion models as plug-and-play priors [98.16404662526101]
We consider the problem of inferring high-dimensional data $\mathbf{x}$ in a model that consists of a prior $p(\mathbf{x})$ and an auxiliary constraint $c(\mathbf{x},\mathbf{y})$.
The structure of diffusion models allows us to perform approximate inference by iterating differentiation through the fixed denoising network enriched with different amounts of noise.
arXiv Detail & Related papers (2022-06-17T21:11:36Z)
- Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
We study a function of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\,\mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z)
- Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements.
We derive an information-theoretic lower bound for the minimax risk under this setting.
We show that our method and upper-bounds can be extended for two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z)
- Universal Regular Conditional Distributions via Probability Measure-Valued Deep Neural Models [3.8073142980733]
We find that any model built using the proposed framework is dense in the space $C(\mathcal{X},\mathcal{P}_1(\mathcal{Y}))$.
The proposed models are also shown to be capable of generically expressing the aleatoric uncertainty present in most randomized machine learning models.
arXiv Detail & Related papers (2021-05-17T11:34:09Z)
- Nonparametric Learning of Two-Layer ReLU Residual Units [22.870658194212744]
We describe an algorithm that learns two-layer residual units with rectified linear unit (ReLU) activation.
We design layer-wise objectives as functionals whose analytic minimizers express the exact ground-truth network in terms of its parameters and nonlinearities.
We prove the statistical strong consistency of our algorithm, and demonstrate the robustness and sample efficiency of our algorithm by experiments.
arXiv Detail & Related papers (2020-08-17T22:11:26Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance; and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian.
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.