PUDLE: Implicit Acceleration of Dictionary Learning by Backpropagation
- URL: http://arxiv.org/abs/2106.00058v1
- Date: Mon, 31 May 2021 18:49:58 GMT
- Title: PUDLE: Implicit Acceleration of Dictionary Learning by Backpropagation
- Authors: Bahareh Tolooshams and Demba Ba
- Abstract summary: This paper offers the first theoretical proof for empirical results through PUDLE, a Provable Unfolded Dictionary LEarning method.
We highlight the impact of loss, unfolding, and backpropagation on convergence.
We complement our findings through synthetic and image denoising experiments.
- Score: 4.081440927534577
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The dictionary learning problem, representing data as a combination of few
atoms, has long stood as a popular method for learning representations in
statistics and signal processing. The most popular dictionary learning
algorithm alternates between sparse coding and dictionary update steps, and a
rich literature has studied its theoretical convergence. The growing popularity
of neurally plausible unfolded sparse coding networks has led to the empirical
finding that backpropagation through such networks performs dictionary
learning. This paper offers the first theoretical proof for these empirical
results through PUDLE, a Provable Unfolded Dictionary LEarning method. We
highlight the impact of loss, unfolding, and backpropagation on convergence. We
discover an implicit acceleration: as a function of unfolding, the
backpropagated gradient converges faster and is more accurate than the gradient
from alternating minimization. We complement our findings through synthetic and
image denoising experiments. The findings support the use of accelerated deep
learning optimizers and unfolded networks for dictionary learning.
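As a rough illustration of the setup the abstract describes, the sketch below (not the authors' released code; the sizes, hyperparameters, and names such as unfold_ista are assumptions made for this example) builds a T-layer unfolded ISTA encoder whose layers share a learnable dictionary W and lets backpropagation through the reconstruction loss update W, rather than holding the sparse codes fixed as alternating minimization would.

```python
# Minimal sketch of dictionary learning by backpropagation through an
# unfolded ISTA encoder (illustrative only; not the PUDLE reference code).
import torch

torch.manual_seed(0)
n, m, p = 20, 30, 200           # signal dim, number of atoms, number of samples
T, lam = 50, 0.1                # unfolding depth, sparsity penalty

# Synthetic data: y = D_true @ x_true with sparse nonnegative codes.
D_true = torch.randn(n, m)
D_true /= D_true.norm(dim=0, keepdim=True)
x_true = torch.relu(torch.randn(m, p)) * (torch.rand(m, p) < 0.1).float()
y = D_true @ x_true

# Learnable dictionary, initialized as a perturbation of the truth.
W = (D_true + 0.3 * torch.randn(n, m)).requires_grad_(True)

def soft_threshold(z, thr):
    return torch.sign(z) * torch.clamp(z.abs() - thr, min=0.0)

def unfold_ista(W, y, T, lam):
    """T unfolded ISTA iterations; W appears in every layer, so autograd
    backpropagates through the whole encoder, not just the final step."""
    step = 1.0 / (torch.linalg.matrix_norm(W, ord=2).detach() ** 2)  # step size treated as a constant
    x = torch.zeros(W.shape[1], y.shape[1])
    for _ in range(T):
        x = soft_threshold(x - step * W.T @ (W @ x - y), step * lam)
    return x

opt = torch.optim.Adam([W], lr=1e-2)
for it in range(300):
    opt.zero_grad()
    x = unfold_ista(W, y, T, lam)
    loss = 0.5 * ((W @ x - y) ** 2).sum() / p    # reconstruction loss
    loss.backward()                              # gradient flows through all T unfoldings
    opt.step()
    with torch.no_grad():                        # keep atoms unit-norm
        W /= W.norm(dim=0, keepdim=True)

print(f"final reconstruction loss: {loss.item():.4f}")
```

The point of the sketch is that loss.backward() differentiates through all T encoder layers, which is the backpropagated gradient the abstract contrasts with the gradient from alternating minimization.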
Related papers
- An Analysis of BPE Vocabulary Trimming in Neural Machine Translation [56.383793805299234]
Vocabulary trimming is a postprocessing step that replaces rare subwords with their component subwords.
We show that vocabulary trimming fails to improve performance and is even prone to incurring heavy degradation.
arXiv Detail & Related papers (2024-03-30T15:29:49Z) - Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic
Interpretability: A Case Study on Othello-GPT [59.245414547751636]
We propose a circuit discovery framework alternative to activation patching.
Our framework suffers less from out-of-distribution issues and is more efficient in terms of complexity.
We dig into a small transformer trained on a synthetic task named Othello and find a number of human-understandable fine-grained circuits inside it.
arXiv Detail & Related papers (2024-02-19T15:04:53Z) - Explainable Trajectory Representation through Dictionary Learning [7.567576186354494]
Trajectory representation learning on a network enhances our understanding of vehicular traffic patterns.
Existing approaches using classic machine learning or deep learning embed trajectories as dense vectors, which lack interpretability.
This paper proposes an explainable trajectory representation learning framework through dictionary learning.
arXiv Detail & Related papers (2023-12-13T10:59:54Z) - Bayesian sparsity and class sparsity priors for dictionary learning and
coding [0.0]
We propose a workflow to facilitate the dictionary matching process.
In this article, we propose a new Bayesian data-driven group sparsity coding method to help identify subdictionaries that are not relevant for the dictionary matching.
The effectiveness of compensating for the dictionary compression error and of using the novel group sparsity promotion to deflate the original dictionary is illustrated.
arXiv Detail & Related papers (2023-09-02T17:54:23Z) - Hiding Data Helps: On the Benefits of Masking for Sparse Coding [22.712098918769243]
We show that in the presence of noise, minimizing the standard dictionary learning objective can fail to recover the elements of the ground-truth dictionary in the over-realized regime.
We propose a novel masking objective for which recovering the ground-truth dictionary is in fact optimal as the signal increases for a large class of data-generating processes.
arXiv Detail & Related papers (2023-02-24T16:16:19Z) - Efficient CNN with uncorrelated Bag of Features pooling [98.78384185493624]
Bag of Features (BoF) has recently been proposed to reduce the complexity of convolution layers.
We propose an approach that builds on top of BoF pooling to boost its efficiency by ensuring that the items of the learned dictionary are non-redundant.
The proposed strategy yields an efficient variant of BoF and further boosts its performance, without any additional parameters.
arXiv Detail & Related papers (2022-09-22T09:00:30Z) - Better Language Model with Hypernym Class Prediction [101.8517004687825]
Class-based language models (LMs) have long been used to address context sparsity in $n$-gram LMs.
In this study, we revisit this approach in the context of neural LMs.
arXiv Detail & Related papers (2022-03-21T01:16:44Z) - Discriminative Dictionary Learning based on Statistical Methods [0.0]
Sparse Representation (SR) of signals or data has a well-founded theory with rigorous mathematical error bounds and proofs.
Training dictionaries such that they represent each class of signals with minimal loss is called Dictionary Learning (DL).
MOD and K-SVD have been successfully used in reconstruction-based applications in image processing, such as image denoising and inpainting.
arXiv Detail & Related papers (2021-11-17T10:45:10Z) - Exact Sparse Orthogonal Dictionary Learning [8.577876545575828]
We find that our method can achieve better denoising results than over-complete dictionary-based learning methods.
Our method has the additional advantage of high efficiency.
arXiv Detail & Related papers (2021-03-14T07:51:32Z) - DeepReduce: A Sparse-tensor Communication Framework for Distributed Deep
Learning [79.89085533866071]
This paper introduces DeepReduce, a versatile framework for the compressed communication of sparse tensors.
DeepReduce decomposes tensors into two sets, values and indices, and allows both independent and combined compression of these sets (a minimal sketch of this split appears after this list).
Our experiments with large real models demonstrate that DeepReduce transmits fewer data and imposes lower computational overhead than existing methods.
arXiv Detail & Related papers (2021-02-05T11:31:24Z) - When Dictionary Learning Meets Deep Learning: Deep Dictionary Learning
and Coding Network for Image Recognition with Limited Data [74.75557280245643]
We present a new Deep Dictionary Learning and Coding Network (DDLCN) for image recognition tasks with limited data.
We empirically compare DDLCN with several leading dictionary learning methods and deep learning models.
Experimental results on five popular datasets show that DDLCN achieves competitive results compared with state-of-the-art methods when the training data is limited.
arXiv Detail & Related papers (2020-05-21T23:12:10Z)
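To make the value/index split mentioned for DeepReduce concrete, here is a minimal sketch (not the DeepReduce implementation; the delta-encoding of indices and the float16 cast of values are placeholder compressors chosen for illustration) of decomposing a sparsified gradient into an index set and a value set that are compressed independently and recombined on the receiver side.

```python
# Minimal sketch of splitting a sparse gradient tensor into value and index
# sets that can be compressed independently (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
grad = rng.standard_normal(10_000).astype(np.float32)
grad[rng.random(grad.size) < 0.99] = 0.0           # ~1% nonzeros after top-k style sparsification

# Decompose: indices (int32) and values (float32) are handled separately.
indices = np.flatnonzero(grad).astype(np.int32)
values = grad[indices]

# Independent compression of each set: delta-encode the sorted indices and
# quantize the values to float16. Real compressors would go further.
index_deltas = np.diff(indices, prepend=0).astype(np.int32)
values_fp16 = values.astype(np.float16)

# Receiver side: undo both transforms and scatter back into a dense tensor.
recovered_idx = np.cumsum(index_deltas)
recovered = np.zeros_like(grad)
recovered[recovered_idx] = values_fp16.astype(np.float32)

print("max reconstruction error:", np.abs(recovered - grad).max())
```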