On the convergence of group-sparse autoencoders
- URL: http://arxiv.org/abs/2102.07003v1
- Date: Sat, 13 Feb 2021 21:17:07 GMT
- Title: On the convergence of group-sparse autoencoders
- Authors: Emmanouil Theodosis, Bahareh Tolooshams, Pranay Tankala, Abiy Tasissa,
Demba Ba
- Abstract summary: We introduce and study a group-sparse autoencoder that accounts for a variety of generative models.
For clustering models, inputs that result in the same group of active units belong to the same cluster.
In this setting, we theoretically prove the convergence of the network parameters to a neighborhood of the generating matrix.
- Score: 9.393652136001732
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent approaches in the theoretical analysis of model-based deep learning
architectures have studied the convergence of gradient descent in shallow ReLU
networks that arise from generative models whose hidden layers are sparse.
Motivated by the success of architectures that impose structured forms of
sparsity, we introduce and study a group-sparse autoencoder that accounts for a
variety of generative models, and utilizes a group-sparse ReLU activation
function to force the non-zero units at a given layer to occur in blocks. For
clustering models, inputs that result in the same group of active units belong
to the same cluster. We proceed to analyze the gradient dynamics of a shallow
instance of the proposed autoencoder, trained with data adhering to a
group-sparse generative model. In this setting, we theoretically prove the
convergence of the network parameters to a neighborhood of the generating
matrix. We validate our model through numerical analysis and highlight the
superior performance of networks with a group-sparse ReLU compared to networks
that utilize traditional ReLUs, both in sparse coding and in parameter recovery
tasks. We also provide real data experiments to corroborate the simulated
results, and emphasize the clustering capabilities of structured sparsity
models.
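The abstract's central mechanism, an activation that forces non-zero units to occur in blocks, is easy to illustrate. Below is a minimal NumPy sketch of a shallow tied-weight autoencoder with a block-thresholding activation. It is an assumption-laden illustration, not the authors' implementation: the exact form of the paper's group-sparse ReLU, the group partition, and the threshold value `bias` are all stand-ins.

```python
import numpy as np

def group_sparse_relu(z, groups, bias):
    """Block-thresholding activation (illustrative sketch, not the
    paper's exact operator). A group of hidden units stays active only
    if its l2 norm exceeds the threshold `bias`; surviving groups are
    shrunk toward zero, so non-zeros occur in blocks, not individually."""
    out = np.zeros_like(z)
    for g in groups:                 # g: index array for one block of units
        norm = np.linalg.norm(z[g])
        if norm > bias:              # the whole block switches on or off
            out[g] = z[g] * (1.0 - bias / norm)
    return out

def shallow_autoencoder(x, W, groups, bias):
    """One forward pass with tied weights: encode with W^T, apply the
    group-sparse activation, decode with W."""
    code = group_sparse_relu(W.T @ x, groups, bias)
    return W @ code, code

# Toy usage: 12 hidden units split into 4 groups of 3. Under the
# clustering interpretation above, inputs that activate the same
# group would belong to the same cluster.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 12))
W /= np.linalg.norm(W, axis=0)       # unit-norm dictionary columns
groups = [np.arange(3 * k, 3 * (k + 1)) for k in range(4)]
x = rng.standard_normal(8)
x_hat, code = shallow_autoencoder(x, W, groups, bias=0.5)
print(np.nonzero(code)[0])           # active indices come in whole blocks
```

Printing the support of `code` shows the block structure: either all or none of a group's indices appear, which is the pattern the paper's convergence analysis assumes for data from a group-sparse generative model.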
Related papers
- Hierarchical Clustering for Conditional Diffusion in Image Generation [12.618079575423868]
This paper introduces TreeDiffusion, a deep generative model that conditions Diffusion Models on hierarchical clusters to obtain high-quality, cluster-specific generations.
The proposed pipeline consists of two steps: a VAE-based clustering model that learns the hierarchical structure of the data, and a conditional diffusion model that generates realistic images for each cluster.
arXiv Detail & Related papers (2024-10-22T11:35:36Z)
- Generalization and Estimation Error Bounds for Model-based Neural Networks [78.88759757988761]
We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks.
We derive practical design rules for constructing model-based networks with guaranteed high generalization.
arXiv Detail & Related papers (2023-04-19T16:39:44Z)
- Learning Coherent Clusters in Weakly-Connected Network Systems [7.766921168069532]
We propose a structure-preserving modeling methodology for large-scale dynamic networks with tightly connected components.
We provide an upper bound on the approximation error when the network graph is randomly generated from a weighted block model.
arXiv Detail & Related papers (2022-11-28T13:32:25Z)
- Federated Learning Aggregation: New Robust Algorithms with Guarantees [63.96013144017572]
Federated learning has recently been proposed for distributed model training at the edge.
This paper presents a complete general mathematical convergence analysis to evaluate aggregation strategies in a federated learning framework.
We derive novel aggregation algorithms that can modify their model architecture by differentiating client contributions according to the value of their losses.
arXiv Detail & Related papers (2022-05-22T16:37:53Z)
- Enhancing Latent Space Clustering in Multi-filter Seq2Seq Model: A Reinforcement Learning Approach [0.0]
We design a latent-enhanced multi-filter seq2seq model (LMS2S) that analyzes the latent space representations using a clustering algorithm.
Our experiments on semantic parsing and machine translation demonstrate the positive correlation between the clustering quality and the model's performance.
arXiv Detail & Related papers (2021-09-25T16:36:31Z)
- Deep adaptive fuzzy clustering for evolutionary unsupervised representation learning [2.8028128734158164]
Cluster assignment of large and complex images is a crucial but challenging task in pattern recognition and computer vision.
We present a novel evolutionary unsupervised learning representation model with iterative optimization.
We jointly apply fuzzy clustering to the deep reconstruction model, in which fuzzy membership is utilized to represent a clear structure of deep cluster assignments.
arXiv Detail & Related papers (2021-03-31T13:58:10Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- LieTransformer: Equivariant self-attention for Lie Groups [49.9625160479096]
Group equivariant neural networks are used as building blocks of group invariant neural networks.
We extend the scope of the literature to self-attention, which is emerging as a prominent building block of deep learning models.
We propose the LieTransformer, an architecture composed of LieSelfAttention layers that are equivariant to arbitrary Lie groups and their discrete subgroups.
arXiv Detail & Related papers (2020-12-20T11:02:49Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where the time-dependent parameters of the main flow evolve according to a matrix flow on the group O(d).
This nested system of two flows provides stability and effectiveness of training and provably solves the gradient vanishing-explosion problem.
arXiv Detail & Related papers (2020-06-19T22:05:19Z)
- Hierarchical regularization networks for sparsification based learning on noisy datasets [0.0]
The hierarchy follows from approximation spaces identified at successively finer scales.
To promote model generalization at each scale, we also introduce a novel projection-based penalty operator across multiple dimensions.
Results show the performance of the approach as a data reduction and modeling strategy on both synthetic and real datasets.
arXiv Detail & Related papers (2020-06-09T18:32:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.