Non-linear, Sparse Dimensionality Reduction via Path Lasso Penalized
Autoencoders
- URL: http://arxiv.org/abs/2102.10873v1
- Date: Mon, 22 Feb 2021 10:14:46 GMT
- Title: Non-linear, Sparse Dimensionality Reduction via Path Lasso Penalized
Autoencoders
- Authors: Oskar Allerbo, Rebecka Jörnsten
- Abstract summary: We present path lasso penalized autoencoders for complex data structures.
Our algorithm uses a group lasso penalty and non-negative matrix factorization to construct a sparse, non-linear latent representation.
We show that the algorithm exhibits much lower reconstruction errors than sparse PCA and parameter-wise lasso regularized autoencoders for low-dimensional representations.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-dimensional data sets are often analyzed and explored via the
construction of a latent low-dimensional space which enables convenient
visualization and efficient predictive modeling or clustering. For complex data
structures, linear dimensionality reduction techniques like PCA may not be
sufficiently flexible to enable low-dimensional representation. Non-linear
dimension reduction techniques, like kernel PCA and autoencoders, suffer from
loss of interpretability, since each latent variable depends on all input
dimensions. To address this limitation, we here present path lasso penalized
autoencoders. This structured regularization enhances interpretability by
penalizing each path through the encoder from an input to a latent variable,
thus restricting how many input variables are represented in each latent
dimension. Our algorithm uses a group lasso penalty and non-negative matrix
factorization to construct a sparse, non-linear latent representation. We
compare the path lasso regularized autoencoder to PCA, sparse PCA, autoencoders
and sparse autoencoders on real and simulated data sets. We show that the
algorithm exhibits much lower reconstruction errors than sparse PCA and
parameter-wise lasso regularized autoencoders for low-dimensional
representations. Moreover, path lasso representations provide a more accurate
reconstruction match, i.e. better preservation of the relative distances between
objects in the original and reconstructed spaces.
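As a rough illustration of the path penalty, the PyTorch snippet below aggregates the absolute weights along every input-to-latent path of a one-hidden-layer encoder and applies an l1-style penalty to the result. This is a minimal sketch under assumptions, not the authors' algorithm (which combines a group lasso penalty with non-negative matrix factorization); the names PathLassoAE and path_penalty are hypothetical.

```python
import torch
import torch.nn as nn

class PathLassoAE(nn.Module):
    """Autoencoder with a path-based sparsity penalty on the encoder.

    Illustrative sketch only: the aggregated path-weight l1 term below
    stands in for the paper's group lasso / NMF construction.
    """

    def __init__(self, d_in: int, d_hidden: int, d_latent: int):
        super().__init__()
        self.enc1 = nn.Linear(d_in, d_hidden)
        self.enc2 = nn.Linear(d_hidden, d_latent)
        self.dec = nn.Sequential(
            nn.Linear(d_latent, d_hidden), nn.Tanh(), nn.Linear(d_hidden, d_in)
        )

    def forward(self, x: torch.Tensor):
        z = self.enc2(torch.tanh(self.enc1(x)))
        return self.dec(z), z

    def path_penalty(self) -> torch.Tensor:
        # |W2| @ |W1| has shape (d_latent, d_in); entry (j, i) aggregates
        # the absolute weights of all paths from input i to latent j.
        paths = self.enc2.weight.abs() @ self.enc1.weight.abs()
        return paths.sum()

# Usage sketch: loss = mse(x_hat, x) + lam * model.path_penalty().
# Driving a path weight to zero removes input i from latent dimension j,
# which is what restricts how many inputs each latent variable sees.
```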
Related papers
- Differentiable VQ-VAE's for Robust White Matter Streamline Encodings [33.936125620525]
Autoencoders have been proposed as a dimension-reduction tool to simplify the analysis of streamlines in a low-dimensional latent space.
We propose a novel Differentiable Vector Quantized Variational Autoencoder, which ingests entire bundles of streamlines as single data points.
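For context on the "differentiable" part, here is the standard nearest-codebook quantization with a straight-through gradient used by VQ-VAEs; this is a generic sketch (codebook and commitment losses omitted), not the streamline-bundle model itself.

```python
import torch

def quantize(z_e: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """Nearest-codebook lookup with a straight-through gradient.

    z_e: (n, d) encoder outputs; codebook: (K, d) code vectors.
    Generic VQ-VAE sketch, not the paper's specific architecture.
    """
    d = torch.cdist(z_e, codebook)       # (n, K) distances to codes
    z_q = codebook[d.argmin(dim=1)]      # nearest code per input
    # Straight-through: forward pass uses z_q, gradients flow to z_e.
    return z_e + (z_q - z_e).detach()
```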
arXiv Detail & Related papers (2023-11-10T17:59:43Z)
- Fundamental Limits of Two-layer Autoencoders, and Achieving Them with
Gradient Methods [91.54785981649228]
This paper focuses on non-linear two-layer autoencoders trained in the challenging proportional regime.
Our results characterize the minimizers of the population risk, and show that such minimizers are achieved by gradient methods.
For the special case of a sign activation function, our analysis establishes the fundamental limits for the lossy compression of Gaussian sources via (shallow) autoencoders.
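As we read this summary, the object of study is a two-layer autoencoder of the assumed form

$$
\hat{x} = B\,\sigma(Ax), \qquad A \in \mathbb{R}^{k \times d},\; B \in \mathbb{R}^{d \times k},
$$

trained in the proportional regime where $k/d$ converges to a constant; the sign-activation special case takes $\sigma = \mathrm{sign}$.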
arXiv Detail & Related papers (2022-12-27T12:37:34Z)
- Convergent autoencoder approximation of low bending and low distortion
manifold embeddings [5.5711773076846365]
We propose and analyze a novel regularization for learning the encoder component of an autoencoder.
The loss functional is computed via Monte Carlo integration with different sampling strategies for pairs of points on the input manifold.
Our main theorem identifies a loss functional of the embedding map as the $\Gamma$-limit of the sampling-dependent loss functionals.
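A minimal sketch of the Monte-Carlo-over-point-pairs idea (assuming a generic distortion-style term for illustration; the paper's low-bending and low-distortion functional is more involved):

```python
import torch

def pairwise_distortion_loss(encoder, x: torch.Tensor, n_pairs: int = 256):
    """Monte Carlo estimate of a distortion-style regularizer.

    Samples random pairs of points and compares input-space and
    latent-space distances; illustrative stand-in, not the paper's loss.
    """
    i = torch.randint(0, x.size(0), (n_pairs,))
    j = torch.randint(0, x.size(0), (n_pairs,))
    dx = (x[i] - x[j]).norm(dim=1)                    # input distances
    dz = (encoder(x[i]) - encoder(x[j])).norm(dim=1)  # latent distances
    return ((dz - dx) ** 2).mean()
```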
arXiv Detail & Related papers (2022-08-22T10:31:31Z)
- Intrinsic dimension estimation for discrete metrics [65.5438227932088]
In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces.
We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting.
This suggests that evolutionary pressure acts on a low-dimensional manifold despite the high dimensionality of sequence space.
arXiv Detail & Related papers (2022-07-20T06:38:36Z)
- Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural network that can be used to solve various tasks.
We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation.
We tested our approach across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion-MNIST.
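One simple decorrelation-style penalty of the kind described (an assumed form, the paper's scheme may differ) drives the off-diagonal entries of the bottleneck feature correlation matrix toward zero:

```python
import torch

def redundancy_penalty(z: torch.Tensor) -> torch.Tensor:
    """Penalize correlated bottleneck features (assumed sketch).

    z: (batch, d_latent) bottleneck activations.
    """
    z = (z - z.mean(dim=0)) / (z.std(dim=0) + 1e-8)  # standardize features
    c = (z.T @ z) / z.size(0)                        # correlation matrix
    off_diag = c - torch.diag(torch.diag(c))         # zero the diagonal
    return (off_diag ** 2).sum()
```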
arXiv Detail & Related papers (2022-02-09T18:48:02Z)
- Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via
GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for the semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved comparable performance to pure data-driven networks but using far fewer parameters.
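The general pattern of algorithm unfolding, sketched generically below (the GDPA-based classifier itself is far more specific), is to implement each iteration of a model-based update as a neural layer with learned parameters:

```python
import torch
import torch.nn as nn

class UnrolledNet(nn.Module):
    """Generic algorithm-unfolding sketch: n_layers learned iterations.

    Illustrative only; it unrolls a plain fixed-point-style update, not
    the projection-free SDR solver described in the paper.
    """

    def __init__(self, d: int, n_layers: int = 5):
        super().__init__()
        self.steps = nn.ModuleList([nn.Linear(d, d) for _ in range(n_layers)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for step in self.steps:
            x = x - step(x)  # one learned "iteration" of the algorithm
        return x
```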
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
- Empirical comparison between autoencoders and traditional dimensionality
reduction methods [1.9290392443571387]
We evaluate the performance of PCA compared to Isomap, a deep autoencoder, and a variational autoencoder.
Experiments revealed that k-NN achieved comparable accuracy on PCA and on both autoencoders' projections, provided the projection dimension was large enough.
arXiv Detail & Related papers (2021-03-08T16:26:43Z)
- Metalearning: Sparse Variable-Structure Automata [0.0]
We propose a metalearning approach to increase, on the fly, the number of basis vectors used in dynamic sparse coding.
An actor-critic algorithm is deployed to automatically choose an appropriate feature dimension for the required level of accuracy.
arXiv Detail & Related papers (2021-01-30T21:32:23Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature
Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
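Objectives in this family typically take the assumed form (the paper's exact formulation may differ)

$$
\min_{W} \; \lVert X - XWW^{\top} \rVert_F^2 + \lambda \lVert W \rVert_{2,p},
\qquad
\lVert W \rVert_{2,p} = \Big( \sum_{i=1}^{d} \lVert w^{i} \rVert_2^{p} \Big)^{1/p},
$$

where $w^{i}$ is the $i$-th row of $W$; small $p$ drives whole rows to zero, so the surviving rows select features.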
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications, for the learned representations, of a VAE failing to consistently encode samples generated from its own decoder, and also the consequences of fixing this by introducing a notion of self-consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
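A minimal sketch of a self-consistency penalty of the kind described (assumed form; encode and decode are hypothetical callables standing in for the trained VAE networks):

```python
import torch

def self_consistency_loss(encode, decode, z: torch.Tensor) -> torch.Tensor:
    """Samples decoded from a latent code z should encode back to
    (approximately) the same z. Assumed sketch, not the paper's loss.
    """
    z_cycle = encode(decode(z))      # encode the VAE's own generations
    return ((z_cycle - z) ** 2).mean()
```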
arXiv Detail & Related papers (2020-12-07T14:16:14Z)
- Autoencoder Image Interpolation by Shaping the Latent Space [12.482988592988868]
Autoencoders represent an effective approach for computing the underlying factors characterizing datasets of different types.
We propose a regularization technique that shapes the latent representation to follow a manifold consistent with the training images.
arXiv Detail & Related papers (2020-08-04T12:32:54Z)