It's Enough: Relaxing Diagonal Constraints in Linear Autoencoders for
Recommendation
- URL: http://arxiv.org/abs/2305.12922v1
- Date: Mon, 22 May 2023 11:09:49 GMT
- Title: It's Enough: Relaxing Diagonal Constraints in Linear Autoencoders for
Recommendation
- Authors: Jaewan Moon, Hye-young Kim, and Jongwuk Lee
- Abstract summary: This paper aims to theoretically understand the properties of two terms in linear autoencoders.
We propose simple-yet-effective linear autoencoder models using diagonal inequality constraints, called Relaxed Linear AutoEncoder (RLAE) and Relaxed Denoising Linear AutoEncoder (RDLAE).
Experimental results demonstrate that our models are comparable or superior to state-of-the-art linear and non-linear models on six benchmark datasets.
- Score: 4.8802420827610025
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Linear autoencoder models learn an item-to-item weight matrix via convex
optimization with L2 regularization and zero-diagonal constraints. Despite
their simplicity, they have shown remarkable performance compared to
sophisticated non-linear models. This paper aims to theoretically understand
the properties of two terms in linear autoencoders. Through the lens of
singular value decomposition (SVD) and principal component analysis (PCA), it
is revealed that L2 regularization enhances the impact of high-ranked PCs.
Meanwhile, zero-diagonal constraints reduce the impact of low-ranked PCs,
leading to performance degradation for unpopular items. Inspired by this
analysis, we propose simple-yet-effective linear autoencoder models using
diagonal inequality constraints, called Relaxed Linear AutoEncoder (RLAE) and
Relaxed Denoising Linear AutoEncoder (RDLAE). We prove that they generalize
linear autoencoders by adjusting the degree of diagonal constraints.
Experimental results demonstrate that our models are comparable or superior to
state-of-the-art linear and non-linear models on six benchmark datasets; they
significantly improve the accuracy of long-tail items. These results also
support our theoretical insights on regularization and diagonal constraints in
linear autoencoders.
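For context, the zero-diagonal ridge objective described above (the EASE-style baseline that this paper relaxes) has a well-known closed-form solution, and replacing the equality constraint diag(B) = 0 with an inequality diag(B) <= xi only changes how the per-item Lagrange multipliers are chosen. The numpy sketch below illustrates both cases under that reading of the abstract; the function name, hyperparameter values, and toy data are assumptions for illustration, and the denoising variant (RDLAE) is not covered.

```python
import numpy as np

def linear_autoencoder(X, lam=50.0, xi=None):
    """Closed-form item-to-item weights for a linear autoencoder.

    X   : (num_users, num_items) implicit-feedback matrix.
    lam : L2 regularization strength.
    xi  : None  -> zero-diagonal equality constraint (EASE-style);
          float -> relaxed inequality constraint diag(B) <= xi.
    """
    n_items = X.shape[1]
    G = X.T @ X                                   # item-item Gram matrix
    P = np.linalg.inv(G + lam * np.eye(n_items))
    diag_P = np.diag(P)

    if xi is None:
        # diag(B) = 0:  B = I - P @ diag(1 / diag(P))
        gamma = 1.0 / diag_P
    else:
        # diag(B) <= xi (KKT): the multiplier is active only for items whose
        # unconstrained diagonal 1 - lam * P_jj would exceed xi.
        gamma = np.maximum(lam, (1.0 - xi) / diag_P)

    return np.eye(n_items) - P * gamma            # P * gamma == P @ diag(gamma)

# Toy usage with synthetic data (shapes chosen only for illustration).
rng = np.random.default_rng(0)
X = (rng.random((100, 20)) < 0.1).astype(float)
B_zero    = linear_autoencoder(X)                 # diagonal forced to (numerically) zero
B_relaxed = linear_autoencoder(X, xi=0.05)        # diagonal capped at 0.05
scores = X @ B_relaxed                            # item scores for recommendation
```

Setting xi = 0 in this sketch recovers the zero-diagonal solution, consistent with the abstract's claim that the relaxed models generalize linear autoencoders by adjusting the degree of the diagonal constraints.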
Related papers
- DSTC: Direct Preference Learning with Only Self-Generated Tests and Code to Improve Code LMs [56.24431208419858]
We introduce Direct Preference Learning with Only Self-Generated Tests and Code (DSTC).
DSTC uses only self-generated code snippets and tests to construct reliable preference pairs.
arXiv Detail & Related papers (2024-11-20T02:03:16Z) - Breaking the Low-Rank Dilemma of Linear Attention [61.55583836370135]
Linear attention provides a far more efficient solution by reducing the complexity to linear levels.
Our experiments indicate that linear attention's performance drop relative to Softmax attention is due to the low-rank nature of its feature map.
We introduce Rank-Augmented Linear Attention (RALA), which rivals the performance of Softmax attention while maintaining linear complexity and high efficiency.
arXiv Detail & Related papers (2024-11-12T08:30:59Z) - Implicit ZCA Whitening Effects of Linear Autoencoders for Recommendation [10.374400063702392]
We show a connection between a linear autoencoder model and ZCA whitening for recommendation data (a short whitening sketch appears after this list).
We also show the correctness of applying a linear autoencoder to low-dimensional item vectors obtained using embedding methods.
arXiv Detail & Related papers (2023-08-15T07:58:22Z) - Fundamental Limits of Two-layer Autoencoders, and Achieving Them with
Gradient Methods [91.54785981649228]
This paper focuses on non-linear two-layer autoencoders trained in the challenging proportional regime, where the input dimension grows in proportion to the size of the representation.
Our results characterize the minimizers of the population risk, and show that such minimizers are achieved by gradient methods.
For the special case of a sign activation function, our analysis establishes the fundamental limits for the lossy compression of Gaussian sources via (shallow) autoencoders.
arXiv Detail & Related papers (2022-12-27T12:37:34Z) - Sketching as a Tool for Understanding and Accelerating Self-attention
for Long Sequences [52.6022911513076]
Transformer-based models are not efficient in processing long sequences due to the quadratic space and time complexity of the self-attention modules.
Linformer and Informer reduce the quadratic complexity to linear (modulo logarithmic factors) via low-dimensional projection and row selection, respectively.
Based on the theoretical analysis, we propose Skeinformer to accelerate self-attention and further improve the accuracy of matrix approximation to self-attention.
arXiv Detail & Related papers (2021-12-10T06:58:05Z) - On the Regularization of Autoencoders [14.46779433267854]
We show that the unsupervised setting by itself induces strong additional regularization, i.e., a severe reduction in the model-capacity of the learned autoencoder.
We derive that a deep nonlinear autoencoder cannot fit the training data more accurately than a linear autoencoder does if both models have the same dimensionality in their last layer.
We demonstrate that the linear autoencoder is an accurate approximation of the non-linear one across all model ranks in our experiments on three well-known data sets.
arXiv Detail & Related papers (2021-10-21T18:28:25Z) - Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via
GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers and achieved comparable performance to pure data-driven networks while using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z) - LQF: Linear Quadratic Fine-Tuning [114.3840147070712]
We present the first method for linearizing a pre-trained model that achieves comparable performance to non-linear fine-tuning.
LQF consists of simple modifications to the architecture, loss function and optimization typically used for classification.
arXiv Detail & Related papers (2020-12-21T06:40:20Z) - An autoencoder-based reduced-order model for eigenvalue problems with
application to neutron diffusion [0.0]
Using an autoencoder for dimensionality reduction, this paper presents a novel projection-based reduced-order model for eigenvalue problems.
Reduced-order modelling relies on finding suitable basis functions which define a low-dimensional space in which a high-dimensional system is approximated.
arXiv Detail & Related papers (2020-08-15T16:52:26Z)
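On the Implicit ZCA Whitening entry above: ZCA whitening itself is a standard linear transform, so a minimal sketch may help make the stated connection to linear autoencoders concrete. The uncentered item covariance, the eps value, and the toy data below are illustrative assumptions, not the cited paper's exact formulation.

```python
import numpy as np

def zca_whitening_matrix(X, eps=1e-3):
    """Symmetric (ZCA) whitening matrix for the item columns of X.

    X   : (num_users, num_items) interaction matrix.
    eps : small ridge added to the eigenvalues for numerical stability,
          playing a role analogous to L2 regularization.
    """
    C = (X.T @ X) / X.shape[0]                        # item-item covariance (uncentered, for brevity)
    s, U = np.linalg.eigh(C)                          # C = U @ diag(s) @ U.T
    return U @ np.diag(1.0 / np.sqrt(s + eps)) @ U.T  # (C + eps*I)^{-1/2}

rng = np.random.default_rng(0)
X = (rng.random((100, 20)) < 0.1).astype(float)
X_white = X @ zca_whitening_matrix(X)                 # decorrelated item representation
```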
This list is automatically generated from the titles and abstracts of the papers in this site.