It's Enough: Relaxing Diagonal Constraints in Linear Autoencoders for
Recommendation
- URL: http://arxiv.org/abs/2305.12922v1
- Date: Mon, 22 May 2023 11:09:49 GMT
- Title: It's Enough: Relaxing Diagonal Constraints in Linear Autoencoders for
Recommendation
- Authors: Jaewan Moon, Hye-young Kim, and Jongwuk Lee
- Abstract summary: This paper aims to theoretically understand the properties of two terms in linear autoencoders.
We propose simple-yet-effective linear autoencoder models using diagonal inequality constraints, called Relaxed Linear AutoEncoder (RLAE) and Relaxed Denoising Linear AutoEncoder (RDLAE).
Experimental results demonstrate that our models are comparable or superior to state-of-the-art linear and non-linear models on six benchmark datasets.
- Score: 4.8802420827610025
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Linear autoencoder models learn an item-to-item weight matrix via convex
optimization with L2 regularization and zero-diagonal constraints. Despite
their simplicity, they have shown remarkable performance compared to
sophisticated non-linear models. This paper aims to theoretically understand
the properties of two terms in linear autoencoders. Through the lens of
singular value decomposition (SVD) and principal component analysis (PCA), it
is revealed that L2 regularization enhances the impact of high-ranked PCs.
Meanwhile, zero-diagonal constraints reduce the impact of low-ranked PCs,
leading to performance degradation for unpopular items. Inspired by this
analysis, we propose simple-yet-effective linear autoencoder models using
diagonal inequality constraints, called Relaxed Linear AutoEncoder (RLAE) and
Relaxed Denoising Linear AutoEncoder (RDLAE). We prove that they generalize
linear autoencoders by adjusting the degree of diagonal constraints.
Experimental results demonstrate that our models are comparable or superior to
state-of-the-art linear and non-linear models on six benchmark datasets; they
significantly improve the accuracy of long-tail items. These results also
support our theoretical insights on regularization and diagonal constraints in
linear autoencoders.
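For context, the zero-diagonal ridge objective described above (the EASE-style baseline that this paper relaxes) has a well-known closed-form solution, and replacing the equality constraint diag(B) = 0 with an inequality diag(B) <= xi only changes how the per-item Lagrange multipliers are chosen. The numpy sketch below illustrates both cases under that reading of the abstract; the function name, hyperparameter values, and toy data are assumptions for illustration, and the denoising variant (RDLAE) is not covered.

```python
import numpy as np

def linear_autoencoder(X, lam=50.0, xi=None):
    """Closed-form item-to-item weights for a linear autoencoder.

    X   : (num_users, num_items) implicit-feedback matrix.
    lam : L2 regularization strength.
    xi  : None  -> zero-diagonal equality constraint (EASE-style);
          float -> relaxed inequality constraint diag(B) <= xi.
    """
    n_items = X.shape[1]
    G = X.T @ X                                   # item-item Gram matrix
    P = np.linalg.inv(G + lam * np.eye(n_items))
    diag_P = np.diag(P)

    if xi is None:
        # diag(B) = 0:  B = I - P @ diag(1 / diag(P))
        gamma = 1.0 / diag_P
    else:
        # diag(B) <= xi (KKT): the multiplier is active only for items whose
        # unconstrained diagonal 1 - lam * P_jj would exceed xi.
        gamma = np.maximum(lam, (1.0 - xi) / diag_P)

    return np.eye(n_items) - P * gamma            # P * gamma == P @ diag(gamma)

# Toy usage with synthetic data (shapes chosen only for illustration).
rng = np.random.default_rng(0)
X = (rng.random((100, 20)) < 0.1).astype(float)
B_zero    = linear_autoencoder(X)                 # diagonal forced to (numerically) zero
B_relaxed = linear_autoencoder(X, xi=0.05)        # diagonal capped at 0.05
scores = X @ B_relaxed                            # item scores for recommendation
```

Setting xi = 0 in this sketch recovers the zero-diagonal solution, consistent with the abstract's claim that the relaxed models generalize linear autoencoders by adjusting the degree of the diagonal constraints.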
Related papers
- DSTC: Direct Preference Learning with Only Self-Generated Tests and Code to Improve Code LMs [56.24431208419858]
We introduce Direct Preference Learning with Only Self-Generated Tests and Code (DSTC).
DSTC uses only self-generated code snippets and tests to construct reliable preference pairs.
arXiv Detail & Related papers (2024-11-20T02:03:16Z) - Breaking the Low-Rank Dilemma of Linear Attention [61.55583836370135]
Linear attention provides a far more efficient solution by reducing the complexity to linear levels.
Our experiments indicate that linear attention's performance drop relative to Softmax attention is due to the low-rank nature of its feature map.
We introduce Rank-Augmented Linear Attention (RALA), which rivals the performance of Softmax attention while maintaining linear complexity and high efficiency.
arXiv Detail & Related papers (2024-11-12T08:30:59Z) - Implicit ZCA Whitening Effects of Linear Autoencoders for Recommendation [10.374400063702392]
We show a connection between a linear autoencoder model and ZCA whitening for recommendation data (a short whitening sketch appears after this list).
We also show the correctness of applying a linear autoencoder to low-dimensional item vectors obtained using embedding methods.
arXiv Detail & Related papers (2023-08-15T07:58:22Z) - Fundamental Limits of Two-layer Autoencoders, and Achieving Them with
Gradient Methods [91.54785981649228]
This paper focuses on non-linear two-layer autoencoders trained in the challenging proportional regime, where the input dimension grows in proportion to the size of the representation.
Our results characterize the minimizers of the population risk, and show that such minimizers are achieved by gradient methods.
For the special case of a sign activation function, our analysis establishes the fundamental limits for the lossy compression of Gaussian sources via (shallow) autoencoders.
arXiv Detail & Related papers (2022-12-27T12:37:34Z) - Sketching as a Tool for Understanding and Accelerating Self-attention
for Long Sequences [52.6022911513076]
Transformer-based models are not efficient in processing long sequences due to the quadratic space and time complexity of the self-attention modules.
Linformer and Informer reduce the quadratic complexity to linear (modulo logarithmic factors) via low-dimensional projection and row selection, respectively.
Based on the theoretical analysis, we propose Skeinformer to accelerate self-attention and further improve the accuracy of matrix approximation to self-attention.
arXiv Detail & Related papers (2021-12-10T06:58:05Z) - On the Regularization of Autoencoders [14.46779433267854]
We show that the unsupervised setting by itself induces strong additional regularization, i.e., a severe reduction in the model-capacity of the learned autoencoder.
We derive that a deep nonlinear autoencoder cannot fit the training data more accurately than a linear autoencoder does if both models have the same dimensionality in their last layer.
We demonstrate that the linear autoencoder is an accurate approximation of the non-linear one across all model ranks in our experiments on three well-known data sets.
arXiv Detail & Related papers (2021-10-21T18:28:25Z) - Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via
GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers and achieved comparable performance to pure data-driven networks while using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z) - LQF: Linear Quadratic Fine-Tuning [114.3840147070712]
We present the first method for linearizing a pre-trained model that achieves comparable performance to non-linear fine-tuning.
LQF consists of simple modifications to the architecture, loss function and optimization typically used for classification.
arXiv Detail & Related papers (2020-12-21T06:40:20Z) - An autoencoder-based reduced-order model for eigenvalue problems with
application to neutron diffusion [0.0]
Using an autoencoder for dimensionality reduction, this paper presents a novel projection-based reduced-order model for eigenvalue problems.
Reduced-order modelling relies on finding suitable basis functions which define a low-dimensional space in which a high-dimensional system is approximated.
arXiv Detail & Related papers (2020-08-15T16:52:26Z)
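On the Implicit ZCA Whitening entry above: ZCA whitening itself is a standard linear transform, so a minimal sketch may help make the stated connection to linear autoencoders concrete. The uncentered item covariance, the eps value, and the toy data below are illustrative assumptions, not the cited paper's exact formulation.

```python
import numpy as np

def zca_whitening_matrix(X, eps=1e-3):
    """Symmetric (ZCA) whitening matrix for the item columns of X.

    X   : (num_users, num_items) interaction matrix.
    eps : small ridge added to the eigenvalues for numerical stability,
          playing a role analogous to L2 regularization.
    """
    C = (X.T @ X) / X.shape[0]                        # item-item covariance (uncentered, for brevity)
    s, U = np.linalg.eigh(C)                          # C = U @ diag(s) @ U.T
    return U @ np.diag(1.0 / np.sqrt(s + eps)) @ U.T  # (C + eps*I)^{-1/2}

rng = np.random.default_rng(0)
X = (rng.random((100, 20)) < 0.1).astype(float)
X_white = X @ zca_whitening_matrix(X)                 # decorrelated item representation
```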
This list is automatically generated from the titles and abstracts of the papers in this site.