Initialization Matters: Regularizing Manifold-informed Initialization
for Neural Recommendation Systems
- URL: http://arxiv.org/abs/2106.04993v1
- Date: Wed, 9 Jun 2021 11:26:18 GMT
- Title: Initialization Matters: Regularizing Manifold-informed Initialization
for Neural Recommendation Systems
- Authors: Yinan Zhang, Boyang Li, Yong Liu, Hao Wang, Chunyan Miao
- Abstract summary: We propose a new scheme for user embeddings called Laplacian Eigenmaps with Popularity-based Regularization for Isolated Data (LEPORID)
LEPORID endows the embeddings with information regarding multi-scale neighborhood structures on the data manifold and performs adaptive regularization to compensate for high embedding variance on the tail of the data distribution.
We show that existing neural systems initialized with LEPORID often perform on par with or better than KNN.
- Score: 47.49065927541129
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Proper initialization is crucial to the optimization and the generalization
of neural networks. However, most existing neural recommendation systems
initialize the user and item embeddings randomly. In this work, we propose a
new initialization scheme for user and item embeddings called Laplacian
Eigenmaps with Popularity-based Regularization for Isolated Data (LEPORID).
LEPORID endows the embeddings with information regarding multi-scale
neighborhood structures on the data manifold and performs adaptive
regularization to compensate for high embedding variance on the tail of the
data distribution. Exploiting matrix sparsity, LEPORID embeddings can be
computed efficiently. We evaluate LEPORID in a wide range of neural
recommendation models. In contrast to the recent surprising finding that the
simple K-nearest-neighbor (KNN) method often outperforms neural recommendation
systems, we show that existing neural systems initialized with LEPORID often
perform on par with or better than KNN. To maximize the effects of the
initialization, we propose the Dual-Loss Residual Recommendation (DLR2)
network, which, when initialized with LEPORID, substantially outperforms both
traditional and state-of-the-art neural recommender systems.
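As a rough illustration of how such an initialization can be computed, the sketch below builds Laplacian-Eigenmaps-style item embeddings from a sparse interaction matrix, adding a popularity-dependent diagonal penalty to the graph Laplacian. The graph construction, the exact form of the penalty, and all names and hyperparameters (laplacian_item_init, alpha, dim) are illustrative assumptions, not the paper's exact procedure.

```python
# Minimal sketch of a Laplacian-Eigenmaps-style embedding initialization with a
# popularity-dependent penalty, assuming a sparse binary user-item interaction
# matrix R (n_users x n_items). All names and choices here are illustrative.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def laplacian_item_init(R, dim=64, alpha=1.0):
    # Item-item co-occurrence graph; drop self-loops on the diagonal.
    W = (R.T @ R).astype(np.float64).tocsr()
    W = W - sp.diags(W.diagonal())

    # Item popularity (interaction counts); rare "tail" items get a larger
    # diagonal penalty, which damps the variance of their embeddings.
    popularity = np.asarray(R.sum(axis=0)).ravel()
    penalty = alpha / np.maximum(popularity, 1.0)

    # Regularized graph Laplacian L = D + diag(penalty) - W (one plausible form
    # of popularity-based regularization, not necessarily the paper's).
    degrees = np.asarray(W.sum(axis=1)).ravel()
    L = sp.diags(degrees + penalty) - W

    # Eigenvectors of the sparse Laplacian with the smallest eigenvalues capture
    # neighborhood structure on the data manifold and serve as the initial item
    # embeddings; shift-invert mode (sigma=0) exploits the sparsity of L.
    _, vecs = eigsh(L.tocsc(), k=dim, sigma=0, which="LM")
    return vecs

# Example: item embeddings from a synthetic binarized interaction matrix;
# user embeddings could be built analogously from the user-user graph R @ R.T.
R = (sp.random(1000, 500, density=0.02, format="csr") > 0).astype(np.float64)
item_embeddings = laplacian_item_init(R, dim=32)
```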
Related papers
- Linear-Time Graph Neural Networks for Scalable Recommendations [50.45612795600707]
The key task of recommender systems is to forecast users' future behaviors based on previous user-item interactions.
Recent years have witnessed a rising interest in leveraging Graph Neural Networks (GNNs) to boost the prediction performance of recommender systems.
We propose a Linear-Time Graph Neural Network (LTGNN) to scale up GNN-based recommender systems to achieve comparable scalability as classic MF approaches.
arXiv Detail & Related papers (2024-02-21T17:58:10Z)
- Acceleration techniques for optimization over trained neural network ensembles [1.0323063834827415]
We study optimization problems where the objective function is modeled through feedforward neural networks with rectified linear unit activation.
We present a mixed-integer linear program based on existing popular big-$M$ formulations for optimizing over a single neural network.
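For reference, a generic big-$M$ encoding of a single ReLU unit $y = \max(0, w^\top x + b)$ looks as follows, assuming a valid bound $M \ge |w^\top x + b|$ and a binary activity indicator $z$; this is the textbook formulation, and the paper's ensemble model adds further structure on top of it:

$$y \ge w^\top x + b, \quad y \le w^\top x + b + M(1-z), \quad y \le Mz, \quad y \ge 0, \quad z \in \{0,1\}.$$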
arXiv Detail & Related papers (2021-12-13T20:50:54Z)
- Neuron Campaign for Initialization Guided by Information Bottleneck Theory [31.44355490646638]
Initialization plays a critical role in the training of deep neural networks (DNNs).
We use Information Bottleneck (IB) theory to explain the generalization of DNNs.
Experiments on MNIST dataset show that our method can lead to a better generalization performance with faster convergence.
arXiv Detail & Related papers (2021-08-14T13:19:43Z)
- A novel Deep Neural Network architecture for non-linear system identification [78.69776924618505]
We present a novel Deep Neural Network (DNN) architecture for non-linear system identification.
Inspired by fading memory systems, we introduce inductive bias (on the architecture) and regularization (on the loss function).
This architecture allows for automatic complexity selection based solely on available data.
arXiv Detail & Related papers (2021-06-06T10:06:07Z)
- Multi-Sample Online Learning for Spiking Neural Networks based on Generalized Expectation Maximization [42.125394498649015]
Spiking Neural Networks (SNNs) capture some of the efficiency of biological brains by processing information through binary, dynamic neural activations.
This paper proposes to leverage multiple compartments that sample independent spiking signals while sharing synaptic weights.
The key idea is to use these signals to obtain more accurate statistical estimates of the log-likelihood training criterion, as well as of its gradient.
arXiv Detail & Related papers (2021-02-05T16:39:42Z)
- Neural Representations in Hybrid Recommender Systems: Prediction versus Regularization [8.384351067134999]
We define the neural representation for prediction (NRP) framework and apply it to autoencoder-based recommendation systems.
We also apply the NRP framework to a direct neural network structure which predicts the ratings without reconstructing the user and item information.
The results confirm that neural representations are better for prediction than regularization and show that the NRP framework, combined with the direct neural network structure, outperforms the state-of-the-art methods in the prediction task.
arXiv Detail & Related papers (2020-10-12T23:12:49Z)
- Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
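In generic notation (symbols here are ours, not necessarily the paper's), the linearization replaces the network $f(x;\theta)$ around the MAP estimate $\theta_*$ by

$$f_{\mathrm{lin}}(x;\theta) = f(x;\theta_*) + J_{\theta_*}(x)\,(\theta - \theta_*),$$

where $J_{\theta_*}(x)$ is the Jacobian of $f$ with respect to $\theta$ at $\theta_*$; the GLM predictive then averages this linearized model, rather than the original network, over the Laplace posterior.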
arXiv Detail & Related papers (2020-08-19T12:35:55Z)
- Persistent Neurons [4.061135251278187]
We propose a trajectory-based strategy that optimizes the learning task using information from previous solutions.
Persistent neurons can be regarded as a method with gradient-informed bias, where individual updates are corrupted by deterministic error terms.
We evaluate the full and partial persistent model and show it can be used to boost the performance on a range of NN structures.
arXiv Detail & Related papers (2020-07-02T22:36:49Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient descent combined with the non-convexity of the underlying optimization renders learning susceptible to initialization effects.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)