Neural Collaborative Filtering vs. Matrix Factorization Revisited
- URL: http://arxiv.org/abs/2005.09683v2
- Date: Mon, 1 Jun 2020 23:21:33 GMT
- Title: Neural Collaborative Filtering vs. Matrix Factorization Revisited
- Authors: Steffen Rendle, Walid Krichene, Li Zhang, John Anderson
- Abstract summary: Embedding-based models have been the state of the art in collaborative filtering for over a decade.
In recent years, it was suggested to replace the dot product with a learned similarity, e.g., using a multilayer perceptron (MLP).
- Score: 20.237381375881228
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Embedding-based models have been the state of the art in collaborative
filtering for over a decade. Traditionally, the dot product or higher-order
equivalents have been used to combine two or more embeddings, e.g., most
notably in matrix factorization. In recent years, it was suggested to replace
the dot product with a learned similarity, e.g., using a multilayer perceptron
(MLP). This approach is often referred to as neural collaborative filtering
(NCF). In this work, we revisit the experiments of the NCF paper that
popularized learned similarities using MLPs. First, we show that with proper
hyperparameter selection, a simple dot product substantially outperforms the
proposed learned similarities. Second, while an MLP can in theory approximate
any function, we show that it is non-trivial to learn a dot product with an
MLP. Finally, we discuss practical issues that arise when applying MLP-based
similarities and show that MLPs are too costly to use for item recommendation
in production environments, while dot products admit very efficient
retrieval algorithms. We conclude that MLPs should be used with care as
embedding combiners and that dot products might be a better default choice.
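To make the comparison concrete, the following is a minimal NumPy sketch (not from the paper) of the two scoring functions at issue: a matrix-factorization dot product and an NCF-style MLP over concatenated user/item embeddings. All tables, weights, and sizes below are illustrative placeholders standing in for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, dim = 1_000, 5_000, 32

# Placeholder embedding tables (learned in a real recommender).
P = rng.normal(size=(n_users, dim))  # user embeddings
Q = rng.normal(size=(n_items, dim))  # item embeddings

def dot_score(u: int, i: int) -> float:
    """Matrix-factorization similarity: a plain dot product."""
    return float(P[u] @ Q[i])

# NCF-style learned similarity: concatenate the two embeddings and
# feed them through a small MLP (random placeholder weights).
W1 = rng.normal(size=(2 * dim, 64))
b1 = np.zeros(64)
w2 = rng.normal(size=64)

def mlp_score(u: int, i: int) -> float:
    """Learned similarity: one ReLU hidden layer over [p_u; q_i]."""
    h = np.maximum(np.concatenate([P[u], Q[i]]) @ W1 + b1, 0.0)
    return float(h @ w2)

# Scoring all items for one user illustrates the retrieval argument:
# dot-product scores collapse into a single matrix-vector product (and
# are compatible with sublinear maximum-inner-product-search indexes),
# whereas the MLP must be evaluated once per candidate item.
scores_dot = Q @ P[0]
scores_mlp = np.array([mlp_score(0, i) for i in range(n_items)])
print(np.argsort(-scores_dot)[:10])  # top-10 items under the dot product
```

The asymmetry in the last step is the practical point above: for dot products, approximate maximum-inner-product-search indexes make top-k retrieval over large catalogs cheap, while an arbitrary MLP similarity offers no comparable shortcut.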
Related papers
- KAN or MLP: A Fairer Comparison [63.794304207664176]
This paper offers a fairer and more comprehensive comparison of KAN and MLP models across various tasks.
We control the number of parameters and FLOPs to compare the performance of KAN and MLP.
We find that KAN's forgetting issue is more severe than that of MLP in a standard class-incremental continual learning setting.
arXiv Detail & Related papers (2024-07-23T17:43:35Z) - MLPs Learn In-Context on Regression and Classification Tasks [28.13046236900491]
In-context learning (ICL) is often assumed to be a unique hallmark of Transformer models.
We demonstrate that multi-layer perceptrons (MLPs) can also learn in-context.
arXiv Detail & Related papers (2024-05-24T15:04:36Z) - Tuning Pre-trained Model via Moment Probing [62.445281364055795]
We propose a novel Moment Probing (MP) method to explore the potential of linear probing (LP).
Unlike LP, which builds a linear classification head on the mean of final features, MP performs a linear classifier on the feature distribution.
Our MP significantly outperforms LP and is competitive with counterparts at a lower training cost.
arXiv Detail & Related papers (2023-07-21T04:15:02Z) - MLP Fusion: Towards Efficient Fine-tuning of Dense and Mixture-of-Experts Language Models [33.86069537521178]
Fine-tuning a pre-trained language model (PLM) emerges as the predominant strategy in many natural language processing applications.
General approaches (e.g. quantization and distillation) have been widely studied to reduce the compute/memory of PLM fine-tuning.
We propose one-shot compression techniques specifically designed for fine-tuning.
arXiv Detail & Related papers (2023-07-18T03:12:51Z) - Understanding MLP-Mixer as a Wide and Sparse MLP [7.734726150561087]
Multi-layer perceptron (MLP) is a fundamental component of deep learning.
Recently, MLP-based architectures, especially the MLP-Mixer, have achieved significant empirical success.
We show that sparseness is a key mechanism underlying the MLP-Mixer.
arXiv Detail & Related papers (2023-06-02T11:51:24Z) - MLP-AIR: An Efficient MLP-Based Method for Actor Interaction Relation Learning in Group Activity Recognition [4.24515544235173]
Group Activity Recognition (GAR) aims to predict the activity category of a group by learning the actor spatial-temporal interaction relations in the group.
Previous works mainly learn the interaction relations with well-designed GCNs or Transformers.
In this paper, we design a novel MLP-based method for Actor Interaction Relation learning (MLP-AIR) in GAR.
arXiv Detail & Related papers (2023-04-18T08:07:23Z) - MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing [123.43419144051703]
We present a novel MLP-like 3D architecture for video recognition.
The results are comparable to those of state-of-the-art, widely used 3D CNNs and video transformers.
arXiv Detail & Related papers (2022-06-13T16:21:33Z) - RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality [113.1414517605892]
We propose a methodology, Locality Injection, to incorporate local priors into an FC layer.
RepMLPNet is the first MLP that seamlessly transfers to Cityscapes semantic segmentation.
arXiv Detail & Related papers (2021-12-21T10:28:17Z) - MLP Architectures for Vision-and-Language Modeling: An Empirical Study [91.6393550858739]
We initiate the first empirical study on the use of MLP architectures for vision-and-language (VL) fusion.
We find that, without pre-training, using MLPs for multimodal fusion has a noticeable performance gap compared to transformers.
Instead of heavy multi-head attention, adding tiny one-head attention to MLPs is sufficient to achieve comparable performance to transformers.
arXiv Detail & Related papers (2021-12-08T18:26:19Z) - CycleMLP: A MLP-like Architecture for Dense Prediction [26.74203747156439]
CycleMLP is a versatile backbone for visual recognition and dense predictions.
It can cope with various image sizes and achieves computational complexity linear in the image size by using local windows.
CycleMLP aims to provide a competitive baseline on object detection, instance segmentation, and semantic segmentation for MLP models.
arXiv Detail & Related papers (2021-07-21T17:23:06Z) - MLP-Mixer: An all-MLP Architecture for Vision [93.16118698071993]
We present MLP-Mixer, an architecture based exclusively on multi-layer perceptrons (MLPs).
Mixer attains competitive scores on image classification benchmarks, with pre-training and inference costs comparable to state-of-the-art models.
arXiv Detail & Related papers (2021-05-04T16:17:21Z)