Neural Collaborative Filtering vs. Matrix Factorization Revisited
- URL: http://arxiv.org/abs/2005.09683v2
- Date: Mon, 1 Jun 2020 23:21:33 GMT
- Title: Neural Collaborative Filtering vs. Matrix Factorization Revisited
- Authors: Steffen Rendle, Walid Krichene, Li Zhang, John Anderson
- Abstract summary: Embedding-based models have been the state of the art in collaborative filtering for over a decade.
In recent years, it was suggested to replace the dot product with a learned similarity, e.g., using a multilayer perceptron (MLP).
- Score: 20.237381375881228
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Embedding-based models have been the state of the art in collaborative
filtering for over a decade. Traditionally, the dot product or higher-order
equivalents have been used to combine two or more embeddings, e.g., most
notably in matrix factorization. In recent years, it was suggested to replace
the dot product with a learned similarity, e.g., using a multilayer perceptron
(MLP). This approach is often referred to as neural collaborative filtering
(NCF). In this work, we revisit the experiments of the NCF paper that
popularized learned similarities using MLPs. First, we show that with proper
hyperparameter selection, a simple dot product substantially outperforms the
proposed learned similarities. Second, while an MLP can in theory approximate
any function, we show that it is non-trivial to learn a dot product with an
MLP. Finally, we discuss practical issues that arise when applying MLP-based
similarities and show that MLPs are too costly to use for item recommendation
in production environments, while dot products admit very efficient
retrieval algorithms. We conclude that MLPs should be used with care as
embedding combiners and that dot products might be a better default choice.
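To make the comparison concrete, the following is a minimal NumPy sketch (not from the paper) of the two scoring functions at issue: a matrix-factorization dot product and an NCF-style MLP over concatenated user/item embeddings. All tables, weights, and sizes below are illustrative placeholders standing in for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, dim = 1_000, 5_000, 32

# Placeholder embedding tables (learned in a real recommender).
P = rng.normal(size=(n_users, dim))  # user embeddings
Q = rng.normal(size=(n_items, dim))  # item embeddings

def dot_score(u: int, i: int) -> float:
    """Matrix-factorization similarity: a plain dot product."""
    return float(P[u] @ Q[i])

# NCF-style learned similarity: concatenate the two embeddings and
# feed them through a small MLP (random placeholder weights).
W1 = rng.normal(size=(2 * dim, 64))
b1 = np.zeros(64)
w2 = rng.normal(size=64)

def mlp_score(u: int, i: int) -> float:
    """Learned similarity: one ReLU hidden layer over [p_u; q_i]."""
    h = np.maximum(np.concatenate([P[u], Q[i]]) @ W1 + b1, 0.0)
    return float(h @ w2)

# Scoring all items for one user illustrates the retrieval argument:
# dot-product scores collapse into a single matrix-vector product (and
# are compatible with sublinear maximum-inner-product-search indexes),
# whereas the MLP must be evaluated once per candidate item.
scores_dot = Q @ P[0]
scores_mlp = np.array([mlp_score(0, i) for i in range(n_items)])
print(np.argsort(-scores_dot)[:10])  # top-10 items under the dot product
```

The asymmetry in the last step is the practical point above: for dot products, approximate maximum-inner-product-search indexes make top-k retrieval over large catalogs cheap, while an arbitrary MLP similarity offers no comparable shortcut.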
Related papers
- KAN or MLP: A Fairer Comparison [63.794304207664176]
This paper offers a fairer and more comprehensive comparison of KAN and MLP models across various tasks.
We control the number of parameters and FLOPs to compare the performance of KAN and MLP.
We find that KAN's forgetting issue is more severe than that of MLP in a standard class-incremental continual learning setting.
arXiv Detail & Related papers (2024-07-23T17:43:35Z) - MLPs Learn In-Context on Regression and Classification Tasks [28.13046236900491]
In-context learning (ICL) is often assumed to be a unique hallmark of Transformer models.
We demonstrate that multi-layer perceptrons (MLPs) can also learn in-context.
arXiv Detail & Related papers (2024-05-24T15:04:36Z) - Tuning Pre-trained Model via Moment Probing [62.445281364055795]
We propose a novel Moment Probing (MP) method to explore the potential of linear probing (LP).
Unlike LP, which builds a linear classification head on the mean of final features, MP performs a linear classifier on the feature distribution.
Our MP significantly outperforms LP and is competitive with counterparts at a lower training cost.
arXiv Detail & Related papers (2023-07-21T04:15:02Z) - MLP Fusion: Towards Efficient Fine-tuning of Dense and Mixture-of-Experts Language Models [33.86069537521178]
Fine-tuning a pre-trained language model (PLM) emerges as the predominant strategy in many natural language processing applications.
General approaches (e.g. quantization and distillation) have been widely studied to reduce the compute/memory of PLM fine-tuning.
We propose one-shot compression techniques specifically designed for fine-tuning.
arXiv Detail & Related papers (2023-07-18T03:12:51Z) - Understanding MLP-Mixer as a Wide and Sparse MLP [7.734726150561087]
Multi-layer perceptron (MLP) is a fundamental component of deep learning.
Recently, MLP-based architectures, especially the MLP-Mixer, have achieved significant empirical success.
We show that sparseness is a key mechanism underlying the MLP-Mixer.
arXiv Detail & Related papers (2023-06-02T11:51:24Z) - MLP-AIR: An Efficient MLP-Based Method for Actor Interaction Relation Learning in Group Activity Recognition [4.24515544235173]
Group Activity Recognition (GAR) aims to predict the activity category of a group by learning the actor spatial-temporal interaction relations in the group.
Previous works mainly learn the interaction relations with well-designed GCNs or Transformers.
In this paper, we design a novel MLP-based method for Actor Interaction Relation learning (MLP-AIR) in GAR.
arXiv Detail & Related papers (2023-04-18T08:07:23Z) - MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing [123.43419144051703]
We present a novel MLP-like 3D architecture for video recognition.
The results are comparable to those of state-of-the-art, widely used 3D CNNs and video transformers.
arXiv Detail & Related papers (2022-06-13T16:21:33Z) - RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality [113.1414517605892]
We propose a methodology, Locality Injection, to incorporate local priors into an FC layer.
RepMLPNet is the first MLP that seamlessly transfers to Cityscapes semantic segmentation.
arXiv Detail & Related papers (2021-12-21T10:28:17Z) - MLP Architectures for Vision-and-Language Modeling: An Empirical Study [91.6393550858739]
We initiate the first empirical study on the use of MLP architectures for vision-and-language (VL) fusion.
We find that, without pre-training, using MLPs for multimodal fusion has a noticeable performance gap compared to transformers.
Instead of heavy multi-head attention, adding tiny one-head attention to MLPs is sufficient to achieve comparable performance to transformers.
arXiv Detail & Related papers (2021-12-08T18:26:19Z) - CycleMLP: A MLP-like Architecture for Dense Prediction [26.74203747156439]
CycleMLP is a versatile backbone for visual recognition and dense predictions.
It can cope with various image sizes and achieves computational complexity linear in the image size by using local windows.
CycleMLP aims to provide a competitive baseline on object detection, instance segmentation, and semantic segmentation for MLP models.
arXiv Detail & Related papers (2021-07-21T17:23:06Z) - MLP-Mixer: An all-MLP Architecture for Vision [93.16118698071993]
We present MLP-Mixer, an architecture based exclusively on multi-layer perceptrons (MLPs).
Mixer attains competitive scores on image classification benchmarks, with pre-training and inference costs comparable to state-of-the-art models.
arXiv Detail & Related papers (2021-05-04T16:17:21Z)