Turning Dross Into Gold Loss: is BERT4Rec really better than SASRec?
- URL: http://arxiv.org/abs/2309.07602v1
- Date: Thu, 14 Sep 2023 11:07:10 GMT
- Title: Turning Dross Into Gold Loss: is BERT4Rec really better than SASRec?
- Authors: Anton Klenitskiy, Alexey Vasilev
- Abstract summary: Two state-of-the-art baselines are Transformer-based models SASRec and BERT4Rec.
In most of the publications, BERT4Rec achieves better performance than SASRec.
We show that SASRec could be effectively trained with negative sampling and still outperform BERT4Rec, but the number of negative examples should be much larger than one.
- Score: 1.223779595809275
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, sequential recommendation and the next-item prediction task have become increasingly popular in the field of recommender systems. Currently, two state-of-the-art baselines are the Transformer-based models SASRec and BERT4Rec. Over the past few years, quite a few publications have compared these two algorithms and proposed new state-of-the-art models. In most of these publications, BERT4Rec achieves better performance than SASRec. But BERT4Rec uses cross-entropy over a softmax for all items, while SASRec uses negative sampling and computes a binary cross-entropy loss for one positive and one negative item. In our work, we show that if both models are trained with the same loss that BERT4Rec uses, then SASRec significantly outperforms BERT4Rec both in quality and in training speed. In addition, we show that SASRec can be trained effectively with negative sampling and still outperform BERT4Rec, but the number of negative examples should be much larger than one.
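To make the distinction concrete, below is a minimal PyTorch sketch (not the authors' code; all names and sizes are illustrative) contrasting the two objectives: full cross-entropy over a softmax across the whole item catalogue, as BERT4Rec uses, and binary cross-entropy over one positive and k sampled negatives, as SASRec uses (k = 1 in the original SASRec; the paper argues k should be much larger).

```python
import torch
import torch.nn.functional as F

# Illustrative sizes, not taken from the paper.
batch, n_items, dim = 32, 10_000, 64

hidden = torch.randn(batch, dim)               # sequence representations from the encoder
item_emb = torch.randn(n_items, dim)           # output item embeddings
targets = torch.randint(0, n_items, (batch,))  # ground-truth next items

# (1) Cross-entropy over a softmax for all items, as in BERT4Rec.
logits = hidden @ item_emb.T                   # (batch, n_items)
ce_loss = F.cross_entropy(logits, targets)

# (2) Binary cross-entropy with k sampled negatives, as in SASRec
#     (the original uses k = 1; the paper argues for much larger k).
k = 256
neg_ids = torch.randint(0, n_items, (batch, k))
pos_scores = (hidden * item_emb[targets]).sum(-1)                   # (batch,)
neg_scores = torch.einsum('bd,bkd->bk', hidden, item_emb[neg_ids])  # (batch, k)
bce_loss = -(F.logsigmoid(pos_scores).mean() + F.logsigmoid(-neg_scores).mean())
```

Full cross-entropy scores every item, so memory grows with the catalogue size; sampled BCE keeps memory bounded, which is why the number of negatives becomes the key knob.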
Related papers
- Aligning GPTRec with Beyond-Accuracy Goals with Reinforcement Learning [67.71952251641545]
GPTRec is an alternative to the Top-K model for item-by-item recommendations.
Our experiments on two datasets show that GPTRec's Next-K generation approach offers a better tradeoff between accuracy and secondary metrics than classic greedy re-ranking techniques.
arXiv Detail & Related papers (2024-03-07T19:47:48Z)
- gSASRec: Reducing Overconfidence in Sequential Recommendation Trained with Negative Sampling [67.71952251641545]
We show that models trained with negative sampling tend to overestimate the probabilities of positive interactions.
We propose a novel Generalised Binary Cross-Entropy Loss function (gBCE) and theoretically prove that it can mitigate overconfidence; a hedged sketch of this loss shape follows after this list.
We show through detailed experiments on three datasets that gSASRec does not exhibit the overconfidence problem.
arXiv Detail & Related papers (2023-08-14T14:56:40Z)
- Scaling Session-Based Transformer Recommendations using Optimized Negative Sampling and Loss Functions [0.0]
TRON is a session-based Transformer recommender that uses optimized negative sampling.
TRON improves upon the recommendation quality of current methods while maintaining training speeds similar to SASRec.
A live A/B test yielded an 18.14% increase in click-through rate over SASRec.
arXiv Detail & Related papers (2023-07-27T14:47:38Z)
- SPRINT: A Unified Toolkit for Evaluating and Demystifying Zero-shot Neural Sparse Retrieval [92.27387459751309]
We provide SPRINT, a unified Python toolkit for evaluating neural sparse retrieval.
We establish strong and reproducible zero-shot sparse retrieval baselines on the well-established BEIR benchmark.
We show that SPLADEv2 produces sparse representations with a majority of tokens outside of the original query and document.
arXiv Detail & Related papers (2023-07-19T22:48:02Z)
- Improving Sequential Recommendation Models with an Enhanced Loss Function [9.573139673704766]
We develop an improved loss function for sequential recommendation models.
We conduct experiments on two influential open-source libraries.
We reproduce the results of the BERT4Rec model on the Beauty dataset.
arXiv Detail & Related papers (2023-01-03T07:18:54Z)
- A Systematic Review and Replicability Study of BERT4Rec for Sequential Recommendation [91.02268704681124]
BERT4Rec is an effective model for sequential recommendation based on the Transformer architecture.
We show that reported BERT4Rec results are not consistent across these publications.
We propose our own implementation of BERT4Rec based on the Hugging Face Transformers library.
arXiv Detail & Related papers (2022-07-15T14:09:04Z)
- Effective and Efficient Training for Sequential Recommendation using Recency Sampling [91.02268704681124]
We propose a novel Recency-based Sampling of Sequences training objective.
We show that models enhanced with our method can achieve performance exceeding, or very close to, that of the state-of-the-art BERT4Rec.
arXiv Detail & Related papers (2022-07-06T13:06:31Z)
- VSAC: Efficient and Accurate Estimator for H and F [68.65610177368617]
VSAC is a RANSAC-type robust estimator with a number of novelties.
It is significantly faster than all its predecessors and runs in 1-2 ms on average on a CPU.
It is two orders of magnitude faster than, and yet as precise as, MAGSAC++, currently the most accurate estimator of two-view geometry.
arXiv Detail & Related papers (2021-06-18T17:04:57Z)
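The gSASRec entry above proposes a generalised binary cross-entropy (gBCE) to counteract the overconfidence that negative sampling induces. The sketch below shows the general shape of that loss under the same positive/negative logit layout as earlier: the positive probability is raised to a power beta (equivalently, the positive log-sigmoid term is scaled by beta), with beta = 1 recovering plain BCE. In the gBCE paper, beta is tied to the negative sampling rate through a calibration parameter; here it is just an illustrative argument, and this is not the paper's code.

```python
import torch
import torch.nn.functional as F

def gbce_loss(pos_scores: torch.Tensor, neg_scores: torch.Tensor, beta: float) -> torch.Tensor:
    """Sketch of a generalised BCE in the spirit of gBCE (not the paper's code).

    pos_scores: (batch,)   logits of the true next items
    neg_scores: (batch, k) logits of k sampled negatives
    beta:       exponent on the positive probability; beta = 1 is plain BCE,
                beta < 1 tempers the positive term to reduce overconfidence.
    """
    # log(sigmoid(s) ** beta) = beta * log(sigmoid(s))
    pos_term = beta * F.logsigmoid(pos_scores)        # (batch,)
    neg_term = F.logsigmoid(-neg_scores).sum(dim=-1)  # (batch,)
    return -(pos_term + neg_term).mean()
```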
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.