Sequential Recommendation via Stochastic Self-Attention
- URL: http://arxiv.org/abs/2201.06035v1
- Date: Sun, 16 Jan 2022 12:38:45 GMT
- Title: Sequential Recommendation via Stochastic Self-Attention
- Authors: Ziwei Fan, Zhiwei Liu, Yu Wang, Alice Wang, Zahra Nazari, Lei Zheng,
Hao Peng, Philip S. Yu
- Abstract summary: Transformer-based approaches embed items as vectors and use dot-product self-attention to measure the relationship between items.
We propose a novel STOchastic Self-Attention (STOSA) model to overcome these issues.
We devise a novel Wasserstein Self-Attention module to characterize item-item position-wise relationships in sequences.
- Score: 68.52192964559829
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sequential recommendation models the dynamics of a user's previous
behaviors in order to forecast the next item, and has attracted considerable
attention. Transformer-based approaches, which embed items as vectors and use
dot-product self-attention to measure the relationships between items,
demonstrate superior capabilities among existing sequential methods. However,
users' real-world sequential behaviors are uncertain rather than deterministic,
posing a significant challenge to existing techniques. We further suggest that
dot-product-based approaches cannot fully capture collaborative transitivity,
which can be derived from item-item transitions within sequences and is
beneficial for cold-start items. We also argue that the BPR loss imposes no
constraint on positive and sampled negative items, which misleads the
optimization. We propose a novel STOchastic Self-Attention (STOSA) model to
overcome these issues. In particular, STOSA embeds each item as a stochastic
Gaussian distribution whose covariance encodes the uncertainty. We devise a
novel Wasserstein Self-Attention module to characterize item-item
position-wise relationships in sequences, which effectively incorporates
uncertainty into model training. Wasserstein attention also facilitates
collaborative transitivity learning, since the Wasserstein distance satisfies
the triangle inequality. Moreover, we introduce a novel regularization term
into the ranking loss that enforces dissimilarity between positive and sampled
negative items. Extensive experiments on five real-world benchmark datasets
demonstrate the superiority of the proposed model over state-of-the-art
baselines, especially on cold-start items. The code is available at
https://github.com/zfan20/STOSA.
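For intuition, the sketch below illustrates the core idea in PyTorch: each item is embedded as a diagonal Gaussian (a mean vector plus a positive standard-deviation vector), attention weights come from negative squared 2-Wasserstein distances rather than dot products, and a hinge-style regularizer augments the BPR ranking loss. This is a minimal sketch under stated assumptions: the layer names, the ELU+1 activation, and the exact form of the regularizer are illustrative, not necessarily STOSA's implementation (see the repository above for the authors' code).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def wasserstein2_sq(mu_q, std_q, mu_k, std_k):
    """Squared 2-Wasserstein distance between diagonal Gaussians.

    For N(mu1, diag(s1^2)) and N(mu2, diag(s2^2)) the closed form is
        W2^2 = ||mu1 - mu2||^2 + ||s1 - s2||^2.
    Inputs are (B, L, D); output is a (B, L, L) distance matrix.
    """
    mu_diff = (mu_q.unsqueeze(2) - mu_k.unsqueeze(1)).pow(2).sum(-1)
    std_diff = (std_q.unsqueeze(2) - std_k.unsqueeze(1)).pow(2).sum(-1)
    return mu_diff + std_diff


class WassersteinSelfAttention(nn.Module):
    """Distribution-aware self-attention: items are diagonal Gaussians, and
    attention weights come from negative Wasserstein distances, so a smaller
    distance between two item distributions yields a larger weight."""

    def __init__(self, dim):
        super().__init__()
        self.q_mu, self.k_mu, self.v_mu = (nn.Linear(dim, dim) for _ in range(3))
        self.q_std, self.k_std, self.v_std = (nn.Linear(dim, dim) for _ in range(3))

    def forward(self, mu, std, causal_mask=None):
        qm, km, vm = self.q_mu(mu), self.k_mu(mu), self.v_mu(mu)
        # ELU + 1 keeps the projected standard deviations strictly positive.
        qs = F.elu(self.q_std(std)) + 1.0
        ks = F.elu(self.k_std(std)) + 1.0
        vs = F.elu(self.v_std(std)) + 1.0
        scores = -wasserstein2_sq(qm, qs, km, ks)          # (B, L, L)
        if causal_mask is not None:
            scores = scores.masked_fill(causal_mask, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        return attn @ vm, attn @ vs                        # output mean / std


def regularized_bpr_loss(d_pos, d_neg, gamma=0.5):
    """BPR on distances (prefer d_pos < d_neg) plus a hinge regularizer that
    keeps positives and sampled negatives apart by a margin gamma. The margin
    form is an assumption, not necessarily the paper's exact term."""
    bpr = -F.logsigmoid(d_neg - d_pos).mean()
    reg = F.relu(d_pos - d_neg + gamma).mean()
    return bpr + reg
```

In a training step, `d_pos` and `d_neg` would be the Wasserstein distances from each position's output distribution to the ground-truth next item and to a sampled negative item, respectively; because the Wasserstein distance satisfies the triangle inequality, small distances along observed transitions bound the distance between items that never co-occur, which is the collaborative-transitivity effect the abstract describes.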
Related papers
- Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model [66.91323540178739]
Sequential recommendation (SR) aims to predict items that users may be interested in based on their historical behavior.
We revisit SR from a novel information-theoretic perspective and find that sequential modeling methods fail to adequately capture the randomness and unpredictability of user behavior.
Inspired by fuzzy information processing theory, this paper introduces fuzzy sets of interaction sequences to overcome these limitations and better capture the evolution of users' real interests.
arXiv Detail & Related papers (2024-10-31T14:52:01Z)
- Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose the lack of systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
- Rethinking Missing Data: Aleatoric Uncertainty-Aware Recommendation [59.500347564280204]
We propose a new Aleatoric Uncertainty-aware Recommendation (AUR) framework.
AUR consists of a new uncertainty estimator along with a normal recommender model.
As the chance of mislabeling reflects the potential of a pair, AUR makes recommendations according to the uncertainty.
arXiv Detail & Related papers (2022-09-22T04:32:51Z)
- Contrastive Self-supervised Sequential Recommendation with Robust Augmentation [101.25762166231904]
Sequential recommendation describes a set of techniques that model dynamic user behavior in order to predict future interactions from sequential user data.
Old and new issues remain, including data sparsity and noisy data.
We propose Contrastive Self-Supervised Learning for sequential Recommendation (CoSeRec) to address these issues.
arXiv Detail & Related papers (2021-08-14T07:15:25Z)
- Modeling Sequences as Distributions with Uncertainty for Sequential Recommendation [63.77513071533095]
Most existing sequential methods assume user behavior is deterministic.
However, item-item transitions can fluctuate significantly across item aspects, reflecting the randomness of user interests.
We propose a Distribution-based Transformer for Sequential Recommendation (DT4SR), which injects uncertainties into sequential modeling.
arXiv Detail & Related papers (2021-06-11T04:35:21Z)
- Variation Control and Evaluation for Generative Slate Recommendations [22.533997063750597]
We show that item perturbation can enforce slate variation and mitigate the over-concentration of generated slates.
We also propose to separate a pivot selection phase from the generation process so that the model can apply perturbation before generation.
arXiv Detail & Related papers (2021-02-26T05:04:40Z)