Sequential Recommendation via Stochastic Self-Attention
- URL: http://arxiv.org/abs/2201.06035v1
- Date: Sun, 16 Jan 2022 12:38:45 GMT
- Title: Sequential Recommendation via Stochastic Self-Attention
- Authors: Ziwei Fan, Zhiwei Liu, Yu Wang, Alice Wang, Zahra Nazari, Lei Zheng,
Hao Peng, Philip S. Yu
- Abstract summary: Transformer-based approaches embed items as vectors and use dot-product self-attention to measure the relationship between items.
We propose a novel STOchastic Self-Attention (STOSA) model to overcome these issues.
We devise a novel Wasserstein Self-Attention module to characterize item-item position-wise relationships in sequences.
- Score: 68.52192964559829
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sequential recommendation models the dynamics of a user's previous
behaviors in order to forecast the next item, and has attracted considerable
attention. Transformer-based approaches, which embed items as vectors and use
dot-product self-attention to measure the relationships between items,
demonstrate superior capabilities among existing sequential methods. However,
users' real-world sequential behaviors are uncertain rather than deterministic,
posing a significant challenge to existing techniques. We further suggest that
dot-product-based approaches cannot fully capture collaborative transitivity,
which can be derived from item-item transitions within sequences and is
beneficial for cold-start items. We also argue that the BPR loss imposes no
constraint on positive and sampled negative items, which misleads the
optimization. We propose a novel STOchastic Self-Attention (STOSA) model to
overcome these issues. In particular, STOSA embeds each item as a stochastic
Gaussian distribution whose covariance encodes the uncertainty. We devise a
novel Wasserstein Self-Attention module to characterize item-item
position-wise relationships in sequences, which effectively incorporates
uncertainty into model training. Wasserstein attention also facilitates
collaborative transitivity learning, since the Wasserstein distance satisfies
the triangle inequality. Moreover, we introduce a novel regularization term
into the ranking loss that enforces dissimilarity between positive and sampled
negative items. Extensive experiments on five real-world benchmark datasets
demonstrate the superiority of the proposed model over state-of-the-art
baselines, especially on cold-start items. The code is available at
https://github.com/zfan20/STOSA.
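For intuition, the sketch below illustrates the core idea in PyTorch: each item is embedded as a diagonal Gaussian (a mean vector plus a positive standard-deviation vector), attention weights come from negative squared 2-Wasserstein distances rather than dot products, and a hinge-style regularizer augments the BPR ranking loss. This is a minimal sketch under stated assumptions: the layer names, the ELU+1 activation, and the exact form of the regularizer are illustrative, not necessarily STOSA's implementation (see the repository above for the authors' code).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def wasserstein2_sq(mu_q, std_q, mu_k, std_k):
    """Squared 2-Wasserstein distance between diagonal Gaussians.

    For N(mu1, diag(s1^2)) and N(mu2, diag(s2^2)) the closed form is
        W2^2 = ||mu1 - mu2||^2 + ||s1 - s2||^2.
    Inputs are (B, L, D); output is a (B, L, L) distance matrix.
    """
    mu_diff = (mu_q.unsqueeze(2) - mu_k.unsqueeze(1)).pow(2).sum(-1)
    std_diff = (std_q.unsqueeze(2) - std_k.unsqueeze(1)).pow(2).sum(-1)
    return mu_diff + std_diff


class WassersteinSelfAttention(nn.Module):
    """Distribution-aware self-attention: items are diagonal Gaussians, and
    attention weights come from negative Wasserstein distances, so a smaller
    distance between two item distributions yields a larger weight."""

    def __init__(self, dim):
        super().__init__()
        self.q_mu, self.k_mu, self.v_mu = (nn.Linear(dim, dim) for _ in range(3))
        self.q_std, self.k_std, self.v_std = (nn.Linear(dim, dim) for _ in range(3))

    def forward(self, mu, std, causal_mask=None):
        qm, km, vm = self.q_mu(mu), self.k_mu(mu), self.v_mu(mu)
        # ELU + 1 keeps the projected standard deviations strictly positive.
        qs = F.elu(self.q_std(std)) + 1.0
        ks = F.elu(self.k_std(std)) + 1.0
        vs = F.elu(self.v_std(std)) + 1.0
        scores = -wasserstein2_sq(qm, qs, km, ks)          # (B, L, L)
        if causal_mask is not None:
            scores = scores.masked_fill(causal_mask, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        return attn @ vm, attn @ vs                        # output mean / std


def regularized_bpr_loss(d_pos, d_neg, gamma=0.5):
    """BPR on distances (prefer d_pos < d_neg) plus a hinge regularizer that
    keeps positives and sampled negatives apart by a margin gamma. The margin
    form is an assumption, not necessarily the paper's exact term."""
    bpr = -F.logsigmoid(d_neg - d_pos).mean()
    reg = F.relu(d_pos - d_neg + gamma).mean()
    return bpr + reg
```

In a training step, `d_pos` and `d_neg` would be the Wasserstein distances from each position's output distribution to the ground-truth next item and to a sampled negative item, respectively; because the Wasserstein distance satisfies the triangle inequality, small distances along observed transitions bound the distance between items that never co-occur, which is the collaborative-transitivity effect the abstract describes.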
Related papers
- Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model [66.91323540178739]
Sequential recommendation (SR) aims to predict items that users may be interested in based on their historical behavior.
We revisit SR from a novel information-theoretic perspective and find that sequential modeling methods fail to adequately capture the randomness and unpredictability of user behavior.
Inspired by fuzzy information processing theory, this paper introduces fuzzy sets of interaction sequences to overcome these limitations and better capture the evolution of users' real interests.
arXiv Detail & Related papers (2024-10-31T14:52:01Z)
- Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose the lack of systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
- Rethinking Missing Data: Aleatoric Uncertainty-Aware Recommendation [59.500347564280204]
We propose a new Aleatoric Uncertainty-aware Recommendation (AUR) framework.
AUR consists of a new uncertainty estimator along with a normal recommender model.
As the chance of mislabeling reflects the potential of a pair, AUR makes recommendations according to the uncertainty.
arXiv Detail & Related papers (2022-09-22T04:32:51Z)
- Contrastive Self-supervised Sequential Recommendation with Robust Augmentation [101.25762166231904]
Sequential recommendation describes a set of techniques that model dynamic user behavior in order to predict future interactions from sequential user data.
Old and new issues remain, including data sparsity and noisy data.
We propose Contrastive Self-Supervised Learning for sequential Recommendation (CoSeRec) to address these issues.
arXiv Detail & Related papers (2021-08-14T07:15:25Z)
- Modeling Sequences as Distributions with Uncertainty for Sequential Recommendation [63.77513071533095]
Most existing sequential methods assume user behavior is deterministic.
However, item-item transitions can fluctuate significantly across item aspects, reflecting the randomness of user interests.
We propose a Distribution-based Transformer for Sequential Recommendation (DT4SR), which injects uncertainties into sequential modeling.
arXiv Detail & Related papers (2021-06-11T04:35:21Z)
- Variation Control and Evaluation for Generative Slate Recommendations [22.533997063750597]
We show that item perturbation can enforce slate variation and mitigate the over-concentration of generated slates.
We also propose to separate a pivot selection phase from the generation process so that the model can apply perturbation before generation.
arXiv Detail & Related papers (2021-02-26T05:04:40Z)