Modeling Sequences as Distributions with Uncertainty for Sequential Recommendation
- URL: http://arxiv.org/abs/2106.06165v1
- Date: Fri, 11 Jun 2021 04:35:21 GMT
- Title: Modeling Sequences as Distributions with Uncertainty for Sequential Recommendation
- Authors: Ziwei Fan, Zhiwei Liu, Lei Zheng, Shen Wang, Philip S. Yu
- Abstract summary: Most existing sequential methods assume users are deterministic.
Item-item transitions might fluctuate significantly in several item aspects and exhibit randomness of user interests.
We propose a Distribution-based Transformer for Sequential Recommendation (DT4SR), which injects uncertainty into sequential modeling.
- Score: 63.77513071533095
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The sequential patterns within user interactions are pivotal for
representing the user's preferences and capturing latent relationships among
items. Recent advances in sequence modeling with Transformers have encouraged
the community to devise more effective encoders for sequential
recommendation. Most existing sequential methods assume users are
deterministic. However, item-item transitions might fluctuate significantly in
several item aspects and exhibit randomness in user interests. These
stochastic characteristics create a strong demand to include uncertainty in
the representation of sequences and items. Additionally, modeling sequences
and items with uncertainty expands users' and items' interaction spaces, thus
further alleviating cold-start problems.
In this work, we propose a Distribution-based Transformer for Sequential
Recommendation (DT4SR), which injects uncertainty into sequential modeling.
We describe items and sequences as Elliptical Gaussian distributions that
capture their uncertainty, and we adopt the Wasserstein distance to measure
the similarity between distributions. We devise two novel Transformers for
modeling mean and covariance, which guarantees the positive-definite property
of the learned distributions. The proposed method significantly outperforms
state-of-the-art methods. Experiments on three benchmark datasets also
demonstrate its effectiveness in alleviating cold-start issues. The code is
available at https://github.com/DyGRec/DT4SR.
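As a concrete illustration of the two ingredients above, here is a minimal sketch (ours, not code from the DT4SR repository): each item or sequence distribution is kept as a diagonal "elliptical" Gaussian, a hypothetical ELU + 1 head keeps the covariance entries strictly positive, and similarity is scored with the closed-form 2-Wasserstein distance between diagonal Gaussians.

```python
import torch
import torch.nn.functional as F

def positive_diag(raw: torch.Tensor) -> torch.Tensor:
    # ELU + 1 maps any real logit to a strictly positive value, so the
    # diagonal covariance stays positive-definite. This head is our
    # simplification; the paper's covariance Transformer differs in detail.
    return F.elu(raw) + 1.0

def wasserstein2_diag(mu1, sigma1, mu2, sigma2):
    # Closed-form squared 2-Wasserstein distance between diagonal Gaussians
    # N(mu1, diag(sigma1^2)) and N(mu2, diag(sigma2^2)):
    #   W2^2 = ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2
    return ((mu1 - mu2) ** 2).sum(-1) + ((sigma1 - sigma2) ** 2).sum(-1)

# Toy usage: rank two candidate items against one sequence distribution.
d = 8
seq_mu, seq_sigma = torch.randn(d), positive_diag(torch.randn(d))
item_mu, item_sigma = torch.randn(2, d), positive_diag(torch.randn(2, d))
scores = -wasserstein2_diag(seq_mu, seq_sigma, item_mu, item_sigma)
print(scores)  # higher score (smaller distance) = better next-item candidate
```

Note that the distance depends on the covariances as well as the means, which is what lets the model reward or penalize candidates by how uncertain they are, not just by where they sit in the embedding space.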
Related papers
- Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model [66.91323540178739]
Sequential recommendation (SR) aims to predict items that users may be interested in based on their historical behavior.
We revisit SR from a novel information-theoretic perspective and find that sequential modeling methods fail to adequately capture randomness and unpredictability of user behavior.
Inspired by fuzzy information processing theory, this paper introduces the fuzzy sets of interaction sequences to overcome the limitations and better capture the evolution of users' real interests.
arXiv Detail & Related papers (2024-10-31T14:52:01Z)
- Generative Diffusion Models for Sequential Recommendations [7.948486055890262]
Generative models such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) have shown promise in sequential recommendation tasks.
This research introduces enhancements to the DiffuRec architecture to improve robustness and incorporates a cross-attention mechanism in the Approximator to better capture relevant user-item interactions.
arXiv Detail & Related papers (2024-10-25T09:39:05Z)
- Diffusion-based Contrastive Learning for Sequential Recommendation [6.3482831836623355]
We propose a Context-aware Diffusion-based Contrastive Learning for Sequential Recommendation, named CaDiRec.
CaDiRec employs a context-aware diffusion model to generate alternative items for the given positions within a sequence.
We train the entire framework in an end-to-end manner, with shared item embeddings between the diffusion model and the recommendation model.
arXiv Detail & Related papers (2024-05-15T14:20:37Z)
- Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose a lack of systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
- Sequential Recommendation via Stochastic Self-Attention [68.52192964559829]
Transformer-based approaches embed items as vectors and use dot-product self-attention to measure the relationship between items.
We propose a novel STOchastic Self-Attention (STOSA) mechanism to overcome these issues.
We devise a novel Wasserstein Self-Attention module to characterize item-item position-wise relationships in sequences; a rough sketch of the idea appears below.
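The sketch below is our own simplification of distance-based attention, not the actual STOSA module: attention logits come from negative squared 2-Wasserstein distances between per-position diagonal Gaussians instead of dot products, so positions whose distributions are close attend to each other strongly.

```python
import torch

def wasserstein_attention(mu_q, sigma_q, mu_k, sigma_k, mu_v, sigma_v):
    # Logits are negative squared 2-Wasserstein distances between the
    # per-position diagonal Gaussians, instead of dot products.
    # All tensors have shape (seq_len, d); illustrative only.
    d_mu = ((mu_q[:, None, :] - mu_k[None, :, :]) ** 2).sum(-1)
    d_sigma = ((sigma_q[:, None, :] - sigma_k[None, :, :]) ** 2).sum(-1)
    attn = torch.softmax(-(d_mu + d_sigma), dim=-1)  # (seq_len, seq_len)
    # A convex combination of positive std-devs stays positive, so the
    # output is again a valid diagonal Gaussian per position.
    return attn @ mu_v, attn @ sigma_v

# Toy usage on a length-5 sequence of 8-dimensional distributional embeddings.
mu = torch.randn(5, 8)
sigma = torch.rand(5, 8) + 0.1  # strictly positive standard deviations
out_mu, out_sigma = wasserstein_attention(mu, sigma, mu, sigma, mu, sigma)
```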
arXiv Detail & Related papers (2022-01-16T12:38:45Z)
- Contrastive Self-supervised Sequential Recommendation with Robust Augmentation [101.25762166231904]
Sequential recommendation describes a set of techniques to model dynamic user behavior in order to predict future interactions in sequential user data.
Old and new issues remain, including data-sparsity and noisy data.
We propose Contrastive Self-supervised Learning for Sequential Recommendation (CoSeRec).
arXiv Detail & Related papers (2021-08-14T07:15:25Z)
- Variation Control and Evaluation for Generative Slate Recommendations [22.533997063750597]
We show that item perturbation can enforce slate variation and mitigate the over-concentration of generated slates.
We also propose to separate a pivot selection phase from the generation process so that the model can apply perturbation before generation.
arXiv Detail & Related papers (2021-02-26T05:04:40Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), which parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, the divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)