Mutual Wasserstein Discrepancy Minimization for Sequential
Recommendation
- URL: http://arxiv.org/abs/2301.12197v2
- Date: Tue, 20 Jun 2023 02:59:32 GMT
- Title: Mutual Wasserstein Discrepancy Minimization for Sequential
Recommendation
- Authors: Ziwei Fan, Zhiwei Liu, Hao Peng, Philip S. Yu
- Abstract summary: We propose a novel self-supervised learning framework based on Mutual WasserStein discrepancy minimization (MStein) for sequential recommendation.
We also propose a novel contrastive learning loss based on Wasserstein Discrepancy Measurement.
- Score: 82.0801585843835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised sequential recommendation significantly improves
recommendation performance by maximizing mutual information with well-designed
data augmentations. However, mutual information estimation relies on computing the Kullback-Leibler (KL) divergence, which suffers from several limitations, including asymmetric estimation, an exponential sample-size requirement, and training instability. Moreover, existing data augmentations are mostly stochastic and can
potentially break sequential correlations with random modifications. These two
issues motivate us to investigate an alternative robust mutual information
measurement capable of modeling uncertainty and alleviating KL divergence
limitations. To this end, we propose a novel self-supervised learning framework based on Mutual WasserStein discrepancy minimization (MStein) for sequential recommendation. We propose the Wasserstein Discrepancy Measurement to measure
the mutual information between augmented sequences. Wasserstein Discrepancy
Measurement builds upon the 2-Wasserstein distance, which is more robust, more efficient with small batch sizes, and able to model the uncertainty of stochastic
augmentation processes. We also propose a novel contrastive learning loss based
on Wasserstein Discrepancy Measurement. Extensive experiments on four benchmark
datasets demonstrate the effectiveness of MStein over baselines. Further quantitative analyses show robustness against perturbations and training efficiency across batch sizes. Finally, an analysis of the improvements indicates better representations of popular users or items with significant uncertainty. The source code is at https://github.com/zfan20/MStein.
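As a concrete illustration of the loss described above, the following is a minimal sketch, not the authors' implementation (see the linked repository for that): it assumes each augmented sequence is encoded as a diagonal Gaussian with mean mu and standard deviation sigma, uses the closed-form squared 2-Wasserstein distance between such Gaussians, and plugs its negative in as the similarity of an InfoNCE-style contrastive loss. All names and shapes here are illustrative assumptions.

```python
# Illustrative sketch only; names and design are assumptions, not the MStein repo's API.
import torch
import torch.nn.functional as F

def pairwise_w2_squared(mu_a, sigma_a, mu_b, sigma_b):
    # Closed-form squared 2-Wasserstein distance between diagonal Gaussians:
    #   W2^2(N(m1, diag(s1^2)), N(m2, diag(s2^2))) = ||m1 - m2||^2 + ||s1 - s2||^2
    # Inputs are (batch, dim); output is the (batch, batch) pairwise matrix.
    mean_term = ((mu_a.unsqueeze(1) - mu_b.unsqueeze(0)) ** 2).sum(-1)
    std_term = ((sigma_a.unsqueeze(1) - sigma_b.unsqueeze(0)) ** 2).sum(-1)
    return mean_term + std_term

def wasserstein_contrastive_loss(mu_a, sigma_a, mu_b, sigma_b, temperature=1.0):
    # InfoNCE-style objective with similarity = -W2^2: the i-th pair of
    # augmented views (the diagonal) should have the smallest discrepancy
    # among all in-batch pairs.
    logits = -pairwise_w2_squared(mu_a, sigma_a, mu_b, sigma_b) / temperature
    targets = torch.arange(mu_a.size(0), device=mu_a.device)
    return F.cross_entropy(logits, targets)

# Hypothetical usage: an encoder would map each augmented sequence to (mu, sigma),
# with sigma made positive, e.g. via softplus. Random tensors stand in here.
B, d = 32, 64
mu_a, sigma_a = torch.randn(B, d), F.softplus(torch.randn(B, d))
mu_b, sigma_b = torch.randn(B, d), F.softplus(torch.randn(B, d))
loss = wasserstein_contrastive_loss(mu_a, sigma_a, mu_b, sigma_b)
```

Because the diagonal-Gaussian 2-Wasserstein distance has this closed form, no adversarial critic or large-sample KL estimator is needed, which is consistent with the batch-size efficiency the abstract claims.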
Related papers
- Linked shrinkage to improve estimation of interaction effects in regression models [0.0]
We develop an estimator that adapts well to two-way interaction terms in a regression model.
We evaluate the potential of the model for inference, which is notoriously hard for selection strategies.
Our models can be very competitive with more advanced machine learners, such as random forests, even for fairly large sample sizes.
arXiv Detail & Related papers (2023-09-25T10:03:39Z)
- Lower Bounds on the Bayesian Risk via Information Measures [17.698319441265223]
We show that one can lower bound the risk with any information measure by upper bounding its dual via Markov's inequality.
The behaviour of the lower bound in the number of samples is influenced by the choice of the information measure.
If the observations are subject to privatisation, stronger impossibility results can be obtained via Strong Data-Processing Inequalities.
arXiv Detail & Related papers (2023-03-22T12:09:12Z)
- Multi-Fidelity Covariance Estimation in the Log-Euclidean Geometry [0.0]
We introduce a multi-fidelity estimator of covariance matrices that employs the log-Euclidean geometry of the symmetric positive-definite manifold.
We develop an optimal sample allocation scheme that minimizes the mean-squared error of the estimator given a fixed budget.
Evaluations of our approach using data from physical applications demonstrate more accurate metric learning and speedups of more than one order of magnitude compared to benchmarks.
arXiv Detail & Related papers (2023-01-31T16:33:46Z)
- Convergence of uncertainty estimates in Ensemble and Bayesian sparse model discovery [4.446017969073817]
We show empirical success in terms of accuracy and robustness to noise with a bootstrapping-based sequential thresholding least-squares estimator.
We show that this bootstrapping-based ensembling technique can perform a provably correct variable selection procedure, with the error rate converging at an exponential rate.
arXiv Detail & Related papers (2023-01-30T04:07:59Z)
- Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization [73.04187954213471]
We introduce a unified learning approach that simultaneously models coarse- and fine-grained retrieval.
The proposed method has achieved +4.03%, +3.38%, and +2.40% Recall@50 accuracy over a strong baseline.
arXiv Detail & Related papers (2022-11-14T14:25:40Z)
- A Statistical Analysis of Summarization Evaluation Metrics using Resampling Methods [60.04142561088524]
We find that the confidence intervals are rather wide, demonstrating high uncertainty in how reliable automatic metrics truly are.
Although many metrics fail to show statistical improvements over ROUGE, two recent works, QAEval and BERTScore, do in some evaluation settings.
arXiv Detail & Related papers (2021-03-31T18:28:14Z)
- Neural Methods for Point-wise Dependency Estimation [129.93860669802046]
We focus on estimating point-wise dependency (PD), which quantitatively measures how likely two outcomes co-occur (a standard definition is given in the note after this list).
We demonstrate the effectiveness of our approaches in 1) MI estimation, 2) self-supervised representation learning, and 3) cross-modal retrieval tasks.
arXiv Detail & Related papers (2020-06-09T23:26:15Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
- Disentangled Representation Learning with Wasserstein Total Correlation [90.44329632061076]
We introduce Wasserstein total correlation in both variational autoencoder and Wasserstein autoencoder settings to learn disentangled latent representations.
A critic is adversarially trained along with the main objective to estimate the Wasserstein total correlation term.
We show that the proposed approach achieves comparable disentanglement performance with smaller sacrifices in reconstruction ability.
arXiv Detail & Related papers (2019-12-30T05:31:28Z)
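As referenced in the point-wise dependency entry above, a short note for completeness. The definition below is the standard one from the mutual-information literature, not a quotation from that paper:

$$\mathrm{PD}(x, y) = \frac{p(x, y)}{p(x)\,p(y)}, \qquad I(X; Y) = \mathbb{E}_{p(x, y)}\big[\log \mathrm{PD}(x, y)\big].$$

Mutual information is thus the expected log point-wise dependency, which is why PD estimators double as MI estimators.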
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.