Stochastic Sampling for Contrastive Views and Hard Negative Samples in Graph-based Collaborative Filtering
- URL: http://arxiv.org/abs/2405.00287v1
- Date: Wed, 1 May 2024 02:27:59 GMT
- Title: Stochastic Sampling for Contrastive Views and Hard Negative Samples in Graph-based Collaborative Filtering
- Authors: Chaejeong Lee, Jeongwhan Choi, Hyowon Wi, Sung-Bae Cho, Noseong Park
- Abstract summary: Graph-based collaborative filtering (CF) has emerged as a promising approach in recommendation systems.
Despite its achievements, graph-based CF models face challenges due to data sparsity and negative sampling.
We propose a novel Stochastic sampling method for i) COntrastive views and ii) hard NEgative samples (SCONE) to overcome these issues.
- Score: 28.886714896469737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph-based collaborative filtering (CF) has emerged as a promising approach in recommendation systems. Despite its achievements, graph-based CF models face challenges due to data sparsity and negative sampling. In this paper, we propose a novel Stochastic sampling method for i) COntrastive views and ii) hard NEgative samples (SCONE) to overcome these issues. Observing that both are sampling tasks, we generate dynamic augmented views and diverse hard negative samples via a unified stochastic sampling framework based on score-based generative models. In comprehensive evaluations on 6 benchmark datasets, SCONE significantly improves recommendation accuracy and robustness over existing CF models. Furthermore, we demonstrate the efficacy of user-item specific stochastic sampling for addressing user sparsity and item popularity issues. The integration of stochastic sampling and graph-based CF achieves state-of-the-art performance in personalized recommendation.
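The abstract describes the mechanism only at a high level. As a minimal sketch of the general recipe it builds on, a score-based model can stochastically resample user/item embeddings, where light perturbation yields augmented contrastive views and heavier perturbation yields harder negatives. Everything below (the `ScoreNet` architecture, Langevin sampler, step counts and sizes) is an illustrative assumption, not the authors' implementation; SCONE's actual framework is built on score-based generative models more generally.

```python
# Illustrative sketch (not the authors' code): stochastic resampling of CF
# embeddings with a score network via unadjusted Langevin dynamics. The same
# sampler produces perturbed "views" for contrastive learning and, with more
# steps, synthetic hard negatives.
import torch
import torch.nn as nn

class ScoreNet(nn.Module):
    """Predicts the score (gradient of log-density) of an embedding."""
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, e: torch.Tensor) -> torch.Tensor:
        return self.net(e)

def langevin_resample(score_net: nn.Module, emb: torch.Tensor,
                      n_steps: int = 10, step_size: float = 1e-3) -> torch.Tensor:
    """Perturb embeddings with a few Langevin steps guided by the score net.

    Few steps give a stochastic view close to the input; more steps (or a
    larger step size) drift further, one way to get diverse hard negatives.
    """
    e = emb.clone()
    for _ in range(n_steps):
        noise = torch.randn_like(e)
        e = e + step_size * score_net(e) + (2 * step_size) ** 0.5 * noise
    return e.detach()

# Usage: two stochastic contrastive views plus harder negatives.
dim = 64
score_net = ScoreNet(dim)
user_emb = torch.randn(256, dim)   # stand-in for learned CF embeddings
view_a = langevin_resample(score_net, user_emb)
view_b = langevin_resample(score_net, user_emb)
hard_negs = langevin_resample(score_net, user_emb, n_steps=50)
```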
Related papers
- Large Language Model Enhanced Hard Sample Identification for Denoising Recommendation [4.297249011611168]
Implicit feedback is often used to build recommender systems, but it inevitably contains noisy interactions.
Previous studies have attempted to alleviate this noise by identifying noisy samples through their divergent patterns, such as persistently high loss values.
We propose a Large Language Model Enhanced Hard Sample Denoising framework.
arXiv Detail & Related papers (2024-09-16T14:57:09Z)
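The summary names the mechanism but not the procedure. One common way to operationalize "divergent patterns" is to track per-sample losses over training and flag samples that stay anomalously high. A hedged sketch follows; the z-score rule is an assumption, and the paper's framework additionally consults an LLM, which is omitted here.

```python
# Illustrative sketch of flagging potentially noisy samples from their loss
# trajectories. Threshold and names are assumptions, not the paper's method.
import numpy as np

def flag_suspect_samples(loss_history: np.ndarray, z_thresh: float = 2.0) -> np.ndarray:
    """loss_history: (n_epochs, n_samples) per-sample training losses.

    Returns a boolean mask of samples whose mean loss sits more than
    `z_thresh` standard deviations above the population mean, a crude
    proxy for the "divergent pattern" heuristic.
    """
    mean_loss = loss_history.mean(axis=0)  # per-sample average over epochs
    z = (mean_loss - mean_loss.mean()) / (mean_loss.std() + 1e-8)
    return z > z_thresh

losses = np.abs(np.random.randn(20, 1000))  # fake loss traces
print(flag_suspect_samples(losses).sum(), "samples flagged as suspect")
```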
- Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
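A minimal sketch of the class-mixing idea above: synthesize minority-class points by interpolating minority and majority samples, with the mixing weight skewed toward the minority side. The Beta-distribution scheme and the 0.5 floor are assumptions, not the paper's exact procedure.

```python
# Illustrative mixup between minority and majority samples to synthesize
# additional minority-class points.
import numpy as np

def mix_minority_majority(x_min, x_maj, n_new, alpha=0.75, rng=None):
    rng = rng or np.random.default_rng(0)
    i = rng.integers(0, len(x_min), size=n_new)
    j = rng.integers(0, len(x_maj), size=n_new)
    # Skew lambda toward the minority sample so synthetic points stay on
    # the minority side of the decision boundary.
    lam = np.maximum(rng.beta(alpha, alpha, size=(n_new, 1)), 0.5)
    return lam * x_min[i] + (1.0 - lam) * x_maj[j]

x_minority = np.random.randn(30, 8)
x_majority = np.random.randn(500, 8)
synthetic = mix_minority_majority(x_minority, x_majority, n_new=100)
print(synthetic.shape)  # (100, 8)
```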
- Deep Boosting Multi-Modal Ensemble Face Recognition with Sample-Level Weighting [11.39204323420108]
Deep convolutional neural networks have achieved remarkable success in face recognition.
The current training benchmarks exhibit an imbalanced quality distribution.
This poses issues for generalization on hard samples since they are underrepresented during training.
Inspired by the well-known AdaBoost, we propose a sample-level weighting approach to incorporate the importance of different samples into the face recognition (FR) loss.
arXiv Detail & Related papers (2023-08-18T01:44:54Z)
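A minimal sketch of AdaBoost-inspired sample-level weighting: up-weight high-loss (hard) samples inside the classification loss. The softmax weighting and temperature are assumptions; the paper defines its own weighting within the FR loss.

```python
# Illustrative sample-level weighting: samples with higher current loss get
# larger weights, so hard examples contribute more to the update.
import torch
import torch.nn.functional as F

def weighted_ce_loss(logits: torch.Tensor, labels: torch.Tensor,
                     temperature: float = 1.0) -> torch.Tensor:
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    # Exponential up-weighting of hard samples, normalized so the weights
    # average to 1 across the batch; detached so weights carry no gradient.
    w = torch.softmax(per_sample.detach() / temperature, dim=0) * len(per_sample)
    return (w * per_sample).mean()

logits = torch.randn(32, 10, requires_grad=True)
labels = torch.randint(0, 10, (32,))
loss = weighted_ce_loss(logits, labels)
loss.backward()
print(float(loss))
```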
- Challenging the Myth of Graph Collaborative Filtering: a Reasoned and Reproducibility-driven Analysis [50.972595036856035]
We present code that successfully replicates results from six popular and recent graph recommendation models.
We compare these graph models with traditional collaborative filtering models that historically performed well in offline evaluations.
By investigating the information flow from users' neighborhoods, we aim to identify which models are influenced by intrinsic features in the dataset structure.
arXiv Detail & Related papers (2023-08-01T09:31:44Z)
- Dimension Independent Mixup for Hard Negative Sample in Collaborative Filtering [36.26865960551565]
Negative sampling plays a vital role in training CF-based models with implicit feedback.
We propose Dimension Independent Mixup for Hard Negative Sampling (DINS), which is the first Area-wise sampling method for training CF-based models.
Our work contributes a new perspective, introduces Area-wise sampling, and presents DINS as a novel approach for negative sampling.
arXiv Detail & Related papers (2023-06-28T04:03:31Z)
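A minimal sketch of dimension-wise mixing for hard negatives: each sampled negative inherits a random subset of the positive item's embedding dimensions, pulling it toward the decision boundary. The Bernoulli masking and `p_mix` value are assumptions, not the exact DINS algorithm.

```python
# Illustrative dimension-wise mixing of a positive item embedding into
# sampled negatives to synthesize harder negatives.
import torch

def dimension_mixed_negative(pos: torch.Tensor, negs: torch.Tensor,
                             p_mix: float = 0.3) -> torch.Tensor:
    """pos: (d,) positive item embedding; negs: (k, d) negative candidates.

    Each negative independently inherits a random subset of the positive's
    dimensions.
    """
    mask = (torch.rand_like(negs) < p_mix).float()  # (k, d) per-dimension
    return mask * pos.unsqueeze(0) + (1.0 - mask) * negs

pos = torch.randn(64)
negs = torch.randn(8, 64)
hard = dimension_mixed_negative(pos, negs)
print(hard.shape)  # torch.Size([8, 64])
```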
- Moment Matching Denoising Gibbs Sampling [14.75945343063504]
Energy-Based Models (EBMs) offer a versatile framework for modeling complex data distributions.
The widely-used Denoising Score Matching (DSM) method for scalable EBM training suffers from inconsistency issues.
We propose an efficient sampling framework: (pseudo)-Gibbs sampling with moment matching.
arXiv Detail & Related papers (2023-05-19T12:58:25Z)
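The denoising-Gibbs idea above can be sketched for a model trained with denoising score matching: alternate adding noise with a Gaussian approximation of the denoising posterior, whose mean follows from Tweedie's formula, E[x|y] = y + sigma^2 * score(y). The fixed posterior scale `tau` is a simplifying assumption; the paper matches the posterior moments properly.

```python
# Illustrative (pseudo-)Gibbs sweep with moment matching for a DSM-trained
# model: y ~ N(x, sigma^2 I), then x from a Gaussian approximation of
# p(x | y) centered at the Tweedie posterior mean.
import torch

def gibbs_sweep(score_fn, x: torch.Tensor, sigma: float,
                tau: float, n_sweeps: int = 100) -> torch.Tensor:
    for _ in range(n_sweeps):
        y = x + sigma * torch.randn_like(x)       # sample y | x
        mean = y + sigma ** 2 * score_fn(y)        # Tweedie posterior mean
        x = mean + tau * torch.randn_like(x)       # Gaussian approx of x | y
    return x

# Toy check with a standard-normal target: the noisy marginal is
# N(0, (1 + sigma^2) I), so score(y) = -y / (1 + sigma^2), and the exact
# posterior std is sigma / sqrt(1 + sigma^2); samples should be ~N(0, I).
sigma = 0.5
samples = gibbs_sweep(lambda y: -y / (1 + sigma ** 2),
                      torch.randn(5000, 2), sigma=sigma,
                      tau=sigma / (1 + sigma ** 2) ** 0.5)
print(samples.mean(dim=0), samples.std(dim=0))  # roughly 0 and 1
```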
- Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the accuracy of identifying bias-conflicting samples.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
arXiv Detail & Related papers (2023-02-22T14:50:24Z)
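A minimal sketch of gradient-alignment-style rebalancing: rescale the (typically scarce) bias-conflicting group's loss so its gradient contribution matches the bias-aligned group's. Using the ratio of gradient norms as the scale is an assumed simplification of the paper's GA.

```python
# Illustrative rebalancing of two sample groups by aligning their gradient
# magnitudes; the scale is computed from detached gradient norms.
import torch
import torch.nn as nn
import torch.nn.functional as F

def balanced_loss(model: nn.Module, x_align, y_align, x_conf, y_conf):
    loss_a = F.cross_entropy(model(x_align), y_align)
    loss_c = F.cross_entropy(model(x_conf), y_conf)
    params = [p for p in model.parameters() if p.requires_grad]
    g_a = torch.cat([g.flatten() for g in
                     torch.autograd.grad(loss_a, params, retain_graph=True)])
    g_c = torch.cat([g.flatten() for g in
                     torch.autograd.grad(loss_c, params, retain_graph=True)])
    # Up-weight the bias-conflicting group so its gradient magnitude
    # matches the bias-aligned group's.
    scale = (g_a.norm() / (g_c.norm() + 1e-8)).detach()
    return loss_a + scale * loss_c

model = nn.Linear(16, 4)
loss = balanced_loss(model,
                     torch.randn(64, 16), torch.randint(0, 4, (64,)),
                     torch.randn(8, 16), torch.randint(0, 4, (8,)))
loss.backward()
print(float(loss))
```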
- Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning [54.61762276179205]
We propose a novel contrastive learning approach, MMBS, for building robust VQA models by Making the Most of Biased Samples.
Specifically, we construct positive samples for contrastive learning by eliminating the information related to spurious correlation from the original training samples.
We validate our contributions by achieving competitive performance on the OOD dataset VQA-CP v2 while preserving robust performance on the ID dataset VQA v2.
arXiv Detail & Related papers (2022-10-10T11:05:21Z)
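A minimal sketch of the contrastive side of such an approach: a standard InfoNCE-style loss where each anchor's positive is the representation of its debiased variant. How those positives are constructed (removing spurious-correlation cues from the training sample) is the paper's contribution and is not reproduced here.

```python
# Illustrative InfoNCE-style loss over constructed positives: anchor i's
# positive is row i of `positive`; all other rows serve as in-batch negatives.
import torch
import torch.nn.functional as F

def contrastive_loss(anchor: torch.Tensor, positive: torch.Tensor,
                     temperature: float = 0.1) -> torch.Tensor:
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature   # (B, B) cosine-similarity matrix
    targets = torch.arange(len(a))      # diagonal entries are the positives
    return F.cross_entropy(logits, targets)

anchor = torch.randn(32, 128, requires_grad=True)    # original sample reps
positive = torch.randn(32, 128)                       # debiased-variant reps
loss = contrastive_loss(anchor, positive)
loss.backward()
print(float(loss))
```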
- Negative Data Augmentation [127.28042046152954]
We show that negative data augmentation samples provide information on the support of the data distribution.
We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator.
Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.
arXiv Detail & Related papers (2021-02-09T20:28:35Z)
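A minimal sketch of the NDA discriminator objective: negative augmentations of real images, here a jigsaw-style patch shuffle (one of the transforms the paper considers), are pushed to the fake side alongside generator samples. The tiny linear discriminator is a stand-in.

```python
# Illustrative NDA objective: out-of-support transforms of real images are
# labeled fake, shaping the discriminator's notion of the data support.
import torch
import torch.nn.functional as F

def jigsaw_shuffle(x: torch.Tensor, grid: int = 2) -> torch.Tensor:
    """Shuffle a (B, C, H, W) batch's grid x grid patches
    (same permutation for the whole batch, for brevity)."""
    b, c, h, w = x.shape
    ph, pw = h // grid, w // grid
    patches = x.unfold(2, ph, ph).unfold(3, pw, pw)   # (B, C, g, g, ph, pw)
    patches = patches.reshape(b, c, grid * grid, ph, pw)
    patches = patches[:, :, torch.randperm(grid * grid)]
    patches = patches.reshape(b, c, grid, grid, ph, pw)
    return patches.permute(0, 1, 2, 4, 3, 5).reshape(b, c, h, w)

def d_loss(d, real: torch.Tensor, fake: torch.Tensor) -> torch.Tensor:
    neg = jigsaw_shuffle(real)                        # NDA samples
    ones, zeros = torch.ones(len(real), 1), torch.zeros(len(real), 1)
    return (F.binary_cross_entropy_with_logits(d(real), ones)
            + F.binary_cross_entropy_with_logits(d(fake), zeros)
            + F.binary_cross_entropy_with_logits(d(neg), zeros))

d = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 1))
real, fake = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)
print(float(d_loss(d, real, fake)))
```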
- Oops I Took A Gradient: Scalable Sampling for Discrete Distributions [53.3142984019796]
We propose using gradients of the likelihood function with respect to a model's discrete inputs to propose Metropolis-Hastings updates, and show that this approach outperforms generic samplers in a number of difficult settings.
We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data.
arXiv Detail & Related papers (2021-02-08T20:08:50Z)
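The proposal the summary alludes to can be sketched for binary variables: a first-order Taylor estimate of the log-probability change from flipping each bit drives a softmax proposal over which coordinate to flip, corrected by a Metropolis-Hastings step. This follows the Gibbs-With-Gradients construction; the toy target below is purely illustrative.

```python
# Gradient-informed flip proposal for binary x in {0,1}^d, with MH correction.
import torch

def gwg_step(log_prob, x: torch.Tensor) -> torch.Tensor:
    """One sampler step. x: (d,) float tensor of 0/1 values."""
    x = x.detach().requires_grad_(True)
    fx = log_prob(x)
    (grad,) = torch.autograd.grad(fx, x)
    delta = (1.0 - 2.0 * x.detach()) * grad           # est. log-prob change per flip
    q_fwd = torch.softmax(delta / 2.0, dim=0)
    i = torch.multinomial(q_fwd, 1).item()
    x_new = x.detach().clone()
    x_new[i] = 1.0 - x_new[i]
    # Reverse proposal probability for the MH acceptance ratio.
    x2 = x_new.clone().requires_grad_(True)
    fx2 = log_prob(x2)
    (grad2,) = torch.autograd.grad(fx2, x2)
    q_rev = torch.softmax((1.0 - 2.0 * x_new) * grad2 / 2.0, dim=0)
    log_alpha = fx2.detach() - fx.detach() + q_rev[i].log() - q_fwd[i].log()
    return x_new if torch.rand(()) < log_alpha.exp() else x.detach()

# Toy target: independent bits with logits theta, log p(x) = x . theta + const.
theta = torch.tensor([2.0, -1.0, 0.5, -2.0])
x = torch.bernoulli(torch.full((4,), 0.5))
for _ in range(500):
    x = gwg_step(lambda v: (v * theta).sum(), x)
print(x)  # bits with positive logits should usually be 1
```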
- Sampler Design for Implicit Feedback Data by Noisy-label Robust Learning [32.76804332450971]
We design an adaptive sampler based on noisy-label robust learning for implicit feedback data.
We predict users' preferences with the model and learn it by maximizing the likelihood of observed data labels.
We then consider the risk of these noisy labels and propose a Noisy-label Robust BPO.
arXiv Detail & Related papers (2020-06-28T05:31:53Z)
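A hedged sketch of what an adaptive, noisy-label-aware sampler could look like: negatives are drawn in proportion to the model's predicted preference (harder negatives), and a point-wise likelihood treats each observed label as correct only with some probability, bounding the loss on mislabeled interactions. The noise-mixture loss and all names here are assumptions, not the paper's BPO objective.

```python
# Illustrative adaptive negative sampling plus a noise-aware point-wise loss.
import torch

def sample_negatives(scores: torch.Tensor, observed: torch.Tensor,
                     k: int = 5) -> torch.Tensor:
    """scores: (n_items,) predicted preferences; observed: bool mask.
    Unobserved items are sampled in proportion to predicted preference."""
    probs = torch.softmax(scores.masked_fill(observed, float("-inf")), dim=0)
    return torch.multinomial(probs, k, replacement=False)

def robust_pointwise_loss(pos_scores, neg_scores, noise_rate: float = 0.1):
    # Treat each label as correct with probability (1 - noise_rate); the
    # resulting mixture likelihood bounds the loss on mislabeled samples.
    p_pos = torch.sigmoid(pos_scores) * (1 - noise_rate) + 0.5 * noise_rate
    p_neg = (1 - torch.sigmoid(neg_scores)) * (1 - noise_rate) + 0.5 * noise_rate
    return -(torch.log(p_pos).mean() + torch.log(p_neg).mean())

scores = torch.randn(100, requires_grad=True)   # model scores for 100 items
observed = torch.zeros(100, dtype=torch.bool)
observed[:10] = True                             # 10 observed interactions
negs = sample_negatives(scores.detach(), observed)
loss = robust_pointwise_loss(scores[observed], scores[negs])
loss.backward()
print(float(loss))
```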