SCONE: A Novel Stochastic Sampling to Generate Contrastive Views and Hard Negative Samples for Recommendation
- URL: http://arxiv.org/abs/2405.00287v2
- Date: Thu, 19 Dec 2024 05:48:08 GMT
- Title: SCONE: A Novel Stochastic Sampling to Generate Contrastive Views and Hard Negative Samples for Recommendation
- Authors: Chaejeong Lee, Jeongwhan Choi, Hyowon Wi, Sung-Bae Cho, Noseong Park
- Abstract summary: Graph-based collaborative filtering (CF) has emerged as a promising approach in recommender systems. Despite its achievements, graph-based CF models face challenges due to data sparsity and negative sampling. In this paper, we propose a novel Stochastic sampling for i) COntrastive views and ii) hard NEgative samples (SCONE) to overcome these issues.
- Score: 28.886714896469737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph-based collaborative filtering (CF) has emerged as a promising approach in recommender systems. Despite its achievements, graph-based CF models face challenges due to data sparsity and negative sampling. In this paper, we propose a novel Stochastic sampling for i) COntrastive views and ii) hard NEgative samples (SCONE) to overcome these issues. SCONE generates dynamic augmented views and diverse hard negative samples via a unified stochastic sampling approach based on score-based generative models. Our extensive experiments on 6 benchmark datasets show that SCONE consistently outperforms state-of-the-art baselines. SCONE shows efficacy in addressing user sparsity and item popularity issues, while enhancing performance for both cold-start users and long-tail items. Furthermore, our approach improves the diversity of the recommendation and the uniformity of the representations. The code is available at https://github.com/jeongwhanchoi/SCONE.
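The unified stochastic sampler is the heart of the method. As a rough illustration of the mechanics, and not the authors' code, the sketch below perturbs an embedding part-way along a diffusion process and integrates a reverse-time SDE back to t=0 with a learned score network; the network architecture, noise schedule, and step count are placeholder assumptions.

```python
# A minimal sketch of score-based stochastic resampling of embeddings, in the
# spirit of SCONE's unified sampler.  `ScoreNet`, the noise schedule, and
# `n_steps` are all assumptions made for the example.
import torch
import torch.nn as nn

class ScoreNet(nn.Module):
    """Stand-in score network s_theta(x, t) ~ grad_x log p_t(x)."""
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(), nn.Linear(hidden, dim)
        )

    def forward(self, x, t):
        # Condition on the diffusion time by concatenating it as a feature.
        return self.net(torch.cat([x, t.expand(x.size(0), 1)], dim=-1))

def stochastic_resample(emb, score_net, t_start=0.5, n_steps=20, sigma=1.0):
    """Noise an embedding part-way into the forward process, then run a
    reverse-time Euler-Maruyama sampler back to t=0.  Seeding with a user
    embedding yields an augmented contrastive view; seeding with a positive
    item embedding can serve to generate a hard negative."""
    x = emb + sigma * t_start * torch.randn_like(emb)   # partial forward noising
    dt = t_start / n_steps
    for i in range(n_steps):
        t = torch.tensor([t_start - i * dt])
        drift = sigma ** 2 * score_net(x, t)            # reverse-SDE drift term
        x = x + drift * dt + sigma * (dt ** 0.5) * torch.randn_like(x)
    return x

if __name__ == "__main__":
    torch.manual_seed(0)
    score_net = ScoreNet(dim=64)
    user_emb = torch.randn(8, 64)
    pos_item_emb = torch.randn(8, 64)
    view = stochastic_resample(user_emb, score_net)          # contrastive view
    hard_neg = stochastic_resample(pos_item_emb, score_net)  # hard negative
    print(view.shape, hard_neg.shape)
```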
Related papers
- Evaluating Performance and Bias of Negative Sampling in Large-Scale Sequential Recommendation Models [0.0]
Large-scale industrial recommendation models predict the most relevant items from catalogs containing millions or billions of options.
To train these models efficiently, a small set of irrelevant items (negative samples) is selected from the vast catalog for each relevant item.
Our study serves as a practical guide to the trade-offs in selecting a negative sampling method for large-scale sequential recommendation models.
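As context for what is being compared, here is a minimal sketch of the standard baseline, uniform negative sampling from a large catalog; the catalog size, K, and collision handling are illustrative choices rather than details from the paper.

```python
# Hedged sketch: draw K uniform negatives per positive from a huge catalog,
# resampling the rare draws that collide with the positive item.
import torch

def sample_uniform_negatives(pos_items, catalog_size, k=100):
    """pos_items: (B,) item ids.  Returns (B, k) negative item ids."""
    neg = torch.randint(0, catalog_size, (pos_items.size(0), k))
    collisions = neg == pos_items.unsqueeze(1)
    while collisions.any():                       # resample until collision-free
        neg[collisions] = torch.randint(0, catalog_size, (int(collisions.sum()),))
        collisions = neg == pos_items.unsqueeze(1)
    return neg

pos = torch.tensor([3, 7, 42])
print(sample_uniform_negatives(pos, catalog_size=1_000_000, k=5).shape)  # (3, 5)
```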
arXiv Detail & Related papers (2024-10-08T00:23:17Z) - Large Language Model Enhanced Hard Sample Identification for Denoising Recommendation [4.297249011611168]
Implicit feedback is often used to build recommender systems, but it inevitably contains noisy interactions.
Previous studies have attempted to alleviate this noise by identifying noisy samples based on their diverged patterns.
We propose a Large Language Model Enhanced Hard Sample Denoising framework.
arXiv Detail & Related papers (2024-09-16T14:57:09Z) - Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
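A hedged sketch of that mixing step might look as follows; the Beta prior on the mixing weight and the bias toward the minority point are assumptions made for the example, not the paper's exact scheme.

```python
# Illustrative sketch: synthesize minority samples by convexly mixing a
# minority point with a randomly chosen majority point.
import numpy as np

rng = np.random.default_rng(0)

def mix_minority_majority(x_min, x_maj, alpha=0.75):
    """Draw lam ~ Beta(alpha, alpha), bias it toward the minority side, and
    return synthetic samples on segments between minority and majority points."""
    lam = rng.beta(alpha, alpha, size=(x_min.shape[0], 1))
    lam = np.maximum(lam, 1 - lam)               # keep the mix closer to minority
    idx = rng.integers(0, x_maj.shape[0], size=x_min.shape[0])
    return lam * x_min + (1 - lam) * x_maj[idx]

x_minority = rng.normal(0, 1, (10, 4))
x_majority = rng.normal(3, 1, (200, 4))
print(mix_minority_majority(x_minority, x_majority).shape)  # (10, 4)
```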
arXiv Detail & Related papers (2023-08-28T18:48:34Z) - Deep Boosting Multi-Modal Ensemble Face Recognition with Sample-Level Weighting [11.39204323420108]
Deep convolutional neural networks have achieved remarkable success in face recognition.
The current training benchmarks exhibit an imbalanced quality distribution.
This poses issues for generalization on hard samples since they are underrepresented during training.
Inspired by the well-known AdaBoost, we propose a sample-level weighting approach to incorporate the importance of different samples into the FR loss.
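One plausible, simplified realization of such sample-level weighting follows; the exponential weighting rule and normalization are assumptions for illustration, not the paper's exact FR loss.

```python
# Sketch of AdaBoost-style sample weighting: samples the model currently gets
# wrong (higher loss) receive exponentially larger weights in the batch loss.
import torch
import torch.nn.functional as F

def weighted_ce_loss(logits, labels, beta=1.0):
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    with torch.no_grad():
        # Harder samples get larger weights, normalized so the batch mean is 1.
        w = torch.exp(beta * per_sample)
        w = w * (len(w) / w.sum())
    return (w * per_sample).mean()

logits = torch.randn(16, 10, requires_grad=True)
labels = torch.randint(0, 10, (16,))
loss = weighted_ce_loss(logits, labels)
loss.backward()
print(float(loss))
```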
arXiv Detail & Related papers (2023-08-18T01:44:54Z) - Challenging the Myth of Graph Collaborative Filtering: a Reasoned and Reproducibility-driven Analysis [50.972595036856035]
We present code that successfully replicates the results of six popular and recent graph recommendation models.
We compare these graph models with traditional collaborative filtering models that historically performed well in offline evaluations.
By investigating the information flow from users' neighborhoods, we aim to identify which models are influenced by intrinsic features in the dataset structure.
arXiv Detail & Related papers (2023-08-01T09:31:44Z) - Dimension Independent Mixup for Hard Negative Sample in Collaborative Filtering [36.26865960551565]
Negative sampling plays a vital role in training CF-based models with implicit feedback.
We propose Dimension Independent Mixup for Hard Negative Sampling (DINS), which is the first Area-wise sampling method for training CF-based models.
Our work contributes a new perspective, introduces Area-wise sampling, and presents DINS as a novel approach for negative sampling.
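A rough sketch of the dimension-wise ("area-wise") mixing idea follows; how the hard candidate is selected is simplified here to an inner-product hardness proxy, which is an assumption rather than DINS's exact procedure.

```python
# Unlike standard mixup, each embedding dimension draws its own mixing weight,
# so synthetic hard negatives cover a hyper-rectangle rather than a line segment.
import torch

def dimension_independent_mix(pos_emb, neg_embs, scores=None):
    """pos_emb: (B, D); neg_embs: (B, M, D) candidate negatives.
    Picks the hardest candidate per user, then mixes it with the positive
    using an independent U(0,1) weight per dimension."""
    if scores is None:                  # hardness proxy: inner product w/ positive
        scores = (neg_embs * pos_emb.unsqueeze(1)).sum(-1)          # (B, M)
    hard = neg_embs[torch.arange(len(pos_emb)), scores.argmax(-1)]  # (B, D)
    lam = torch.rand_like(pos_emb)      # one mixing weight per dimension
    return lam * hard + (1 - lam) * pos_emb

pos = torch.randn(4, 8)
cands = torch.randn(4, 16, 8)
print(dimension_independent_mix(pos, cands).shape)  # (4, 8)
```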
arXiv Detail & Related papers (2023-06-28T04:03:31Z) - Neighborhood-based Hard Negative Mining for Sequential Recommendation [14.66576013324401]
Negative sampling plays a crucial role in training successful sequential recommendation models.
We propose a Graph-based Negative sampling approach based on Neighborhood Overlap (GNNO) to exploit structural information hidden in user behaviors for negative mining.
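The sketch below illustrates one way overlap-weighted negative mining could look; the overlap measure (shared one-hop neighbors) and the softmax temperature are my assumptions, not GNNO's exact formulation.

```python
# Candidates whose graph neighborhoods overlap more with the target item are
# treated as harder negatives and sampled with higher probability.
import numpy as np

rng = np.random.default_rng(0)

def overlap_negative_sampling(adj, target, candidates, k=5, tau=1.0):
    """adj: (N, N) binary item-item adjacency; target: item id;
    candidates: array of candidate negative ids.  Samples k negatives with
    probability proportional to exp(overlap / tau)."""
    overlap = (adj[candidates] & adj[target]).sum(axis=1)   # shared neighbors
    p = np.exp(overlap / tau)
    p /= p.sum()
    return rng.choice(candidates, size=k, replace=False, p=p)

n_items = 50
adj = (rng.random((n_items, n_items)) < 0.1).astype(np.int8)
adj = adj | adj.T                                # make the graph undirected
cands = np.arange(1, n_items)
print(overlap_negative_sampling(adj, target=0, candidates=cands))
```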
arXiv Detail & Related papers (2023-06-12T12:28:54Z) - Sample and Predict Your Latent: Modality-free Sequential Disentanglement via Contrastive Estimation [2.7759072740347017]
We introduce a self-supervised sequential disentanglement framework based on contrastive estimation with no external signals.
In practice, we propose a unified, efficient, and easy-to-code sampling strategy for semantically similar and dissimilar views of the data.
Our method presents state-of-the-art results in comparison to existing techniques.
arXiv Detail & Related papers (2023-05-25T10:50:30Z) - Moment Matching Denoising Gibbs Sampling [14.75945343063504]
Energy-Based Models (EBMs) offer a versatile framework for modeling complex data distributions.
The widely-used Denoising Score Matching (DSM) method for scalable EBM training suffers from inconsistency issues.
We propose an efficient sampling framework: (pseudo)-Gibbs sampling with moment matching.
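To make the alternation concrete, here is a toy sketch of a denoising pseudo-Gibbs chain; the closed-form shrinkage denoiser and matched posterior variance assume the clean data is standard normal, purely for illustration, and are not the paper's model.

```python
# Alternate noising y ~ N(x, sigma^2 I) with a moment-matched Gaussian step
# x ~ N(mu(y), c), where mu(y) approximates E[x | y] from a DSM-trained model.
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.5

def denoiser_mean(y):
    # Placeholder for a learned E[x | y]; here: the exact posterior mean when
    # the clean data is N(0, 1), i.e. shrinkage toward zero.
    return y / (1.0 + sigma ** 2)

def pseudo_gibbs(n_steps=500, dim=2):
    x = rng.normal(size=dim)
    post_var = sigma ** 2 / (1.0 + sigma ** 2)      # matched second moment
    for _ in range(n_steps):
        y = x + sigma * rng.normal(size=dim)        # forward noising step
        x = denoiser_mean(y) + np.sqrt(post_var) * rng.normal(size=dim)
    return x

samples = np.array([pseudo_gibbs() for _ in range(200)])
print(samples.mean(0), samples.std(0))  # should hover near N(0, 1) statistics
```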
arXiv Detail & Related papers (2023-05-19T12:58:25Z) - Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
arXiv Detail & Related papers (2023-02-22T14:50:24Z) - Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning [54.61762276179205]
We propose a novel contrastive learning approach, MMBS, for building robust VQA models by Making the Most of Biased Samples.
Specifically, we construct positive samples for contrastive learning by eliminating the information related to spurious correlation from the original training samples.
We validate our contributions by achieving competitive performance on the OOD dataset VQA-CP v2 while preserving robust performance on the ID dataset VQA v2.
arXiv Detail & Related papers (2022-10-10T11:05:21Z) - Generating Negative Samples for Sequential Recommendation [83.60655196391855]
We propose to Generate Negative Samples (items) for Sequential Recommendation (SR).
A negative item is sampled at each time step based on the current SR model's learned user preferences toward items.
Experiments on four public datasets verify the importance of providing high-quality negative samples for SR.
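A minimal sketch of model-aware negative generation in this spirit, with softmax sampling and positive masking as assumed design choices:

```python
# At each time step, sample a negative in proportion to the current model's
# predicted preference, excluding the ground-truth next item, so negatives
# stay "hard" as the model improves.
import torch

def sample_model_aware_negatives(item_scores, gt_items, temperature=1.0):
    """item_scores: (B, N) current model scores over the catalog;
    gt_items: (B,) ground-truth next items to exclude."""
    scores = item_scores / temperature
    scores.scatter_(1, gt_items.unsqueeze(1), float("-inf"))  # mask positives
    probs = torch.softmax(scores, dim=-1)
    return torch.multinomial(probs, num_samples=1).squeeze(1)  # (B,)

scores = torch.randn(4, 1000)
gt = torch.tensor([5, 17, 999, 0])
print(sample_model_aware_negatives(scores, gt))
```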
arXiv Detail & Related papers (2022-08-07T05:44:13Z) - ReSmooth: Detecting and Utilizing OOD Samples when Training with Data Augmentation [57.38418881020046]
Recent data augmentation (DA) techniques pursue diversity in the augmented training samples.
An augmentation strategy that has a high diversity usually introduces out-of-distribution (OOD) augmented samples.
We propose ReSmooth, a framework that firstly detects OOD samples in augmented samples and then leverages them.
arXiv Detail & Related papers (2022-05-25T09:29:27Z) - Rethinking Sampling Strategies for Unsupervised Person Re-identification [59.47536050785886]
We analyze the reasons for the performance differences between various sampling strategies under the same framework and loss function.
Group sampling is proposed, which gathers samples from the same class into groups.
Experiments on Market-1501, DukeMTMC-reID and MSMT17 show that group sampling achieves performance comparable to state-of-the-art methods.
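Group sampling itself is simple to sketch; the group size and group-level shuffling below are illustrative assumptions.

```python
# Instead of fully random batches, samples sharing a (pseudo-)label are
# gathered into contiguous groups in the sampling order.
import random
from collections import defaultdict

def group_sampling(labels, group_size=4):
    """Returns an index order where each consecutive `group_size` block comes
    from a single class (the last partial group per class is kept)."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    groups = []
    for idxs in by_class.values():
        random.shuffle(idxs)
        groups += [idxs[i:i + group_size] for i in range(0, len(idxs), group_size)]
    random.shuffle(groups)                  # mix classes at the group level
    return [i for g in groups for i in g]

labels = [0, 1, 0, 2, 1, 0, 2, 2, 1, 0]
print(group_sampling(labels, group_size=2))
```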
arXiv Detail & Related papers (2021-07-07T05:39:58Z) - Doubly Contrastive Deep Clustering [135.7001508427597]
We present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive loss over both sample and class views.
Specifically, for the sample view, we set the class distribution of the original sample and its augmented version as positive sample pairs.
For the class view, we build the positive and negative pairs from the sample distribution of the class.
In this way, two contrastive losses successfully constrain the clustering results of mini-batch samples in both sample and class level.
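Under one reading of this setup, both losses can share a single InfoNCE routine applied to the rows (samples) and columns (classes) of the predicted class-probability matrix; the temperature and row normalization below are assumptions.

```python
# Sample view: contrast each sample's class distribution with its augmented
# version (rows of P).  Class view: contrast each class's distribution over
# the batch (columns of P).
import torch
import torch.nn.functional as F

def info_nce(a, b, tau=0.5):
    """Rows of `a` and `b` are positive pairs; all other rows are negatives."""
    a, b = F.normalize(a, dim=1), F.normalize(b, dim=1)
    logits = a @ b.t() / tau
    targets = torch.arange(a.size(0))
    return F.cross_entropy(logits, targets)

def doubly_contrastive_loss(p_orig, p_aug):
    sample_view = info_nce(p_orig, p_aug)            # contrast samples (rows)
    class_view = info_nce(p_orig.t(), p_aug.t())     # contrast classes (cols)
    return sample_view + class_view

p1 = torch.softmax(torch.randn(32, 10), dim=1)
p2 = torch.softmax(torch.randn(32, 10), dim=1)
print(float(doubly_contrastive_loss(p1, p2)))
```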
arXiv Detail & Related papers (2021-03-09T15:15:32Z) - Negative Data Augmentation [127.28042046152954]
We show that negative data augmentation samples provide information on the support of the data distribution.
We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator.
Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.
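A compact sketch of the objective, using a patch-shuffle as an example negative transform (this particular transform and the equal loss weighting are assumptions, not the paper's full recipe):

```python
# Negatively-augmented real samples are fed to the discriminator as *fake*,
# alongside generator samples, marking them as off the data manifold.
import torch
import torch.nn.functional as F

def patch_shuffle(x, patches=2):
    """Negative augmentation: shuffle a grid of patches, breaking global structure."""
    b, c, h, w = x.shape
    ph, pw = h // patches, w // patches
    tiles = x.unfold(2, ph, ph).unfold(3, pw, pw)    # (b, c, p, p, ph, pw)
    tiles = tiles.reshape(b, c, patches * patches, ph, pw)
    perm = torch.randperm(patches * patches)
    tiles = tiles[:, :, perm].reshape(b, c, patches, patches, ph, pw)
    return tiles.permute(0, 1, 2, 4, 3, 5).reshape(b, c, h, w)

def d_loss_with_nda(d, real, fake):
    """Real data -> label 1; generator samples and NDA-transformed real -> label 0."""
    neg = patch_shuffle(real)
    loss_real = F.binary_cross_entropy_with_logits(d(real), torch.ones(len(real), 1))
    loss_fake = F.binary_cross_entropy_with_logits(d(fake), torch.zeros(len(fake), 1))
    loss_nda = F.binary_cross_entropy_with_logits(d(neg), torch.zeros(len(neg), 1))
    return loss_real + loss_fake + loss_nda

d = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 1))
real, fake = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)
print(float(d_loss_with_nda(d, real, fake)))
```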
arXiv Detail & Related papers (2021-02-09T20:28:35Z) - Oops I Took A Gradient: Scalable Sampling for Discrete Distributions [53.3142984019796]
We show that this gradient-based proposal outperforms generic samplers in a number of difficult settings.
We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data.
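For intuition, a minimal gradient-informed flip sampler on a toy Ising-like energy might look as follows; the energy function and all hyperparameters are stand-ins, not the paper's setup.

```python
# A first-order Taylor estimate of the log-probability change of each bit-flip
# builds the proposal, accepted with a Metropolis-Hastings correction.
import torch

torch.manual_seed(0)
J = torch.randn(16, 16) * 0.1
J = (J + J.t()) / 2

def energy(x):                      # toy energy; lower energy = more probable
    return -(x @ J @ x)

def flip_proposal(x):
    """Returns softmax proposal over which bit to flip, via autograd."""
    x = x.clone().requires_grad_(True)
    (grad,) = torch.autograd.grad(energy(x), x)
    with torch.no_grad():
        d = (1 - 2 * x) * (-grad)   # 1st-order log-prob change per flip
        return torch.softmax(d / 2, dim=0)

def gwg_step(x):
    q_fwd = flip_proposal(x)
    i = torch.multinomial(q_fwd, 1)
    x_new = x.clone()
    x_new[i] = 1 - x_new[i]
    q_rev = flip_proposal(x_new)    # reverse proposal, for the MH correction
    log_accept = (energy(x) - energy(x_new)
                  + torch.log(q_rev[i]) - torch.log(q_fwd[i]))
    return x_new if torch.rand(1).log() < log_accept else x

x = torch.randint(0, 2, (16,)).float()
for _ in range(100):
    x = gwg_step(x)
print(x)
```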
arXiv Detail & Related papers (2021-02-08T20:08:50Z) - Multi-Scale Positive Sample Refinement for Few-Shot Object Detection [61.60255654558682]
Few-shot object detection (FSOD) helps detectors adapt to unseen classes with few training instances.
We propose a Multi-scale Positive Sample Refinement (MPSR) approach to enrich object scales in FSOD.
MPSR generates multi-scale positive samples as object pyramids and refines the prediction at various scales.
arXiv Detail & Related papers (2020-07-18T09:48:29Z) - Sampler Design for Implicit Feedback Data by Noisy-label Robust Learning [32.76804332450971]
We design an adaptive sampler based on noisy-label robust learning for implicit feedback data.
We predict users' preferences with the model and train it by maximizing the likelihood of the observed labels.
We then consider the risk of these noisy labels, and propose a Noisy-label Robust BPO.
arXiv Detail & Related papers (2020-06-28T05:31:53Z)