MixRec: Individual and Collective Mixing Empowers Data Augmentation for Recommender Systems
- URL: http://arxiv.org/abs/2501.13579v2
- Date: Thu, 30 Jan 2025 00:29:20 GMT
- Title: MixRec: Individual and Collective Mixing Empowers Data Augmentation for Recommender Systems
- Authors: Yi Zhang, Yiwen Zhang,
- Abstract summary: We propose a novel Dual Mixing-based Recommendation Framework (MixRec) to empower data augmentation as we wish.
Experimental results on four real-world datasets demonstrate the advantages of MixRec in terms of effectiveness, simplicity, efficiency, and scalability.
- Score: 9.246794478152223
- License:
- Abstract: The core of the general recommender systems lies in learning high-quality embedding representations of users and items to investigate their positional relations in the feature space. Unfortunately, data sparsity caused by difficult-to-access interaction data severely limits the effectiveness of recommender systems. Faced with such a dilemma, various types of self-supervised learning methods have been introduced into recommender systems in an attempt to alleviate the data sparsity through distribution modeling or data augmentation. However, most data augmentation relies on elaborate manual design, which is not only not universal, but the bloated and redundant augmentation process may significantly slow down model training progress. To tackle these limitations, we propose a novel Dual Mixing-based Recommendation Framework (MixRec) to empower data augmentation as we wish. Specifically, we propose individual mixing and collective mixing, respectively. The former aims to provide a new positive sample that is unique to the target (user or item) and to make the pair-wise recommendation loss benefit from it, while the latter aims to portray a new sample that contains group properties in a batch. The two mentioned mixing mechanisms allow for data augmentation with only one parameter that does not need to be set multiple times and can be done in linear time complexity. Besides, we propose the dual-mixing contrastive learning to maximize the utilization of these new-constructed samples to enhance the consistency between pairs of positive samples. Experimental results on four real-world datasets demonstrate the advantages of MixRec in terms of effectiveness, simplicity, efficiency, and scalability.
Related papers
- A Systematic Examination of Preference Learning through the Lens of Instruction-Following [83.71180850955679]
We use a novel synthetic data generation pipeline to generate 48,000 instruction unique-following prompts.
With our synthetic prompts, we use two preference dataset curation methods - rejection sampling (RS) and Monte Carlo Tree Search (MCTS)
Experiments reveal that shared prefixes in preference pairs, as generated by MCTS, provide marginal but consistent improvements.
High-contrast preference pairs generally outperform low-contrast pairs; however, combining both often yields the best performance.
arXiv Detail & Related papers (2024-12-18T15:38:39Z) - MixRec: Heterogeneous Graph Collaborative Filtering [21.96510707666373]
We present a graph collaborative filtering model MixRec to disentangling users' multi-behavior interaction patterns.
Our model achieves this by incorporating intent disentanglement and multi-behavior modeling.
We also introduce a novel contrastive learning paradigm that adaptively explores the advantages of self-supervised data augmentation.
arXiv Detail & Related papers (2024-12-18T13:12:36Z) - ProxiMix: Enhancing Fairness with Proximity Samples in Subgroups [17.672299431705262]
Using linear mixup alone, a data augmentation technique, for bias mitigation, can still retain biases in dataset labels.
We propose a novel pre-processing strategy in which both an existing mixup method and our new bias mitigation algorithm can be utilized.
ProxiMix keeps both pairwise and proximity relationships for fairer data augmentation.
arXiv Detail & Related papers (2024-10-02T00:47:03Z) - DimeRec: A Unified Framework for Enhanced Sequential Recommendation via Generative Diffusion Models [39.49215596285211]
Sequential Recommendation (SR) plays a pivotal role in recommender systems by tailoring recommendations to user preferences based on their non-stationary historical interactions.
We propose a novel framework called DimeRec that combines a guidance extraction module (GEM) and a generative diffusion aggregation module (DAM)
Our numerical experiments demonstrate that DimeRec significantly outperforms established baseline methods across three publicly available datasets.
arXiv Detail & Related papers (2024-08-22T06:42:09Z) - DoubleMix: Simple Interpolation-Based Data Augmentation for Text
Classification [56.817386699291305]
This paper proposes a simple yet effective data augmentation approach termed DoubleMix.
DoubleMix first generates several perturbed samples for each training data.
It then uses the perturbed data and original data to carry out a two-step in the hidden space of neural models.
arXiv Detail & Related papers (2022-09-12T15:01:04Z) - Harnessing Hard Mixed Samples with Decoupled Regularizer [69.98746081734441]
Mixup is an efficient data augmentation approach that improves the generalization of neural networks by smoothing the decision boundary with mixed data.
In this paper, we propose an efficient mixup objective function with a decoupled regularizer named Decoupled Mixup (DM)
DM can adaptively utilize hard mixed samples to mine discriminative features without losing the original smoothness of mixup.
arXiv Detail & Related papers (2022-03-21T07:12:18Z) - SURF: Semi-supervised Reward Learning with Data Augmentation for
Feedback-efficient Preference-based Reinforcement Learning [168.89470249446023]
We present SURF, a semi-supervised reward learning framework that utilizes a large amount of unlabeled samples with data augmentation.
In order to leverage unlabeled samples for reward learning, we infer pseudo-labels of the unlabeled samples based on the confidence of the preference predictor.
Our experiments demonstrate that our approach significantly improves the feedback-efficiency of the preference-based method on a variety of locomotion and robotic manipulation tasks.
arXiv Detail & Related papers (2022-03-18T16:50:38Z) - Mixing Up Contrastive Learning: Self-Supervised Representation Learning
for Time Series [22.376529167056376]
We propose an unsupervised contrastive learning framework motivated from the perspective of label smoothing.
The proposed approach uses a novel contrastive loss that naturally exploits a data augmentation scheme.
Experiments demonstrate the framework's superior performance compared to other representation learning approaches.
arXiv Detail & Related papers (2022-03-17T11:49:21Z) - Contrastive Self-supervised Sequential Recommendation with Robust
Augmentation [101.25762166231904]
Sequential Recommendationdescribes a set of techniques to model dynamic user behavior in order to predict future interactions in sequential user data.
Old and new issues remain, including data-sparsity and noisy data.
We propose Contrastive Self-Supervised Learning for sequential Recommendation (CoSeRec)
arXiv Detail & Related papers (2021-08-14T07:15:25Z) - Leveraging Historical Interaction Data for Improving Conversational
Recommender System [105.90963882850265]
We propose a novel pre-training approach to integrate item- and attribute-based preference sequence.
Experiment results on two real-world datasets have demonstrated the effectiveness of our approach.
arXiv Detail & Related papers (2020-08-19T03:43:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.