Personalized Denoising Implicit Feedback for Robust Recommender System
- URL: http://arxiv.org/abs/2502.00348v1
- Date: Sat, 01 Feb 2025 07:13:06 GMT
- Title: Personalized Denoising Implicit Feedback for Robust Recommender System
- Authors: Kaike Zhang, Qi Cao, Yunfan Wu, Fei Sun, Huawei Shen, Xueqi Cheng
- Abstract summary: We show that for a given user, there is a clear distinction between normal and noisy interactions in the user's personal loss distribution.
We propose a resampling strategy to Denoise using the user's Personal Loss distribution, named PLD, which reduces the probability of noisy interactions being optimized.
- Score: 60.719158008403376
- Abstract: While implicit feedback is foundational to modern recommender systems, factors such as human error, uncertainty, and ambiguity in user behavior inevitably introduce significant noise into this feedback, adversely affecting the accuracy and robustness of recommendations. To address this issue, existing methods typically aim to reduce the training weight of noisy feedback or discard it entirely, based on the observation that noisy interactions often exhibit higher losses in the overall loss distribution. However, we identify two key issues: (1) there is a significant overlap between normal and noisy interactions in the overall loss distribution, and (2) this overlap becomes even more pronounced when transitioning from pointwise loss functions (e.g., BCE loss) to pairwise loss functions (e.g., BPR loss). This overlap leads traditional methods to misclassify noisy interactions as normal, and vice versa. To tackle these challenges, we further investigate the loss overlap and find that for a given user, there is a clear distinction between normal and noisy interactions in the user's personal loss distribution. Based on this insight, we propose a resampling strategy to Denoise using the user's Personal Loss distribution, named PLD, which reduces the probability of noisy interactions being optimized. Specifically, during each optimization iteration, we create a candidate item pool for each user and resample the items from this pool based on the user's personal loss distribution, prioritizing normal interactions. Additionally, we conduct a theoretical analysis to validate PLD's effectiveness and suggest ways to further enhance its performance. Extensive experiments conducted on three datasets with varying noise ratios demonstrate PLD's efficacy and robustness.
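The abstract describes PLD only at a high level and this listing carries no code, but the resampling step can be illustrated. Below is a minimal sketch under stated assumptions: the function name personal_resample, the softmax-over-negative-losses weighting, and the temperature hyperparameter are illustrative choices, not the authors' implementation.

```python
import numpy as np

def personal_resample(user_losses, n_samples, temperature=1.0, rng=None):
    """Resample candidate interactions for a single user from that user's
    personal loss distribution, so that low-loss (likely normal) interactions
    are optimized more often than high-loss (likely noisy) ones."""
    rng = rng or np.random.default_rng()
    losses = np.asarray(user_losses, dtype=float)
    logits = -losses / temperature          # lower loss -> higher probability
    logits -= logits.max()                  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return rng.choice(len(losses), size=n_samples, p=probs)

# Toy example: one user's candidate pool of five interactions; the last one
# has a conspicuously high loss within THIS user's distribution, so it is
# rarely drawn for the next optimization step.
losses = [0.20, 0.30, 0.25, 0.40, 3.00]
picked = personal_resample(losses, n_samples=3, rng=np.random.default_rng(0))
```

The point the abstract stresses is that the weighting is computed within each user's own loss distribution rather than the overall one, since normal and noisy interactions overlap globally but separate cleanly per user.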
Related papers
- Collaborative Diffusion Model for Recommender System [52.56609747408617]
We present a Collaborative Diffusion model for Recommender System (CDiff4Rec).
CDiff4Rec generates pseudo-users from item features and leverages collaborative signals from both real and pseudo personalized neighbors.
Experimental results on three public datasets show that CDiff4Rec outperforms competitors by effectively mitigating the loss of personalized information.
arXiv Detail & Related papers (2025-01-31T10:05:01Z)
- DeBaTeR: Denoising Bipartite Temporal Graph for Recommendation [38.87538556340487]
We introduce a simple yet effective mechanism for generating time-aware user/item embeddings.
We propose two strategies for denoising bipartite temporal graphs in recommender systems.
arXiv Detail & Related papers (2024-11-14T04:39:30Z)
- Large Language Model Enhanced Hard Sample Identification for Denoising Recommendation [4.297249011611168]
Implicit feedback is often used to build recommender systems, but it is prone to noise.
Previous studies have attempted to alleviate this by identifying noisy samples based on their divergent patterns.
We propose a Large Language Model Enhanced Hard Sample Denoising framework.
arXiv Detail & Related papers (2024-09-16T14:57:09Z)
- A Huber Loss Minimization Approach to Mean Estimation under User-level Differential Privacy [32.38935276997549]
Protecting the privacy of a user's entire contribution of samples is important in distributed systems.
We propose a Huber loss minimization approach to mean estimation under user-level differential privacy (the basic Huber mean estimator is sketched after this entry).
We provide a theoretical analysis of our approach, which gives the noise strength needed for privacy protection, as well as the bound of mean squared error.
arXiv Detail & Related papers (2024-05-22T08:46:45Z)
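For context on the entry above: the sketch below shows only the generic Huber-loss mean estimator, i.e. the robust-statistics core the title refers to; the paper's actual contribution adds user-level differential privacy (calibrated noise), which is omitted here, and the step size and delta are assumed values.

```python
import numpy as np

def huber_mean(x, delta=1.0, lr=0.1, steps=500):
    """Robust mean: argmin_theta sum_i H_delta(x_i - theta). H_delta is
    quadratic for small residuals and linear for large ones, so its gradient
    w.r.t. the residual is just the residual clipped to [-delta, delta]."""
    x = np.asarray(x, dtype=float)
    theta = float(np.median(x))                        # robust starting point
    for _ in range(steps):
        theta += lr * np.clip(x - theta, -delta, delta).mean()
    return theta

# Two gross outliers shift the ordinary mean but barely move the Huber estimate.
x = np.concatenate([np.random.default_rng(0).normal(0.0, 1.0, 1000), [50.0, 60.0]])
print(float(x.mean()), huber_mean(x))
```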
- Double Correction Framework for Denoising Recommendation [45.98207284259792]
In implicit feedback, noisy samples can affect precise user preference learning.
A popular solution is based on dropping noisy samples in the model training phase (a sketch of this large-loss dropping baseline follows this entry).
We propose a Double Correction Framework for Denoising Recommendation.
arXiv Detail & Related papers (2024-05-18T12:15:10Z)
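The "dropping noisy samples" baseline the entry above builds on is commonly implemented by discarding the largest-loss fraction of each training batch. The sketch below shows that baseline only, not the Double Correction method itself (whose corrections are not detailed in this listing); drop_rate is an assumed hyperparameter.

```python
import numpy as np

def keep_small_loss_mask(losses, drop_rate):
    """Boolean mask selecting the (1 - drop_rate) fraction of samples with the
    smallest loss; the large-loss remainder is treated as noise and excluded
    from the gradient update."""
    losses = np.asarray(losses, dtype=float)
    k = int(len(losses) * (1.0 - drop_rate))
    mask = np.zeros(len(losses), dtype=bool)
    mask[np.argsort(losses)[:k]] = True
    return mask

# Per batch: compute per-sample losses, then backpropagate only through the
# kept subset.
batch_losses = [0.1, 2.5, 0.3, 0.2, 1.9]
mask = keep_small_loss_mask(batch_losses, drop_rate=0.4)   # keeps the 3 smallest
```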
- ROPO: Robust Preference Optimization for Large Language Models [59.10763211091664]
We propose an iterative alignment approach that integrates noise-tolerance and filtering of noisy samples without the aid of external models.
Experiments on three widely-used datasets with Mistral-7B and Llama-2-7B demonstrate that ROPO significantly outperforms existing preference alignment methods.
arXiv Detail & Related papers (2024-04-05T13:58:51Z)
- Optimizing the Noise in Self-Supervised Learning: from Importance Sampling to Noise-Contrastive Estimation [80.07065346699005]
It is widely assumed that the optimal noise distribution should be made equal to the data distribution, as in Generative Adversarial Networks (GANs).
We turn to Noise-Contrastive Estimation, which grounds this self-supervised task as an estimation problem of an energy-based model of the data (a minimal binary NCE objective is sketched below).
We soberly conclude that the optimal noise may be hard to sample from, and the gain in efficiency can be modest compared to choosing the noise distribution equal to the data's.
arXiv Detail & Related papers (2023-01-23T19:57:58Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the common assumption that the noise distribution should equal the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
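To make the two noise-contrastive entries above concrete, here is a minimal binary NCE objective with one noise sample per data point; the four log-density arrays are assumed inputs, and the choice of the noise density p_noise is exactly the design question these two papers study.

```python
import numpy as np

def log_sigmoid(z):
    return -np.logaddexp(0.0, -z)          # numerically stable log(sigmoid(z))

def binary_nce_loss(log_model_data, log_noise_data,
                    log_model_noise, log_noise_noise):
    """Binary NCE: train a classifier to separate data samples from noise
    samples using the log-density ratio log p_model(x) - log p_noise(x);
    data points should score high, noise points low."""
    s_data = np.asarray(log_model_data) - np.asarray(log_noise_data)
    s_noise = np.asarray(log_model_noise) - np.asarray(log_noise_noise)
    return float(-(log_sigmoid(s_data).mean() + log_sigmoid(-s_noise).mean()))
```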
- Partial Identification with Noisy Covariates: A Robust Optimization Approach [94.10051154390237]
Causal inference from observational datasets often relies on measuring and adjusting for covariates.
We show that this robust optimization approach can extend a wide range of causal adjustment methods to perform partial identification.
Across synthetic and real datasets, we find that this approach provides ATE bounds with a higher coverage probability than existing methods.
arXiv Detail & Related papers (2022-02-22T04:24:26Z)