Mitigating Pooling Bias in E-commerce Search via False Negative Estimation
- URL: http://arxiv.org/abs/2311.06444v3
- Date: Sat, 17 Aug 2024 15:04:58 GMT
- Title: Mitigating Pooling Bias in E-commerce Search via False Negative Estimation
- Authors: Xiaochen Wang, Xiao Xiao, Ruhan Zhang, Xuan Zhang, Taesik Na, Tejaswi Tenneti, Haixun Wang, Fenglong Ma
- Abstract summary: Bias-mitigating Hard Negative Sampling (BHNS) is a novel negative sampling strategy tailored to identify and adjust for false negatives.
Our experiments in the Instacart search setting confirm that BHNS is effective for practical e-commerce use.
- Score: 25.40402675846542
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Efficient and accurate product relevance assessment is critical for user experiences and business success. Training a proficient relevance assessment model requires high-quality query-product pairs, often obtained through negative sampling strategies. Unfortunately, current methods introduce pooling bias by mistakenly sampling false negatives, diminishing performance and business impact. To address this, we present Bias-mitigating Hard Negative Sampling (BHNS), a novel negative sampling strategy tailored to identify and adjust for false negatives, building upon our original False Negative Estimation algorithm. Our experiments in the Instacart search setting confirm BHNS as effective for practical e-commerce use. Furthermore, comparative analyses on public datasets showcase its domain-agnostic potential for diverse applications.
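The abstract does not spell out the False Negative Estimation algorithm itself, so the following is only a minimal sketch of the general mechanism it describes: sample hard negatives by similarity, but screen out candidates whose estimated relevance is so high that they are likely unlabeled positives. The cosine-similarity scorer, the `fn_threshold` cutoff, and all names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sample_hard_negatives(query_vec, product_vecs, positive_ids,
                          k=5, fn_threshold=0.9, rng=None):
    """Sample hard negatives for a query, skipping likely false negatives."""
    rng = rng or np.random.default_rng()
    # Cosine similarity between the query and every candidate product.
    sims = product_vecs @ query_vec
    sims = sims / (np.linalg.norm(product_vecs, axis=1)
                   * np.linalg.norm(query_vec) + 1e-9)

    positives = set(positive_ids)
    candidates = [i for i in range(len(product_vecs)) if i not in positives]
    # Treat suspiciously high similarity as a sign of an unlabeled positive
    # (a false negative) and drop those candidates before sampling.
    kept = [i for i in candidates if sims[i] < fn_threshold]

    # Among the survivors, prefer harder (higher-similarity) negatives.
    weights = np.exp(sims[kept])
    weights = weights / weights.sum()
    return rng.choice(kept, size=min(k, len(kept)), replace=False, p=weights)
```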
Related papers
- Fighting Sampling Bias: A Framework for Training and Evaluating Credit Scoring Models [2.918530881730374]
This paper addresses the adverse effect of sampling bias on model training and evaluation.
We propose bias-aware self-learning and a reject inference framework for scorecard evaluation.
Our results suggest a profit improvement of about eight percent when using Bayesian evaluation to decide on acceptance rates.
arXiv Detail & Related papers (2024-07-17T20:59:54Z)
- Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination [62.18768931714238]
We propose a novel False Negative Elimination (FNE) strategy to select negatives via sampling.
The results demonstrate the superiority of our proposed false negative elimination strategy.
arXiv Detail & Related papers (2023-08-08T16:31:43Z)
- Test-Time Distribution Normalization for Contrastively Learned Vision-language Models [39.66329310098645]
One of the most representative approaches proposed recently, known as CLIP, has garnered widespread adoption due to its effectiveness.
This paper reveals that the common downstream practice of taking a dot product is only a zeroth-order approximation of the optimization goal, resulting in a loss of information at test time.
We propose Distribution Normalization (DN), where we approximate the mean representation of a batch of test samples and use this mean to represent what would be analogous to negative samples in the InfoNCE loss.
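A minimal sketch of this test-time correction, assuming precomputed embeddings; the paper's exact DN scoring rule may differ in how it weights the mean term.

```python
import numpy as np

def dn_score(image_emb, text_emb, text_bank):
    """Score one image-text pair with a test-time mean correction.

    text_bank is an (n, d) array of unlabeled test-time text embeddings;
    its mean acts as a stand-in for the negatives that the InfoNCE loss
    would contrast against during training.
    """
    mu = text_bank.mean(axis=0)
    return image_emb @ text_emb - image_emb @ mu
```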
arXiv Detail & Related papers (2023-02-22T01:14:30Z)
- Generating Negative Samples for Sequential Recommendation [83.60655196391855]
We propose to Generate Negative Samples (items) for Sequential Recommendation (SR).
A negative item is sampled at each time step based on the current SR model's learned user preferences toward items.
Experiments on four public datasets verify the importance of providing high-quality negative samples for SR.
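One plausible reading of this sampling step, as a sketch: draw each negative from a softmax over the model's current item scores. The softmax proposal and the exclusion rule are assumptions, not the paper's exact sampler.

```python
import numpy as np

def sample_negative_item(item_scores, positive_item, rng=None):
    """Sample a negative item from the current model's preference
    distribution, so negatives stay informative as the model improves.

    item_scores: 1-D array of the SR model's scores over all items at
    the current time step; positive_item is excluded from sampling.
    """
    rng = rng or np.random.default_rng()
    probs = np.exp(item_scores - item_scores.max())  # stable softmax
    probs[positive_item] = 0.0                       # never the true item
    probs = probs / probs.sum()
    return rng.choice(len(item_scores), p=probs)
```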
arXiv Detail & Related papers (2022-08-07T05:44:13Z)
- Asymptotically Unbiased Estimation for Delayed Feedback Modeling via Label Correction [14.462884375151045]
Delayed feedback is crucial for the conversion rate prediction in online advertising.
Previous delayed feedback modeling methods balance the trade-off between waiting for accurate labels and consuming fresh feedback.
We propose a new method, DElayed Feedback modeling with UnbiaSed Estimation (DEFUSE), which aims to correct the importance weights of the immediate positive, fake negative, real negative, and delayed positive samples, respectively.
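A rough sketch of the importance-weighting idea: each of the four observed sample types contributes to the loss through its own correction weight. The weights below are opaque placeholders; DEFUSE derives the actual values so that the weighted loss is an unbiased estimate of the true objective.

```python
def importance_weighted_loss(per_sample_losses, sample_types, type_weights):
    """Average a batch loss with per-type importance weights.

    per_sample_losses: list of float losses for the batch.
    sample_types: parallel list of labels such as "immediate_positive",
        "fake_negative", "real_negative", or "delayed_positive".
    type_weights: dict mapping each label to its correction weight.
    """
    total = sum(type_weights[t] * l
                for l, t in zip(per_sample_losses, sample_types))
    return total / len(per_sample_losses)
```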
arXiv Detail & Related papers (2022-02-14T03:31:09Z)
- Rethinking InfoNCE: How Many Negative Samples Do You Need? [54.146208195806636]
We study how many negative samples are optimal for InfoNCE in different scenarios via a semi-quantitative theoretical framework.
We estimate the optimal negative sampling ratio using the $K$ value that maximizes the training effectiveness function.
arXiv Detail & Related papers (2021-05-27T08:38:29Z)
- Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem.
Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem.
We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
arXiv Detail & Related papers (2021-05-11T03:38:16Z)
- Simplify and Robustify Negative Sampling for Implicit Collaborative Filtering [42.832851785261894]
In this paper, we first provide a novel understanding of negative instances by empirically observing that only a few instances are potentially important for model learning.
We then tackle the untouched false negative problem by favouring high-variance samples stored in memory.
Empirical results on two synthetic datasets and three real-world datasets demonstrate both the robustness and the superiority of our negative sampling method.
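A minimal sketch of variance-favouring selection, assuming a memory that records each candidate's score over recent training steps; the paper's actual sampling statistic may differ.

```python
import numpy as np

def pick_high_variance_negative(score_history, rng=None):
    """Pick a negative from memory, favouring high score variance.

    score_history: (num_candidates, num_steps) array of the model's past
    scores for each memorized candidate. False negatives tend to be scored
    consistently high (high mean, low variance), so sampling in proportion
    to variance favours hard negatives that are likely true negatives.
    """
    rng = rng or np.random.default_rng()
    var = score_history.var(axis=1)
    probs = var / var.sum()
    return rng.choice(score_history.shape[0], p=probs)
```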
arXiv Detail & Related papers (2020-09-07T19:08:26Z)
- Understanding Negative Sampling in Graph Representation Learning [87.35038268508414]
We show that negative sampling is as important as positive sampling in determining the optimization objective and the resulting variance.
We propose MCNS, which approximates the positive distribution with a self-contrast approximation and accelerates negative sampling via Metropolis-Hastings.
We evaluate our method on 5 datasets that cover extensive downstream graph learning tasks, including link prediction, node classification and personalized recommendation.
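As an illustrative sketch only, here is a generic Metropolis-Hastings loop of the kind such a sampler builds on; the uniform proposal and the target density are assumptions, not MCNS itself.

```python
import numpy as np

def mh_sample_negative(target_unnorm, num_nodes, steps=100, rng=None):
    """Draw one negative node with a Metropolis-Hastings chain.

    target_unnorm: callable returning an unnormalized density for a node
    (here standing in for the self-contrast approximation of the positive
    distribution). A uniform proposal keeps each step cheap; the usual MH
    acceptance ratio makes the chain converge to the target.
    """
    rng = rng or np.random.default_rng()
    current = int(rng.integers(num_nodes))
    for _ in range(steps):
        proposal = int(rng.integers(num_nodes))  # symmetric proposal
        accept = min(1.0, target_unnorm(proposal)
                     / (target_unnorm(current) + 1e-12))
        if rng.random() < accept:
            current = proposal
    return current
```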
arXiv Detail & Related papers (2020-05-20T06:25:21Z)
- Reinforced Negative Sampling over Knowledge Graph for Recommendation [106.07209348727564]
We develop a new negative sampling model, Knowledge Graph Policy Network (kgPolicy), which works as a reinforcement learning agent to explore high-quality negatives.
kgPolicy navigates from the target positive interaction, adaptively receives knowledge-aware negative signals, and ultimately yields a potential negative item to train the recommender.
arXiv Detail & Related papers (2020-03-12T12:44:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.