Semantic Ads Retrieval at Walmart eCommerce with Language Models Progressively Trained on Multiple Knowledge Domains
- URL: http://arxiv.org/abs/2502.09089v1
- Date: Thu, 13 Feb 2025 09:01:34 GMT
- Title: Semantic Ads Retrieval at Walmart eCommerce with Language Models Progressively Trained on Multiple Knowledge Domains
- Authors: Zhaodong Wang, Weizhi Du, Md Omar Faruk Rokon, Pooshpendu Adhikary, Yanbing Xue, Jiaxuan Xu, Jianghong Zhou, Kuang-chih Lee, Musen Wen
- Abstract summary: We present an end-to-end solution tailored to optimize the ads retrieval system on Walmart.com.
Our approach is to pretrain the BERT-like classification model with product category information.
It enhances the search relevance metric by up to 16% compared to a baseline DSSM-based model.
- Score: 6.1008328784394
- Abstract: Sponsored search in e-commerce poses several unique and complex challenges. These challenges stem from factors such as the asymmetric language structure between search queries and product names, the inherent ambiguity in user search intent, and the vast volume of sparse and imbalanced search corpus data. The retrieval component of a sponsored search system plays a pivotal role: it is the initial step and directly affects the subsequent ranking and bidding systems. In this paper, we present an end-to-end solution tailored to optimize the ads retrieval system on Walmart.com. First, we pretrain a BERT-like classification model with product category information, enhancing the model's understanding of Walmart product semantics. Second, we design a two-tower Siamese network structure for learning embeddings to improve training efficiency. Third, we introduce a Human-in-the-loop Progressive Fusion Training method to ensure robust model performance. Our results demonstrate the effectiveness of this pipeline: it improves the search relevance metric by up to 16% compared to a baseline DSSM-based model. Moreover, our large-scale online A/B testing demonstrates that our approach surpasses the ad revenue of the existing production model.
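The two-tower Siamese idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's model: the shared encoder here is a toy hashed bag-of-words stand-in for the BERT-like tower, and the names `DIM`, `encode`, `score`, `retrieve`, and the sample ad titles are all hypothetical. It only shows the structural point that queries and ad titles pass through the same encoder, so ad embeddings can be precomputed and compared to a query embedding at retrieval time.

```python
import hashlib
import math
from typing import List, Tuple

DIM = 64  # hypothetical embedding size, chosen only for illustration


def encode(text: str) -> List[float]:
    """Shared tower (toy stand-in for a BERT-like encoder).

    Maps whitespace tokens to hashed buckets and L2-normalizes the counts.
    """
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def score(query_vec: List[float], item_vec: List[float]) -> float:
    """Dot product; equals cosine similarity for normalized vectors."""
    return sum(q * i for q, i in zip(query_vec, item_vec))


def retrieve(query: str, index: List[Tuple[str, List[float]]], k: int = 2) -> List[str]:
    """Return the k ad titles whose embeddings score highest for the query."""
    q = encode(query)
    ranked = sorted(index, key=lambda pair: score(q, pair[1]), reverse=True)
    return [title for title, _ in ranked[:k]]


# Hypothetical ad titles; embeddings are computed once, offline.
ADS = [
    "wireless bluetooth headphones",
    "stainless steel water bottle",
    "kids bicycle helmet",
]
AD_INDEX = [(title, encode(title)) for title in ADS]
```

Because both towers share one encoder, only the query needs to be embedded at serving time; for example, `retrieve("bluetooth headphones", AD_INDEX, k=1)` scores the query against the three precomputed ad vectors.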
Related papers
- Multimodal semantic retrieval for product search [6.185573921868495]
We build a multimodal representation for product items in e-commerce search in contrast to pure-text representation of products.
We demonstrate that a multimodal representation scheme for a product can show improvement on purchase recall or relevance accuracy in semantic retrieval.
arXiv Detail & Related papers (2025-01-13T14:34:26Z) - ACE: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling [53.97609687516371]
We propose a pioneering generAtive Cross-modal rEtrieval framework (ACE) for end-to-end cross-modal retrieval.
ACE achieves state-of-the-art performance in cross-modal retrieval and outperforms the strong baselines on Recall@1 by 15.27% on average.
arXiv Detail & Related papers (2024-06-25T12:47:04Z) - Large Language Models for Relevance Judgment in Product Search [48.56992980315751]
High relevance of retrieved and re-ranked items to the search query is the cornerstone of successful product search.
We present an array of techniques for leveraging Large Language Models (LLMs) for automating the relevance judgment of query-item pairs (QIPs) at scale.
Our findings have immediate implications for the growing field of relevance judgment automation in product search.
arXiv Detail & Related papers (2024-06-01T00:52:41Z) - Unified Embedding Based Personalized Retrieval in Etsy Search [0.206242362470764]
We propose learning a unified embedding model incorporating graph, transformer and term-based embeddings end to end.
Our personalized retrieval model significantly improves the overall search experience, as measured by a 5.58% increase in search purchase rate and a 2.63% increase in site-wide conversion rate.
arXiv Detail & Related papers (2023-06-07T23:24:50Z) - Que2Engage: Embedding-based Retrieval for Relevant and Engaging Products at Facebook Marketplace [15.054431410052851]
We present Que2Engage, a search EBR system built towards bridging the gap between retrieval and ranking for end-to-end optimizations.
We show the effectiveness of our approach via a multitask evaluation framework and thorough baseline comparisons and ablation studies.
arXiv Detail & Related papers (2023-02-21T23:10:16Z) - Multi-Objective Personalized Product Retrieval in Taobao Search [27.994166796745496]
We propose a novel Multi-Objective Personalized Product Retrieval (MOPPR) model with four hierarchical optimization objectives: relevance, exposure, click and purchase.
MOPPR achieves 0.96% transaction and 1.29% GMV improvements in a 28-day online A/B test.
Since the Double-11 shopping festival of 2021, MOPPR has been fully deployed in mobile Taobao search, replacing the previous MGDSPR.
arXiv Detail & Related papers (2022-10-09T05:18:42Z) - Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval [152.3504607706575]
This research aims to conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories.
We first contribute the Product1M datasets, and define two real practical instance-level retrieval tasks.
We train a more effective cross-modal model that can adaptively incorporate key concept information from the multi-modal data.
arXiv Detail & Related papers (2022-06-17T15:40:45Z) - Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining [108.86502855439774]
We investigate a more realistic setting that aims to perform weakly-supervised multi-modal instance-level product retrieval.
We contribute Product1M, one of the largest multi-modal cosmetic datasets for real-world instance-level retrieval.
We propose a novel model named Cross-modal contrAstive Product Transformer for instance-level prodUct REtrieval (CAPTURE).
arXiv Detail & Related papers (2021-07-30T12:11:24Z) - Heterogeneous Network Embedding for Deep Semantic Relevance Match in E-commerce Search [29.881612817309716]
We design an end-to-end First-and-Second-order Relevance prediction model for e-commerce item relevance.
We introduce external knowledge generated from BERT to refine the network of user behaviors.
Results of offline experiments showed that the new model significantly improved the prediction accuracy in terms of human relevance judgment.
arXiv Detail & Related papers (2021-01-13T03:12:53Z) - AutoRC: Improving BERT Based Relation Classification Models via Architecture Search [50.349407334562045]
BERT based relation classification (RC) models have achieved significant improvements over the traditional deep learning models.
However, there is no consensus on what the optimal architecture is.
We design a comprehensive search space for BERT based RC models and employ a neural architecture search (NAS) method to automatically discover the design choices.
arXiv Detail & Related papers (2020-09-22T16:55:49Z) - Automatic Validation of Textual Attribute Values in E-commerce Catalog by Learning with Limited Labeled Data [61.789797281676606]
We propose a novel meta-learning latent variable approach, called MetaBridge.
It can learn transferable knowledge from a subset of categories with limited labeled data.
It can capture the uncertainty of never-seen categories with unlabeled data.
arXiv Detail & Related papers (2020-06-15T21:31:05Z)