Semantic Retrieval at Walmart
- URL: http://arxiv.org/abs/2412.04637v1
- Date: Thu, 05 Dec 2024 22:10:58 GMT
- Title: Semantic Retrieval at Walmart
- Authors: Alessandro Magnani, Feng Liu, Suthee Chaidaroon, Sachin Yadav, Praveen Reddy Suram, Ajit Puthenputhussery, Sijie Chen, Min Xie, Anirudh Kashi, Tony Lee, Ciya Liao,
- Abstract summary: We present a hybrid system for e-commerce search deployed at Walmart that combines traditional inverted index and embedding-based neural retrieval.
Our system significantly improved the relevance of the search engine, measured by both offline and online evaluations.
- Score: 41.15739523513742
- License:
- Abstract: In product search, the retrieval of candidate products before re-ranking is more critical and challenging than other search like web search, especially for tail queries, which have a complex and specific search intent. In this paper, we present a hybrid system for e-commerce search deployed at Walmart that combines traditional inverted index and embedding-based neural retrieval to better answer user tail queries. Our system significantly improved the relevance of the search engine, measured by both offline and online evaluations. The improvements were achieved through a combination of different approaches. We present a new technique to train the neural model at scale. and describe how the system was deployed in production with little impact on response time. We highlight multiple learnings and practical tricks that were used in the deployment of this system.
Related papers
- Semantic Ads Retrieval at Walmart eCommerce with Language Models Progressively Trained on Multiple Knowledge Domains [6.1008328784394]
We present an end-to-end solution tailored to optimize the ads retrieval system on Walmart.com.
Our approach is to pretrain the BERT-like classification model with product category information.
It enhances the search relevance metric by up to 16% compared to a baseline DSSM-based model.
arXiv Detail & Related papers (2025-02-13T09:01:34Z) - Exploring Query Understanding for Amazon Product Search [62.53282527112405]
We study how query understanding-based ranking features influence the ranking process.
We propose a query understanding-based multi-task learning framework for ranking.
We present our studies and investigations using the real-world system on Amazon Search.
arXiv Detail & Related papers (2024-08-05T03:33:11Z) - Unified Embedding Based Personalized Retrieval in Etsy Search [0.206242362470764]
We propose learning a unified embedding model incorporating graph, transformer and term-based embeddings end to end.
Our personalized retrieval model significantly improves the overall search experience, as measured by a 5.58% increase in search purchase rate and a 2.63% increase in site-wide conversion rate.
arXiv Detail & Related papers (2023-06-07T23:24:50Z) - CorpusBrain: Pre-train a Generative Retrieval Model for
Knowledge-Intensive Language Tasks [62.22920673080208]
Single-step generative model can dramatically simplify the search process and be optimized in end-to-end manner.
We name the pre-trained generative retrieval model as CorpusBrain as all information about the corpus is encoded in its parameters without the need of constructing additional index.
arXiv Detail & Related papers (2022-08-16T10:22:49Z) - Deep Reinforcement Agent for Efficient Instant Search [14.086339486783018]
We propose to address the load issue by identifying tokens that are semantically more salient towards retrieving relevant documents.
We train a reinforcement agent that interacts directly with the search engine and learns to predict the word's importance.
A novel evaluation framework is presented to study the trade-off between the number of triggered searches and the system's performance.
arXiv Detail & Related papers (2022-03-17T22:47:15Z) - Sequential Search with Off-Policy Reinforcement Learning [48.88165680363482]
We propose a highly scalable hybrid learning model that consists of an RNN learning framework and an attention model.
As a novel optimization step, we fit multiple short user sequences in a single RNN pass within a training batch, by solving a greedy knapsack problem on the fly.
We also explore the use of off-policy reinforcement learning in multi-session personalized search ranking.
arXiv Detail & Related papers (2022-02-01T06:52:40Z) - Exposing Query Identification for Search Transparency [69.06545074617685]
We explore the feasibility of approximate exposing query identification (EQI) as a retrieval task by reversing the role of queries and documents in two classes of search systems.
We derive an evaluation metric to measure the quality of a ranking of exposing queries, as well as conducting an empirical analysis focusing on various practical aspects of approximate EQI.
arXiv Detail & Related papers (2021-10-14T20:19:27Z) - Zero-Shot Heterogeneous Transfer Learning from Recommender Systems to
Cold-Start Search Retrieval [30.95373255143698]
We propose a new Zero-Shot Heterogeneous Transfer Learning framework that transfers learned knowledge from the recommender system component to improve the search component of a content platform.
We conduct online and offline experiments on one of the world's largest search and recommender systems from Google, and present the results and lessons learned.
arXiv Detail & Related papers (2020-08-07T01:22:56Z) - Retro*: Learning Retrosynthetic Planning with Neural Guided A* Search [83.22850633478302]
Retrosynthetic planning identifies a series of reactions that can lead to the synthesis of a target product.
Existing methods either require expensive return estimation by rollout with high variance, or optimize for search speed rather than the quality.
We propose Retro*, a neural-based A*-like algorithm that finds high-quality synthetic routes efficiently.
arXiv Detail & Related papers (2020-06-29T05:53:33Z) - Deep-n-Cheap: An Automated Search Framework for Low Complexity Deep
Learning [3.479254848034425]
We present Deep-n-Cheap -- an open-source AutoML framework to search for deep learning models.
Our framework is targeted for deployment on both benchmark and custom datasets.
Deep-n-Cheap includes a user-customizable complexity penalty which trades off performance with training time or number of parameters.
arXiv Detail & Related papers (2020-03-27T13:00:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.