Extreme Multi-label Learning for Semantic Matching in Product Search
- URL: http://arxiv.org/abs/2106.12657v1
- Date: Wed, 23 Jun 2021 21:16:52 GMT
- Title: Extreme Multi-label Learning for Semantic Matching in Product Search
- Authors: Wei-Cheng Chang, Daniel Jiang, Hsiang-Fu Yu, Choon-Hui Teo, Jiong
Zhang, Kai Zhong, Kedarnath Kolluri, Qie Hu, Nikhil Shandilya, Vyacheslav
Ievgrafov, Japinder Singh, Inderjit S. Dhillon
- Abstract summary: Given a customer query, retrieve all semantically related products from a huge catalog of size 100 million, or more.
We consider hierarchical linear models with n-gram features for fast real-time inference.
Our method maintains a low latency of 1.25 milliseconds per query and achieves a 65% relative improvement in Recall@100.
- Score: 41.66238191444171
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of semantic matching in product search: given a
customer query, retrieve all semantically related products from a huge catalog
of size 100 million, or more. Because of large catalog spaces and real-time
latency constraints, semantic matching algorithms must achieve not only high
recall but also low latency. Conventional lexical matching approaches
(e.g., Okapi-BM25) exploit inverted indices to achieve fast inference time, but
fail to capture behavioral signals between queries and products. In contrast,
embedding-based models learn semantic representations from customer behavior
data, but the performance is often limited by shallow neural encoders due to
latency constraints. Semantic product search can be viewed as an eXtreme
Multi-label Classification (XMC) problem, where customer queries are input
instances and products are output labels. In this paper, we aim to improve
semantic product search by using tree-based XMC models where inference time
complexity is logarithmic in the number of products. We consider hierarchical
linear models with n-gram features for fast real-time inference.
Quantitatively, our method maintains a low latency of 1.25 milliseconds per
query and achieves a 65% relative improvement in Recall@100 (60.9% vs. 36.8%)
over a competing embedding-based DSSM model. Our model is robust to weight
pruning with varying thresholds, which can flexibly meet different system
requirements for online deployments. Qualitatively, our method can retrieve
products that are complementary to the existing product search system and add
diversity to the match set.
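The tree-based XMC setup described above can be sketched as a label tree whose internal nodes each hold linear scorers over hashed n-gram features, traversed with beam search so that the number of scored nodes grows logarithmically with the catalog size rather than linearly. The tree shape, random stand-in weights, and hashing scheme below are illustrative assumptions for a minimal sketch, not the paper's actual implementation:

```python
import zlib
import numpy as np

RNG = np.random.default_rng(0)
NUM_FEATURES = 64      # hashed n-gram feature space (toy size)
BRANCH, DEPTH = 4, 3   # 4^3 = 64 leaf labels standing in for products
BEAM = 2               # beam width for tree traversal

def featurize(query: str) -> np.ndarray:
    """Hash word unigrams and character trigrams into a fixed-size vector."""
    x = np.zeros(NUM_FEATURES)
    grams = query.split() + [query[i:i + 3] for i in range(len(query) - 2)]
    for g in grams:
        x[zlib.crc32(g.encode()) % NUM_FEATURES] += 1.0
    n = np.linalg.norm(x)
    return x / n if n else x

# One linear scorer (weight vector) per child at each internal node.
# In the paper these would be learned from customer behavior data;
# random weights here merely exercise the traversal.
weights = {}  # node path (tuple of child indices) -> (BRANCH, NUM_FEATURES)
def init(path=()):
    if len(path) == DEPTH:
        return
    weights[path] = RNG.normal(size=(BRANCH, NUM_FEATURES))
    for c in range(BRANCH):
        init(path + (c,))
init()

def beam_search(x: np.ndarray, beam: int = BEAM):
    """Return top candidates, scoring only O(DEPTH * BRANCH * beam) nodes."""
    frontier = [((), 0.0)]
    for _ in range(DEPTH):
        expanded = [(path + (c,), score + float(weights[path][c] @ x))
                    for path, score in frontier for c in range(BRANCH)]
        frontier = sorted(expanded, key=lambda t: -t[1])[:beam]
    # Convert leaf paths to flat label ids in [0, BRANCH**DEPTH).
    return [(sum(c * BRANCH ** (DEPTH - 1 - i) for i, c in enumerate(p)), s)
            for p, s in frontier]

results = beam_search(featurize("wireless noise cancelling headphones"))
print(results)
```

The key property is that the beam never scores all `BRANCH ** DEPTH` labels: each level prunes to the top `beam` partial paths, which is what keeps per-query inference time roughly logarithmic in the number of products.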
Related papers
- Generative Retrieval and Alignment Model: A New Paradigm for E-commerce Retrieval [12.705202836685189]
This paper introduces a novel e-commerce retrieval paradigm: the Generative Retrieval and Alignment Model (GRAM)
GRAM employs joint training on text information from both queries and products to generate shared text codes.
GRAM significantly outperforms traditional models and the latest generative retrieval models.
arXiv Detail & Related papers (2025-04-02T06:40:09Z)
- COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement [80.18490952057125]
Iterative refinement has emerged as an effective paradigm for enhancing the capabilities of large language models (LLMs) on complex tasks.
We propose Context-Wise Order-Agnostic Language Modeling (COrAL) to overcome these challenges.
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally.
arXiv Detail & Related papers (2024-10-12T23:56:19Z)
- Retrieval with Learned Similarities [2.729516456192901]
State-of-the-art retrieval algorithms have migrated to learned similarities.
We show that Mixture-of-Logits (MoL) can be realized empirically to achieve superior performance on diverse retrieval scenarios.
arXiv Detail & Related papers (2024-07-22T08:19:34Z)
- ACE: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling [53.97609687516371]
We propose a pioneering generAtive Cross-modal rEtrieval framework (ACE) for end-to-end cross-modal retrieval.
ACE achieves state-of-the-art performance in cross-modal retrieval and outperforms the strong baselines on Recall@1 by 15.27% on average.
arXiv Detail & Related papers (2024-06-25T12:47:04Z)
- When Box Meets Graph Neural Network in Tag-aware Recommendation [41.596515563108404]
We propose a novel algorithm, called BoxGNN, to perform the message aggregation via combination of logical operations.
We also adopt a volume-based learning objective with Gumbel smoothing techniques to refine the representation of boxes.
arXiv Detail & Related papers (2024-06-17T18:35:53Z)
- Adaptive Retrieval and Scalable Indexing for k-NN Search with Cross-Encoders [77.84801537608651]
Cross-encoder (CE) models, which compute similarity by jointly encoding a query-item pair, perform better than embedding-based models (dual-encoders) at estimating query-item relevance.
We propose a sparse-matrix factorization based method that efficiently computes latent query and item embeddings to approximate CE scores and performs k-NN search with the approximate CE similarity.
arXiv Detail & Related papers (2024-05-06T17:14:34Z)
- Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations [8.796275989527054]
We propose a novel organization of the inverted index that enables fast retrieval over learned sparse embeddings.
Our approach organizes inverted lists into geometrically-cohesive blocks, each equipped with a summary vector.
Our results indicate that Seismic is one to two orders of magnitude faster than state-of-the-art inverted index-based solutions.
arXiv Detail & Related papers (2024-04-29T15:49:27Z)
- Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model [78.80174696043021]
We propose a novel model called the Entity-Based Relevance Model (EBRM)
The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy.
We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance.
arXiv Detail & Related papers (2023-07-01T15:44:53Z)
- How Does Generative Retrieval Scale to Millions of Passages? [68.98628807288972]
We conduct the first empirical study of generative retrieval techniques across various corpus scales.
We scale generative retrieval to a corpus of 8.8M passages and evaluate model sizes up to 11B parameters.
While generative retrieval is competitive with state-of-the-art dual encoders on small corpora, scaling to millions of passages remains an important and unsolved challenge.
arXiv Detail & Related papers (2023-05-19T17:33:38Z)
- Multi-Label Learning to Rank through Multi-Objective Optimization [9.099663022952496]
The Learning to Rank technique is now ubiquitous in Information Retrieval systems.
To resolve ambiguity, it is desirable to train a model using many relevance criteria.
We propose a general framework where the information from labels can be combined in a variety of ways to characterize the trade-off among the goals.
arXiv Detail & Related papers (2022-07-07T03:02:11Z)
- Enabling Efficiency-Precision Trade-offs for Label Trees in Extreme Classification [43.840626501982314]
Extreme multi-label classification (XMC) aims to learn a model that can tag data points with a subset of relevant labels from an extremely large label set.
We propose an efficient information theory inspired algorithm to construct intermediary operating points that trade off between the benefits of both.
Our method can reduce a proxy for expected latency by up to 28% while maintaining the same accuracy as Parabel.
arXiv Detail & Related papers (2021-06-01T19:02:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.