Efficient Neural Query Auto Completion
- URL: http://arxiv.org/abs/2008.02879v1
- Date: Thu, 6 Aug 2020 21:28:36 GMT
- Title: Efficient Neural Query Auto Completion
- Authors: Sida Wang, Weiwei Guo, Huiji Gao, Bo Long
- Abstract summary: Three major challenges are observed for a query auto completion system.
Traditional QAC systems rely on handcrafted features such as the query candidate frequency in search logs.
We propose an efficient neural QAC system with effective context modeling to overcome these challenges.
- Score: 17.58784759652327
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Query Auto Completion (QAC), as the starting point of information retrieval
tasks, is critical to user experience. Generally it has two steps: generating
completed query candidates according to query prefixes, and ranking them based
on extracted features. Three major challenges are observed for a query auto
completion system: (1) QAC has a strict online latency requirement. For each
keystroke, results must be returned within tens of milliseconds, which poses a
significant challenge in designing sophisticated language models for it. (2)
For unseen queries, generated candidates are of poor quality as contextual
information is not fully utilized. (3) Traditional QAC systems heavily rely on
handcrafted features such as the query candidate frequency in search logs,
lacking sufficient semantic understanding of the candidate.
In this paper, we propose an efficient neural QAC system with effective
context modeling to overcome these challenges. On the candidate generation
side, this system uses as much information as possible in unseen prefixes to
generate relevant candidates, increasing the recall by a large margin. On the
candidate ranking side, an unnormalized language model is proposed, which
effectively captures deep semantics of queries. This approach presents better
ranking performance over state-of-the-art neural ranking methods and reduces
$\sim$95\% latency compared to neural language modeling methods. The empirical
results on public datasets show that our model achieves a good balance between
accuracy and efficiency. This system is served in LinkedIn job search with
significant product impact observed.
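The two-step pipeline described in the abstract (generating completions from the typed prefix, then ranking them with an unnormalized language model so no softmax over the vocabulary is needed at serving time) can be illustrated with a minimal sketch. This is not the paper's implementation: the `Trie` class, the toy embedding table, and `unnormalized_score` below are illustrative assumptions; the point is only that scoring candidates without per-token normalization keeps per-keystroke latency low.

```python
# Minimal sketch of a two-step QAC pipeline (illustrative only, not the
# paper's system). Candidate generation uses a character trie over logged
# queries; ranking scores each candidate with an unnormalized LM-style
# score (a dot product of pooled embeddings), skipping the vocabulary
# softmax that dominates neural LM latency.
from collections import defaultdict
import numpy as np

class Trie:
    def __init__(self):
        self.children = defaultdict(Trie)
        self.queries = []            # completed queries passing through this node

    def insert(self, query):
        node = self
        for ch in query:
            node = node.children[ch]
            node.queries.append(query)

    def candidates(self, prefix, k=10):
        node = self
        for ch in prefix:
            if ch not in node.children:
                return []            # unseen prefix: a real system would back off
            node = node.children[ch]
        return node.queries[:k]

# Hypothetical embedding table standing in for a trained model.
rng = np.random.default_rng(0)
emb = defaultdict(lambda: rng.normal(size=32))

def unnormalized_score(prefix, candidate):
    # Unnormalized score: dot product between pooled prefix and candidate
    # representations; no softmax over the vocabulary is ever computed.
    p = np.mean([emb[t] for t in prefix.split() or [""]], axis=0)
    c = np.mean([emb[t] for t in candidate.split()], axis=0)
    return float(p @ c)

trie = Trie()
for q in ["software engineer", "software engineer intern", "data scientist"]:
    trie.insert(q)

prefix = "software en"
ranked = sorted(trie.candidates(prefix),
                key=lambda c: unnormalized_score(prefix, c),
                reverse=True)
print(ranked)
```

In this sketch the cost per keystroke is a trie walk plus one dot product per candidate, which is the kind of budget a QAC system needs to stay within tens of milliseconds.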
Related papers
- Large Language Models for Power Scheduling: A User-Centric Approach [6.335540414370735]
We introduce a novel architecture for resource scheduling problems by converting an arbitrary user's voice request (VRQ) into a resource allocation vector.
Specifically, we design an LLM intent recognition agent to translate the request into an optimization problem (OP), an LLM OP parameter identification agent, and an OP solving agent.
arXiv Detail & Related papers (2024-06-29T15:47:28Z) - CLARINET: Augmenting Language Models to Ask Clarification Questions for Retrieval [52.134133938779776]
We present CLARINET, a system that asks informative clarification questions by choosing questions whose answers would maximize certainty in the correct candidate.
Our approach works by augmenting a large language model (LLM) to condition on a retrieval distribution, finetuning end-to-end to generate the question that would have maximized the rank of the true candidate at each turn.
arXiv Detail & Related papers (2024-04-28T18:21:31Z) - Cache & Distil: Optimising API Calls to Large Language Models [82.32065572907125]
Large-scale deployment of generative AI tools often depends on costly API calls to a Large Language Model (LLM) to fulfil user queries.
To curtail the frequency of these calls, one can employ a smaller language model -- a student.
This student gradually gains proficiency in independently handling an increasing number of user requests.
arXiv Detail & Related papers (2023-10-20T15:01:55Z) - Improving Text Matching in E-Commerce Search with A Rationalizable,
Intervenable and Fast Entity-Based Relevance Model [78.80174696043021]
We propose a novel model called the Entity-Based Relevance Model (EBRM).
Decomposing query-item relevance into query-entity (QE) relevance subproblems allows the use of a cross-encoder QE relevance module for high accuracy.
We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance.
arXiv Detail & Related papers (2023-07-01T15:44:53Z) - Large Language Models are Zero-Shot Rankers for Recommender Systems [76.02500186203929]
This work aims to investigate the capacity of large language models (LLMs) to act as the ranking model for recommender systems.
We show that LLMs have promising zero-shot ranking abilities but struggle to perceive the order of historical interactions.
We demonstrate that these issues can be alleviated using specially designed prompting and bootstrapping strategies.
arXiv Detail & Related papers (2023-05-15T17:57:39Z) - Learning to Retrieve Engaging Follow-Up Queries [12.380514998172199]
We present a retrieval-based system and associated dataset for predicting the next questions that the user might have.
Such a system can proactively assist users in knowledge exploration leading to a more engaging dialog.
arXiv Detail & Related papers (2023-02-21T20:26:23Z) - On the Importance of Building High-quality Training Datasets for Neural
Code Search [15.557818317497397]
We propose a data cleaning framework consisting of two subsequent filters: a rule-based syntactic filter and a model-based semantic filter.
We evaluate the effectiveness of our framework on two widely-used code search models and three manually-annotated code retrieval benchmarks.
arXiv Detail & Related papers (2022-02-14T12:02:41Z) - Challenges in Procedural Multimodal Machine Comprehension: A Novel Way To
Benchmark [14.50261153230204]
We focus on Multimodal Machine Comprehension (M3C), where a model is expected to answer questions based on a given passage (or context).
We identify three critical biases stemming from the question-answer generation process and memorization capabilities of large deep models.
We propose a systematic framework to address these biases through three Control-Knobs.
arXiv Detail & Related papers (2021-10-22T16:33:57Z) - Session-Aware Query Auto-completion using Extreme Multi-label Ranking [61.753713147852125]
We take the novel approach of modeling session-aware query auto-completion as an eXtreme Multi-Label Ranking (XMR) problem.
We adapt a popular XMR algorithm for this purpose by proposing several modifications to the key steps in the algorithm.
Our approach meets the stringent latency requirements for auto-complete systems while leveraging session information in making suggestions.
arXiv Detail & Related papers (2020-12-09T17:56:22Z) - A Clarifying Question Selection System from NTES_ALONG in Convai3
Challenge [8.656503175492375]
This paper presents the participation of the NetEase Game AI Lab team in the ClariQ challenge at the Search-oriented Conversational AI (SCAI) EMNLP workshop in 2020.
The challenge asks for a complete conversational information retrieval system that can understand and generate clarification questions.
We propose a clarifying question selection system which consists of response understanding, candidate question recalling and clarifying question ranking.
arXiv Detail & Related papers (2020-10-27T11:22:53Z) - A Study on Efficiency, Accuracy and Document Structure for Answer
Sentence Selection [112.0514737686492]
In this paper, we argue that by exploiting the intrinsic structure of the original rank together with an effective word-relatedness encoder, we can achieve competitive results.
Our model takes 9.5 seconds to train on the WikiQA dataset, i.e., very fast in comparison with the $\sim$18 minutes required by a standard BERT-base fine-tuning.
arXiv Detail & Related papers (2020-03-04T22:12:18Z)