Automated Query Reformulation for Efficient Search based on Query Logs
From Stack Overflow
- URL: http://arxiv.org/abs/2102.00826v1
- Date: Mon, 1 Feb 2021 13:31:50 GMT
- Title: Automated Query Reformulation for Efficient Search based on Query Logs
From Stack Overflow
- Authors: Kaibo Cao (1), Chunyang Chen (2), Sebastian Baltes (3), Christoph
Treude (3), Xiang Chen (4) ((1) Software Institute, Nanjing University,
China, (2) Faculty of Information Technology, Monash University, Australia,
(3) School of Computer Science, University of Adelaide, Australia, (4) School
of Information Science and Technology, Nantong University, China)
- Abstract summary: We propose an automated software-specific query reformulation approach based on deep learning.
We construct a large-scale query reformulation corpus, including the original queries and corresponding reformulated ones.
Our approach trains a Transformer model that can automatically generate candidate reformulated queries when given the user's original query.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As a popular Q&A site for programming, Stack Overflow is a treasure for
developers. However, the number of questions and answers on Stack Overflow makes
it difficult for developers to efficiently locate the information they are
looking for. There are two gaps leading to poor search results: the gap between
the user's intention and the textual query, and the semantic gap between the
query and the post content. Therefore, developers have to constantly
reformulate their queries by correcting misspelled words, adding limitations to
certain programming languages or platforms, etc. As query reformulation is
tedious for developers, especially for novices, we propose an automated
software-specific query reformulation approach based on deep learning. With
query logs provided by Stack Overflow, we construct a large-scale query
reformulation corpus, including the original queries and corresponding
reformulated ones. Our approach trains a Transformer model that can
automatically generate candidate reformulated queries when given the user's
original query. The evaluation results show that our approach outperforms five
state-of-the-art baselines, and achieves a 5.6% to 33.5% boost in terms of
$\mathit{ExactMatch}$ and a 4.8% to 14.4% boost in terms of $\mathit{GLEU}$.
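The abstract reports ExactMatch without defining it. As a rough illustration only, the sketch below assumes one common reading: the fraction of original queries for which at least one of the top-k generated candidates equals the reformulated query recorded in the log. The function names, the whitespace and case normalization, and the sample queries are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(query.lower().split())


def exact_match_at_k(
    candidates: Sequence[Sequence[str]],
    references: Sequence[str],
    k: int,
) -> float:
    """Fraction of queries whose reference appears among the top-k candidates.

    candidates[i] is the ranked list of generated reformulations for query i;
    references[i] is the reformulation the user actually issued.
    """
    if len(candidates) != len(references):
        raise ValueError("candidates and references must have the same length")
    if not references:
        return 0.0
    hits = 0
    for ranked, reference in zip(candidates, references):
        top_k = {normalize(c) for c in ranked[:k]}
        if normalize(reference) in top_k:
            hits += 1
    return hits / len(references)


# Hypothetical sample data, for illustration only.
sample_candidates = [
    ["python read csv file", "read csv pandas"],
    ["java null pointer"],
]
sample_references = ["read csv pandas", "java nullpointerexception"]

em_at_1 = exact_match_at_k(sample_candidates, sample_references, k=1)
em_at_2 = exact_match_at_k(sample_candidates, sample_references, k=2)
```

Under this reading, raising k can only keep the score the same or increase it, since a larger candidate set gives more chances to match the logged reformulation.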
Related papers
- Adaptive Query Rewriting: Aligning Rewriters through Marginal Probability of Conversational Answers [66.55612528039894]
AdaQR is a framework for training query rewriting models with limited rewrite annotations from seed datasets and no passage labels at all.
A novel approach is proposed to assess the retriever's preference for these candidates by the probability of answers conditioned on the conversational query.
arXiv Detail & Related papers (2024-06-16T16:09:05Z) - CLARINET: Augmenting Language Models to Ask Clarification Questions for Retrieval [52.134133938779776]
We present CLARINET, a system that asks informative clarification questions by choosing questions whose answers would maximize certainty in the correct candidate.
Our approach works by augmenting a large language model (LLM) to condition on a retrieval distribution, finetuning end-to-end to generate the question that would have maximized the rank of the true candidate at each turn.
arXiv Detail & Related papers (2024-04-28T18:21:31Z) - Answering Ambiguous Questions via Iterative Prompting [84.3426020642704]
In open-domain question answering, due to the ambiguity of questions, multiple plausible answers may exist.
One approach is to directly predict all valid answers, but this can struggle with balancing relevance and diversity.
We present AmbigPrompt to address the imperfections of existing approaches to answering ambiguous questions.
arXiv Detail & Related papers (2023-07-08T04:32:17Z) - Self-Supervised Query Reformulation for Code Search [6.415583252034772]
We propose SSQR, a self-supervised query reformulation method that does not rely on any parallel query corpus.
Inspired by pre-trained models, SSQR treats query reformulation as a masked language modeling task.
arXiv Detail & Related papers (2023-07-01T08:17:23Z) - ConvGQR: Generative Query Reformulation for Conversational Search [37.54018632257896]
ConvGQR is a new framework to reformulate conversational queries based on generative pre-trained language models.
We propose a knowledge infusion mechanism to optimize both query reformulation and retrieval.
arXiv Detail & Related papers (2023-05-25T01:45:06Z) - Answer ranking in Community Question Answering: a deep learning approach [0.0]
This work tries to advance the state of the art on answer ranking for community Question Answering by proceeding with a deep learning approach.
We created a large data set of questions and answers posted to the Stack Overflow website.
We leveraged the natural language processing capabilities of dense embeddings and LSTM networks to produce a prediction for the accepted answer attribute.
arXiv Detail & Related papers (2022-10-16T18:47:41Z) - Query Expansion and Entity Weighting for Query Reformulation Retrieval
in Voice Assistant Systems [6.590172620606211]
Voice assistants such as Alexa, Siri, and Google Assistant have become increasingly popular worldwide.
Linguistic variations, variability of speech patterns, ambient acoustic conditions, and other such factors are often correlated with the assistants misinterpreting the user's query.
Retrieval based query reformulation (QR) systems are widely used to reformulate those misinterpreted user queries.
arXiv Detail & Related papers (2022-02-22T23:03:29Z) - A Systematic Review of Automated Query Reformulations in Source Code
Search [12.234169944475537]
We select 70 studies on query reformulations from 2,970 candidate studies.
We discuss the best practices and future opportunities to advance the state of research in search query reformulations.
arXiv Detail & Related papers (2021-08-22T05:47:10Z) - Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open
Domain Question Answering [78.9863753810787]
A large amount of the world's knowledge is stored in structured databases.
Query languages can answer questions that require complex reasoning and also offer full explainability.
arXiv Detail & Related papers (2021-08-05T22:04:13Z) - Brain-inspired Search Engine Assistant based on Knowledge Graph [53.89429854626489]
DeveloperBot is a brain-inspired search engine assistant based on a knowledge graph.
It constructs a multi-layer query graph by splitting a complex multi-constraint query into several ordered constraints.
It then models the constraint reasoning process as a subgraph search process inspired by the spreading activation model of cognitive science.
arXiv Detail & Related papers (2020-12-25T06:36:11Z) - Session-Aware Query Auto-completion using Extreme Multi-label Ranking [61.753713147852125]
We take the novel approach of modeling session-aware query auto-completion as an eXtreme Multi-label Ranking (XMR) problem.
We adapt a popular XMR algorithm for this purpose by proposing several modifications to the key steps in the algorithm.
Our approach meets the stringent latency requirements for auto-complete systems while leveraging session information in making suggestions.
arXiv Detail & Related papers (2020-12-09T17:56:22Z)