A Use Case: Reformulating Query Rewriting as a Statistical Machine
Translation Problem
- URL: http://arxiv.org/abs/2310.13031v1
- Date: Thu, 19 Oct 2023 11:37:14 GMT
- Title: A Use Case: Reformulating Query Rewriting as a Statistical Machine
Translation Problem
- Authors: Abdullah Can Algan, Emre Yürekli, Aykut Çayır
- Abstract summary: The paper proposes a query rewriting pipeline based on a monolingual machine translation model that learns to rewrite Arabic user search queries.
This paper also describes preprocessing steps to create a mapping between user queries and web page titles.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the most important challenges for modern search engines is to retrieve
relevant web content based on user queries. To meet this challenge,
search engines include a module that rewrites user queries. To that end, modern web
search engines use statistical and neural models from the natural
language processing (NLP) domain. Statistical machine translation is a well-known NLP
method among them. The paper proposes a query rewriting pipeline based on a
monolingual machine translation model that learns to rewrite Arabic user search
queries. This paper also describes preprocessing steps to create a mapping
between user queries and web page titles.
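The abstract does not spell out the model, so the following is only a minimal sketch of the general idea: treat (query, title) pairs as a monolingual parallel corpus and learn word-level translation probabilities with IBM Model 1, a classic statistical machine translation model used here as a stand-in. The toy English pairs, the `normalize` helper, and the word-by-word `rewrite` function are hypothetical illustrations, not the authors' data or pipeline; a real system would also need a language model and a proper decoder.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def train_ibm1(pairs, iterations=10):
    """Estimate t(title_word | query_word) with IBM Model 1 EM."""
    pairs = [(normalize(q), normalize(t)) for q, t in pairs]
    # Constant initialisation over co-occurring word pairs; the E-step
    # normalises, so this behaves like a uniform start.
    t_prob = {(tw, qw): 1.0 for q, t in pairs for tw in t for qw in q}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: distribute each title word over the query words.
        for q_words, t_words in pairs:
            for tw in t_words:
                z = sum(t_prob[(tw, qw)] for qw in q_words)
                for qw in q_words:
                    frac = t_prob[(tw, qw)] / z
                    count[(tw, qw)] += frac
                    total[qw] += frac
        # M-step: renormalise expected counts per query word.
        t_prob = {(tw, qw): c / total[qw] for (tw, qw), c in count.items()}
    return t_prob


def rewrite(query, t_prob):
    """Replace each query word by its most probable title word.

    Words never seen in training are kept unchanged.
    """
    out = []
    for qw in normalize(query):
        candidates = {tw: p for (tw, q), p in t_prob.items() if q == qw}
        out.append(max(candidates, key=candidates.get) if candidates else qw)
    return " ".join(out)


# Hypothetical (query, title) pairs standing in for the query-title mapping.
toy_pairs = [
    ("cheap flights", "low cost flights"),
    ("cheap hotels", "low cost hotels"),
    ("flights", "flights"),
    ("hotels", "hotels"),
]

model = train_ibm1(toy_pairs)
print(rewrite("cheap flights", model))
```

Because this sketch maps one query word to exactly one title word, it cannot produce multi-word rewrites such as "low cost" for "cheap"; phrase-based models address that limitation.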
Related papers
- Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
We model the subgraph retrieval task as a conditional generation task handled by small language models.
Our base generative subgraph retrieval model, consisting of only 220M parameters, achieves competitive retrieval performance compared to state-of-the-art models.
Our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
arXiv Detail & Related papers (2024-10-08T15:22:36Z)
- QueryBuilder: Human-in-the-Loop Query Development for Information Retrieval [12.543590253664492]
We present a novel, interactive system called QueryBuilder.
It allows a novice, English-speaking user to create queries with a small amount of effort.
It rapidly develops cross-lingual information retrieval queries corresponding to the user's information needs.
arXiv Detail & Related papers (2024-09-07T00:46:58Z)
- UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
- QTSumm: Query-Focused Summarization over Tabular Data [58.62152746690958]
People primarily consult tables to conduct data analysis or answer specific questions.
We define a new query-focused table summarization task, where text generation models have to perform human-like reasoning.
We introduce a new benchmark named QTSumm for this task, which contains 7,111 human-annotated query-summary pairs over 2,934 tables.
arXiv Detail & Related papers (2023-05-23T17:43:51Z)
- Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) act as powerful, black-box readers in the retrieve-then-read pipeline.
This work introduces a new framework, Rewrite-Retrieve-Read, in place of the previous retrieve-then-read pipeline for retrieval-augmented LLMs.
arXiv Detail & Related papers (2023-05-23T17:27:50Z)
- Automated Query Generation for Evidence Collection from Web Search Engines [2.642698101441705]
It is widely accepted that so-called facts can be checked by searching for information on the Internet.
This process requires a fact-checker to formulate a search query based on the fact and to present it to a search engine.
We ask whether it is possible to automate the first step, query generation.
arXiv Detail & Related papers (2023-03-15T14:32:00Z)
- Context-Aware Query Rewriting for Improving Users' Search Experience on E-commerce Websites [47.04727122209316]
E-commerce queries are often short and ambiguous.
Users tend to enter multiple searches, which we call context, before purchasing.
We propose an end-to-end context-aware query rewriting model.
arXiv Detail & Related papers (2022-09-15T19:46:01Z)
- Study of Encoder-Decoder Architectures for Code-Mix Search Query Translation [0.0]
Many of the queries we receive are code-mix, specifically Hinglish, i.e., queries with one or more Hindi words written in English (Latin) script.
We propose a transformer-based approach for code-mix query translation to enable users to search with these queries.
The model is currently live on the app and website, serving millions of queries.
arXiv Detail & Related papers (2022-08-07T12:59:50Z)
- Query Rewriting via Cycle-Consistent Translation for E-Commerce Search [13.723266150864037]
We propose a novel deep neural network based approach to query rewriting.
We formulate query rewriting as a cyclic machine translation problem.
We introduce a novel cycle-consistent training algorithm in conjunction with state-of-the-art machine translation models.
arXiv Detail & Related papers (2021-03-01T06:47:12Z)
- Query Resolution for Conversational Search with Limited Supervision [63.131221660019776]
We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers.
We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC.
arXiv Detail & Related papers (2020-05-24T11:37:22Z)