APRF-Net: Attentive Pseudo-Relevance Feedback Network for Query
Categorization
- URL: http://arxiv.org/abs/2104.11384v1
- Date: Fri, 23 Apr 2021 02:34:08 GMT
- Title: APRF-Net: Attentive Pseudo-Relevance Feedback Network for Query
Categorization
- Authors: Ali Ahmadvand, Sayyed M. Zahiri, Simon Hughes, Khalifa Al Jadda, Surya
Kallumadi, and Eugene Agichtein
- Abstract summary: We propose a novel deep neural model named Attentive Pseudo-Relevance Feedback Network (APRF-Net) to enhance the representation of rare queries for query categorization.
Our results show that APRF-Net significantly improves query categorization by 5.9% on the F1@1 score over the baselines, which increases to an 8.2% improvement for rare queries.
- Score: 12.634704014206294
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Query categorization is an essential part of query intent understanding in
e-commerce search. A common query categorization task is to select the relevant
fine-grained product categories in a product taxonomy. For frequent queries,
rich customer behavior (e.g., click-through data) can be used to infer the
relevant product categories. However, for rarer queries, which cover a
large volume of search traffic, relying solely on customer behavior may not
suffice due to the lack of this signal. To improve categorization of rare
queries, we adapt the Pseudo-Relevance Feedback (PRF) approach to utilize the
latent knowledge embedded in semantically or lexically similar product
documents to enrich the representation of these rarer queries. To this end,
we propose a novel deep neural model named Attentive Pseudo-Relevance
Feedback Network (APRF-Net) to enhance the
representation of rare queries for query categorization. To demonstrate the
effectiveness of our approach, we collect search queries from a large
commercial search engine, and compare APRF-Net to state-of-the-art deep
learning models for text classification. Our results show that the APRF-Net
significantly improves query categorization by 5.9% on the F1@1 score over the
baselines, which increases to an 8.2% improvement for the rare (tail) queries.
The findings of this paper can be leveraged for further improvements in search
query representation and understanding.
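To make the PRF mechanism concrete, the following is a minimal sketch of an attentive feedback layer for query categorization. It is an illustration only: the module names, dimensions, retrieval step, and classifier head are assumptions, not the paper's actual architecture.

```python
# Illustrative sketch of attentive pseudo-relevance feedback (PRF) for
# query categorization; dimensions and module choices are assumptions.
import torch
import torch.nn as nn

class AttentivePRF(nn.Module):
    def __init__(self, dim: int, num_categories: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(2 * dim, num_categories)

    def forward(self, query_emb: torch.Tensor, doc_embs: torch.Tensor) -> torch.Tensor:
        # query_emb: (batch, dim) embedding of a (possibly rare) query.
        # doc_embs: (batch, k, dim) embeddings of the top-k pseudo-relevant
        # product documents retrieved for that query.
        q = query_emb.unsqueeze(1)                      # (batch, 1, dim)
        # Attention weights decide how much each feedback document contributes.
        feedback, _ = self.attn(q, doc_embs, doc_embs)  # (batch, 1, dim)
        # Enrich the query representation with the aggregated feedback.
        enriched = torch.cat([query_emb, feedback.squeeze(1)], dim=-1)
        return self.classifier(enriched)  # logits over product categories

# Usage: two queries, each with k=5 retrieved documents.
model = AttentivePRF(dim=128, num_categories=1000)
logits = model(torch.randn(2, 128), torch.randn(2, 5, 128))
top1 = logits.argmax(dim=-1)  # top-1 predicted category, as scored by F1@1
```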
Related papers
- pEBR: A Probabilistic Approach to Embedding Based Retrieval [4.8338111302871525]
Embedding retrieval aims to learn a shared semantic representation space for both queries and items.
In current industrial practice, retrieval systems typically retrieve a fixed number of items for different queries.
arXiv Detail & Related papers (2024-10-25T07:14:12Z)
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval [54.54576644403115]
Many complex real-world queries require in-depth reasoning to identify relevant documents.
We introduce BRIGHT, the first text retrieval benchmark that requires intensive reasoning to retrieve relevant documents.
Our dataset consists of 1,384 real-world queries spanning diverse domains, such as economics, psychology, mathematics, and coding.
arXiv Detail & Related papers (2024-07-16T17:58:27Z)
- Database-Augmented Query Representation for Information Retrieval [59.57065228857247]
We present a novel retrieval framework called Database-Augmented Query representation (DAQu).
DAQu augments the original query with various (query-related) metadata across multiple tables.
We validate DAQu in diverse retrieval scenarios that can incorporate metadata from the relational database.
arXiv Detail & Related papers (2024-06-23T05:02:21Z)
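As a rough illustration of the DAQu idea above, the sketch below serializes query-related metadata gathered from several tables into the query text before encoding; the table structure and field names are invented for this example.

```python
# Toy sketch of database-augmented query representation: append metadata
# rows gathered from several (hypothetical) tables to the query text.
def augment_query(query: str, tables: dict[str, list[dict]]) -> str:
    parts = [query]
    for table_name, rows in tables.items():
        for row in rows:
            fields = "; ".join(f"{k}: {v}" for k, v in row.items())
            parts.append(f"[{table_name}] {fields}")
    return " | ".join(parts)

# e.g. augment_query("wireless earbuds",
#                    {"purchases": [{"brand": "Acme", "category": "audio"}]})
# -> "wireless earbuds | [purchases] brand: Acme; category: audio"
```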
- User Intent Recognition and Semantic Cache Optimization-Based Query Processing Framework using CFLIS and MGR-LAU [0.0]
This work analyzes informational, navigational, and transactional intents in queries to enhance query processing (QP).
For efficient QP, the data is structured using Epanechnikov Kernel-Ordering Points To Identify the Clustering Structure (EK-OPTICS).
The extracted features, detected intents, and structured data are fed into the Multi-head Gated Recurrent Learnable Attention Unit (MGR-LAU).
arXiv Detail & Related papers (2024-06-06T20:28:05Z)
- Improving Content Retrievability in Search with Controllable Query Generation [5.450798147045502]
Machine-learned search engines have a high retrievability bias, where the majority of the queries return the same entities.
We propose CtrlQGen, a method that generates queries for a chosen underlying intent: narrow or broad.
Our results on datasets from the domains of music, podcasts, and books reveal that we can significantly decrease the retrievability bias of a dense retrieval model.
arXiv Detail & Related papers (2023-03-21T07:46:57Z)
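The CtrlQGen entry above suggests conditioning a query generator on the desired intent. A hedged sketch, assuming a seq2seq model fine-tuned for this objective (the checkpoint name and control-token format here are placeholders):

```python
# Sketch of intent-controlled query generation: prepend a control token
# ("narrow" or "broad") to the document text and decode a query.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("t5-small")            # placeholder checkpoint
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def generate_query(doc: str, intent: str = "broad") -> str:
    inputs = tok(f"{intent}: {doc}", return_tensors="pt", truncation=True)
    out = model.generate(**inputs, max_new_tokens=16)
    return tok.decode(out[0], skip_special_tokens=True)
```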
- CAPSTONE: Curriculum Sampling for Dense Retrieval with Document Expansion [68.19934563919192]
We propose a curriculum sampling strategy that utilizes pseudo queries during training and progressively enhances the relevance between the generated query and the real query.
Experimental results on both in-domain and out-of-domain datasets demonstrate that our approach outperforms previous dense retrieval models.
arXiv Detail & Related papers (2022-12-18T15:57:46Z)
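A minimal sketch of the curriculum sampling idea from the CAPSTONE entry above: training starts mostly on generated pseudo queries and gradually shifts toward real queries. The linear schedule is an assumption for illustration.

```python
import random

def sample_training_query(real_query: str, pseudo_query: str,
                          step: int, total_steps: int) -> str:
    # The probability of training on the real query grows linearly with
    # training progress, easing the model from pseudo to real queries.
    p_real = min(step / total_steps, 1.0)
    return real_query if random.random() < p_real else pseudo_query
```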
- Query Expansion Using Contextual Clue Sampling with Language Models [69.51976926838232]
We propose a combination of an effective filtering strategy and fusion of the retrieved documents based on the generation probability of each context.
Our lexical-matching-based approach achieves similar top-5/top-20 retrieval accuracy and higher top-100 accuracy compared with the well-established dense retrieval model DPR.
For end-to-end QA, the reader model also benefits from our method and achieves the highest Exact-Match score against several competitive baselines.
arXiv Detail & Related papers (2022-10-13T15:18:04Z)
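To illustrate the fusion step described in the entry above, here is a toy sketch that merges per-context retrieval runs, weighting each run by its context's generation probability under the language model; all names are illustrative.

```python
import math
from collections import defaultdict

def fuse_runs(runs: list[dict[str, float]],
              context_logprobs: list[float]) -> dict[str, float]:
    # runs[i]: doc_id -> retrieval score using the i-th generated context.
    # context_logprobs[i]: log-probability of generating that context.
    z = sum(math.exp(lp) for lp in context_logprobs)
    weights = [math.exp(lp) / z for lp in context_logprobs]
    fused: dict[str, float] = defaultdict(float)
    for run, w in zip(runs, weights):
        for doc_id, score in run.items():
            fused[doc_id] += w * score  # probability-weighted score fusion
    return dict(sorted(fused.items(), key=lambda kv: -kv[1]))
```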
- Graph Enhanced BERT for Query Understanding [55.90334539898102]
Query understanding plays a key role in exploring users' search intents and helping users locate their most desired information.
In recent years, pre-trained language models (PLMs) have advanced various natural language processing tasks.
We propose a novel graph-enhanced pre-training framework, GE-BERT, which can leverage both query content and the query graph.
arXiv Detail & Related papers (2022-04-03T16:50:30Z)
- Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback [29.719150565643965]
This paper proposes ANCE-PRF, a new query encoder that uses pseudo relevance feedback (PRF) to improve query representations for dense retrieval.
ANCE-PRF uses a BERT encoder that consumes the query and the top retrieved documents from a dense retrieval model, ANCE, and it learns to produce better query embeddings directly from relevance labels.
Analysis shows that the PRF encoder effectively captures the relevant and complementary information from PRF documents, while ignoring the noise with its learned attention mechanism.
arXiv Detail & Related papers (2021-08-30T18:10:26Z)
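A minimal sketch of the ANCE-PRF recipe summarized above: re-encode the query together with its top retrieved documents and take the [CLS] vector as the refined query embedding. The generic BERT checkpoint is a stand-in, and the training from relevance labels is omitted.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")   # stand-in checkpoint
encoder = AutoModel.from_pretrained("bert-base-uncased")

def prf_query_embedding(query: str, top_docs: list[str]) -> torch.Tensor:
    # Joining query and feedback documents lets self-attention pull
    # relevant document terms into the query representation.
    text = query + " [SEP] " + " [SEP] ".join(top_docs)
    inputs = tok(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        out = encoder(**inputs)
    return out.last_hidden_state[:, 0]  # [CLS] vector = refined query embedding
```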
- DeepCAT: Deep Category Representation for Query Understanding in E-commerce Search [15.041444067591007]
We propose a deep learning model, DeepCAT, which learns joint word-category representations to enhance the query understanding process.
Our results show that DeepCAT reaches a 10% improvement on minority classes and a 7.1% improvement on tail queries over a state-of-the-art label embedding model.
arXiv Detail & Related papers (2021-04-23T18:04:44Z)
- Query Focused Multi-Document Summarization with Distant Supervision [88.39032981994535]
Existing work relies heavily on retrieval-style methods for estimating the relevance between queries and text segments.
We propose a coarse-to-fine modeling framework which introduces separate modules for estimating whether segments are relevant to the query.
We demonstrate that our framework outperforms strong comparison systems on standard QFS benchmarks.
arXiv Detail & Related papers (2020-04-06T22:35:19Z)