Event-Centric Query Expansion in Web Search
- URL: http://arxiv.org/abs/2305.19019v1
- Date: Tue, 30 May 2023 13:19:53 GMT
- Title: Event-Centric Query Expansion in Web Search
- Authors: Yanan Zhang, Weijie Cui, Yangfan Zhang, Xiaoling Bai, Zhe Zhang, Jin Ma, Xiang Chen, Tianhua Zhou
- Abstract summary: Event-Centric Query Expansion (EQE) is a novel QE system that mines the best expansion from a large pool of potential events rapidly and accurately.
The system has been deployed in Tencent QQ Browser Search and served hundreds of millions of users.
- Score: 12.341071896152174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In search engines, query expansion (QE) is a crucial technique to improve
search experience. Previous studies often rely on long-term search log mining,
which leads to slow updates and is sub-optimal for time-sensitive news
searches. In this work, we present Event-Centric Query Expansion (EQE), a novel
QE system that addresses these issues by mining the best expansion from a
significant amount of potential events rapidly and accurately. This system
consists of four stages, i.e., event collection, event reformulation, semantic
retrieval and online ranking. Specifically, we first collect and filter news
headlines from websites. Then we propose a generation model that incorporates
contrastive learning and prompt-tuning techniques to reformulate these
headlines into concise candidates. Additionally, we fine-tune a dual-tower
semantic model to function as an encoder for event retrieval and explore a
two-stage contrastive training approach to enhance the accuracy of event
retrieval. Finally, we rank the retrieved events and select the optimal one as
QE, which is then used to improve the retrieval of event-related documents.
Through offline analysis and online A/B testing, we observe that the EQE system
significantly improves many metrics compared to the baseline. The system has
been deployed in Tencent QQ Browser Search and served hundreds of millions of
users. The dataset and baseline codes are available at
https://open-event-hub.github.io/eqe .
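
The four stages named in the abstract (event collection, event reformulation, semantic retrieval, online ranking) can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than the authors' implementation: the function names are hypothetical, the rule-based `reformulate` stands in for the paper's generation model, and the bag-of-words `embed` with cosine similarity stands in for the fine-tuned dual-tower encoder. The similarity threshold of 0.3 is likewise arbitrary.

```python
import math
import re
from collections import Counter

# Hypothetical headline prefixes removed by the toy reformulation step.
PREFIXES = ("breaking:", "live:", "update:")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def collect_events(headlines, min_tokens=3):
    """Stage 1 (event collection): drop duplicate and very short headlines."""
    seen, kept = set(), []
    for headline in headlines:
        tokens = tokenize(headline)
        key = " ".join(tokens)
        if key not in seen and len(tokens) >= min_tokens:
            seen.add(key)
            kept.append(headline)
    return kept


def reformulate(headline):
    """Stage 2 (event reformulation): rule-based stand-in for a generation
    model. Removes a trailing ' | site name' part and a known prefix."""
    text = headline.split(" | ")[0].strip()
    lowered = text.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            text = text[len(prefix):].strip()
            break
    return text


def embed(text):
    """Stand-in encoder: a bag-of-words count vector (not a neural model)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, events, top_k=3):
    """Stage 3 (semantic retrieval): top-k events by similarity to the query."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(event)), event) for event in events]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(score, event) for score, event in scored[:top_k] if score > 0]


def rank(candidates, threshold=0.3):
    """Stage 4 (online ranking): keep the best candidate if it is similar
    enough to the query, otherwise return None (no expansion)."""
    if not candidates:
        return None
    score, event = candidates[0]  # candidates arrive sorted by score
    return event if score >= threshold else None


def expand_query(query, headlines):
    """Run the four toy stages end to end."""
    events = [reformulate(h) for h in collect_events(headlines)]
    return rank(retrieve(query, events))


if __name__ == "__main__":
    demo = [
        "Breaking: Earthquake hits northern Chile | Example News",
        "Live: City marathon postponed after storm warning | Example Daily",
    ]
    print(expand_query("chile earthquake", demo))
```

In this sketch a query with no sufficiently similar event receives no expansion (`None`), which reflects one possible design choice: expanding only when a confident event match exists.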
Related papers
- Query-oriented Data Augmentation for Session Search [71.84678750612754]
We propose query-oriented data augmentation to enrich search logs and empower the modeling.
We generate supplemental training pairs by altering the most important part of a search context.
We develop several strategies to alter the current query, resulting in new training data with varying degrees of difficulty.
arXiv Detail & Related papers (2024-07-04T08:08:33Z)
- Event-enhanced Retrieval in Real-time Search [5.720930457681116]
Existing embedding-based retrieval models often face the "semantic drift" problem and insufficient focus on key information.
This paper proposes a novel approach called EER, which enhances real-time retrieval performance by improving the dual-encoder model.
We believe that this approach will provide new perspectives in the field of information retrieval.
arXiv Detail & Related papers (2024-04-09T03:47:48Z)
- Event-driven Real-time Retrieval in Web Search [15.235255100530496]
This paper expands the query with event information that represents real-time search intent.
We further enhance the model's capacity for event representation through multi-task training.
Our proposed approach significantly outperforms existing state-of-the-art baseline methods.
arXiv Detail & Related papers (2023-12-01T06:30:31Z)
- Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model [78.80174696043021]
We propose a novel model called the Entity-Based Relevance Model (EBRM).
The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy.
We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance.
arXiv Detail & Related papers (2023-07-01T15:44:53Z)
- End-to-end Knowledge Retrieval with Multi-modal Queries [50.01264794081951]
ReMuQ requires a system to retrieve knowledge from a large corpus by integrating contents from both text and image queries.
We introduce a retriever model, "ReViz", that can directly process input text and images to retrieve relevant knowledge in an end-to-end fashion.
We demonstrate superior performance in retrieval on two datasets under zero-shot settings.
arXiv Detail & Related papers (2023-06-01T08:04:12Z)
- Recommender Systems with Generative Retrieval [58.454606442670034]
We propose a novel generative retrieval approach, where the retrieval model autoregressively decodes the identifiers of the target candidates.
To that end, we create a semantically meaningful tuple of codewords to serve as a Semantic ID for each item.
We show that recommender systems trained with the proposed paradigm significantly outperform the current SOTA models on various datasets.
arXiv Detail & Related papers (2023-05-08T21:48:17Z)
- UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for the multi-hop KGQA task that unifies retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z)
- Exposing Query Identification for Search Transparency [69.06545074617685]
We explore the feasibility of approximate exposing query identification (EQI) as a retrieval task by reversing the role of queries and documents in two classes of search systems.
We derive an evaluation metric to measure the quality of a ranking of exposing queries, and conduct an empirical analysis focusing on various practical aspects of approximate EQI.
arXiv Detail & Related papers (2021-10-14T20:19:27Z)
- Event-Driven Query Expansion [23.08079115356717]
We propose a method to expand an event-related query by first detecting the events related to it.
We derive the candidates for expansion as terms semantically related to both the query and the events.
We show that our proposed method of leveraging events improves query expansion performance significantly compared with state-of-the-art methods on various newswire TREC datasets.
arXiv Detail & Related papers (2020-12-22T14:56:54Z)