Event-enhanced Retrieval in Real-time Search
- URL: http://arxiv.org/abs/2404.05989v1
- Date: Tue, 9 Apr 2024 03:47:48 GMT
- Title: Event-enhanced Retrieval in Real-time Search
- Authors: Yanan Zhang, Xiaoling Bai, Tianhua Zhou,
- Abstract summary: Existing embedding-based retrieval models often face the "semantic drift" problem and insufficient focus on key information.
This paper proposes a novel approach called EER, which enhances real-time retrieval performance by improving the dual-encoder model.
We believe that this approach will provide new perspectives in the field of information retrieval.
- Score: 5.720930457681116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The embedding-based retrieval (EBR) approach is widely used in mainstream search engine retrieval systems and is crucial in recent retrieval-augmented methods for eliminating LLM illusions. However, existing EBR models often face the "semantic drift" problem and insufficient focus on key information, leading to a low adoption rate of retrieval results in subsequent steps. This issue is especially noticeable in real-time search scenarios, where the various expressions of popular events on the Internet make real-time retrieval heavily reliant on crucial event information. To tackle this problem, this paper proposes a novel approach called EER, which enhances real-time retrieval performance by improving the dual-encoder model of traditional EBR. We incorporate contrastive learning to accompany pairwise learning for encoder optimization. Furthermore, to strengthen the focus on critical event information in events, we include a decoder module after the document encoder, introduce a generative event triplet extraction scheme based on prompt-tuning, and correlate the events with query encoder optimization through comparative learning. This decoder module can be removed during inference. Extensive experiments demonstrate that EER can significantly improve the real-time search retrieval performance. We believe that this approach will provide new perspectives in the field of information retrieval. The codes and dataset are available at https://github.com/open-event-hub/Event-enhanced_Retrieval .
Related papers
- Improving Retrieval in Sponsored Search by Leveraging Query Context Signals [6.152499434499752]
We propose an approach to enhance query understanding by augmenting queries with rich contextual signals.
We use web search titles and snippets to ground queries in real-world information and utilize GPT-4 to generate query rewrites and explanations.
Our context-aware approach substantially outperforms context-free models.
arXiv Detail & Related papers (2024-07-19T14:28:53Z) - Event-driven Real-time Retrieval in Web Search [15.235255100530496]
This paper expands the query with event information that represents real-time search intent.
We further enhance the model's capacity for event representation through multi-task training.
Our proposed approach significantly outperforms existing state-of-the-art baseline methods.
arXiv Detail & Related papers (2023-12-01T06:30:31Z) - Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z) - Zero-shot Audio Topic Reranking using Large Language Models [42.774019015099704]
Multimodal Video Search by Examples (MVSE) investigates using video clips as the query term for information retrieval.
This work aims to compensate for any performance loss from this rapid archive search by examining reranking approaches.
Performance is evaluated for topic-based retrieval on a publicly available video archive, the BBC Rewind corpus.
arXiv Detail & Related papers (2023-09-14T11:13:36Z) - Event-Centric Query Expansion in Web Search [12.341071896152174]
Event-Centric Query Expansion (EQE) is a novel QE system that mining the best expansion from a significant amount of potential events rapidly and accurately.
The system has been deployed in Tencent QQ Browser Search and served hundreds of millions of users.
arXiv Detail & Related papers (2023-05-30T13:19:53Z) - Synergistic Interplay between Search and Large Language Models for
Information Retrieval [141.18083677333848]
InteR allows RMs to expand knowledge in queries using LLM-generated knowledge collections.
InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z) - Noise-Robust Dense Retrieval via Contrastive Alignment Post Training [89.29256833403167]
Contrastive Alignment POst Training (CAPOT) is a highly efficient finetuning method that improves model robustness without requiring index regeneration.
CAPOT enables robust retrieval by freezing the document encoder while the query encoder learns to align noisy queries with their unaltered root.
We evaluate CAPOT noisy variants of MSMARCO, Natural Questions, and Trivia QA passage retrieval, finding CAPOT has a similar impact as data augmentation with none of its overhead.
arXiv Detail & Related papers (2023-04-06T22:16:53Z) - CorpusBrain: Pre-train a Generative Retrieval Model for
Knowledge-Intensive Language Tasks [62.22920673080208]
Single-step generative model can dramatically simplify the search process and be optimized in end-to-end manner.
We name the pre-trained generative retrieval model as CorpusBrain as all information about the corpus is encoded in its parameters without the need of constructing additional index.
arXiv Detail & Related papers (2022-08-16T10:22:49Z) - AntPivot: Livestream Highlight Detection via Hierarchical Attention
Mechanism [64.70568612993416]
We formulate a new task Livestream Highlight Detection, discuss and analyze the difficulties listed above and propose a novel architecture AntPivot to solve this problem.
We construct a fully-annotated dataset AntHighlight to instantiate this task and evaluate the performance of our model.
arXiv Detail & Related papers (2022-06-10T05:58:11Z) - Sequential Recommender via Time-aware Attentive Memory Network [67.26862011527986]
We propose a temporal gating methodology to improve attention mechanism and recurrent units.
We also propose a Multi-hop Time-aware Attentive Memory network to integrate long-term and short-term preferences.
Our approach is scalable for candidate retrieval tasks and can be viewed as a non-linear generalization of latent factorization for dot-product based Top-K recommendation.
arXiv Detail & Related papers (2020-05-18T11:29:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.