Modernizing Facebook Scoped Search: Keyword and Embedding Hybrid Retrieval with LLM Evaluation
- URL: http://arxiv.org/abs/2509.13603v1
- Date: Wed, 17 Sep 2025 00:22:08 GMT
- Title: Modernizing Facebook Scoped Search: Keyword and Embedding Hybrid Retrieval with LLM Evaluation
- Authors: Yongye Su, Zeya Zhang, Jane Kou, Cheng Ju, Shubhojeet Sarkar, Yamin Wang, Ji Liu, Shengbo Guo,
- Abstract summary: We introduce a framework for modernized Facebook Group Scoped Search. We blend traditional keyword-based retrieval with embedding-based retrieval to improve the relevance and diversity of search results. Our results demonstrate that the blended retrieval system significantly enhances user engagement and search quality.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Beyond general web-scale search, social network search uniquely enables users to retrieve information and discover potential connections within their social context. We introduce a framework of modernized Facebook Group Scoped Search by blending traditional keyword-based retrieval with embedding-based retrieval (EBR) to improve the search relevance and diversity of search results. Our system integrates semantic retrieval into the existing keyword search pipeline, enabling users to discover more contextually relevant group posts. To rigorously assess the impact of this blended approach, we introduce a novel evaluation framework that leverages large language models (LLMs) to perform offline relevance assessments, providing scalable and consistent quality benchmarks. Our results demonstrate that the blended retrieval system significantly enhances user engagement and search quality, as validated by both online metrics and LLM-based evaluation. This work offers practical insights for deploying and evaluating advanced retrieval systems in large-scale, real-world social platforms.
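The abstract describes blending keyword-based retrieval with embedding-based retrieval (EBR), but does not disclose the actual blending logic. The sketch below is a hypothetical illustration in which each retriever's scores are min-max normalized and combined with a tunable weight `alpha`; the normalization, the weight, and all function names are assumptions, not details from the paper.

```python
def minmax_normalize(scores):
    """Rescale a {doc_id: score} map to [0, 1] so the two retrievers are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def blend_retrieval(keyword_scores, embedding_scores, alpha=0.5):
    """Blend keyword scores (e.g. BM25) with embedding scores (e.g. cosine similarity).

    alpha weights the keyword side; 1 - alpha weights the embedding side.
    Returns (doc_id, blended_score) pairs, best first.
    """
    kw = minmax_normalize(keyword_scores)
    em = minmax_normalize(embedding_scores)
    candidates = set(kw) | set(em)  # union of both retrievers' candidate sets
    blended = {
        doc: alpha * kw.get(doc, 0.0) + (1 - alpha) * em.get(doc, 0.0)
        for doc in candidates
    }
    return sorted(blended.items(), key=lambda item: item[1], reverse=True)
```

A document surfaced by both retrievers tends to outrank one surfaced by only a single retriever, which matches the relevance-plus-diversity goal the abstract describes.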
Related papers
- Towards Context-aware Reasoning-enhanced Generative Searching in E-commerce [61.03081096959132]
We propose a context-aware reasoning-enhanced generative search framework for better understanding of complex context. Our approach achieves superior performance compared with strong baselines, validating its effectiveness for search-based recommendation.
arXiv Detail & Related papers (2025-10-19T16:46:11Z) - Deep Learning-Based Approach for Improving Relational Aggregated Search [0.46664938579243564]
This research investigates the application of advanced natural language processing techniques, namely stacked autoencoders and AraBERT embeddings. By transcending the limitations of traditional search engines, we offer more enriched, context-aware characterizations of search results.
arXiv Detail & Related papers (2025-10-01T14:37:38Z) - Reasoning-enhanced Query Understanding through Decomposition and Interpretation [87.56450566014625]
ReDI is a Reasoning-enhanced approach for query understanding through Decomposition and Interpretation. We compiled a large-scale dataset of real-world complex queries from a major search engine. Experiments on BRIGHT and BEIR demonstrate that ReDI consistently surpasses strong baselines in both sparse and dense retrieval paradigms.
arXiv Detail & Related papers (2025-09-08T10:58:42Z) - LLM-based Relevance Assessment for Web-Scale Search Evaluation at Pinterest [3.306725465028306]
We present our approach at Pinterest Search to automate relevance evaluation for online experiments using fine-tuned LLMs. We rigorously validate the alignment between LLM-generated judgments and human annotations. This approach leads to higher-quality relevance metrics and significantly reduces the Minimum Detectable Effect (MDE) in online experiment measurements.
arXiv Detail & Related papers (2025-09-03T23:07:49Z) - How good are LLMs at Retrieving Documents in a Specific Domain? [3.282961543904818]
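LLM-based relevance judges like the Pinterest system above, and the offline LLM evaluation in the main paper, typically prompt a model per (query, document) pair and parse a graded label from its reply. The exact prompts and models are not public; the sketch below is a hypothetical prompt template and response parser, where the 0-3 scale, the `Rating:` answer format, and all names are assumptions.

```python
import re

RELEVANCE_PROMPT = (
    "Rate how relevant the document is to the query on a 0-3 scale\n"
    "(0 = irrelevant, 3 = highly relevant). Reply with 'Rating: <number>'.\n\n"
    "Query: {query}\n"
    "Document: {document}\n"
)

def build_judge_prompt(query: str, document: str) -> str:
    """Fill the judging template for one (query, document) pair."""
    return RELEVANCE_PROMPT.format(query=query, document=document)

def parse_rating(llm_reply: str, low: int = 0, high: int = 3):
    """Extract the graded label from the model's reply; None if missing or out of range."""
    match = re.search(r"Rating:\s*(\d+)", llm_reply)
    if not match:
        return None
    rating = int(match.group(1))
    return rating if low <= rating <= high else None
```

Rejecting malformed or out-of-range replies (returning `None`) rather than guessing is what keeps such judgments consistent enough to serve as offline quality benchmarks.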
We propose an automated method to curate a domain-specific evaluation dataset to analyze the capability of a search system. We incorporate Retrieval-Augmented Generation (RAG), powered by Large Language Models (LLMs), for high-quality retrieval of environmental domain data using natural language queries.
arXiv Detail & Related papers (2025-08-25T19:47:21Z) - LLM-Driven Usefulness Judgment for Web Search Evaluation [12.10711284043516]
Evaluation is fundamental in optimizing search experiences and supporting diverse user intents in Information Retrieval (IR). Traditional search evaluation methods primarily rely on relevance labels, which assess how well retrieved documents match a user's query. In this paper, we explore an alternative approach: LLM-generated usefulness labels, which incorporate both implicit and explicit user behavior signals to evaluate document usefulness.
arXiv Detail & Related papers (2025-04-19T20:38:09Z) - Hybrid Semantic Search: Unveiling User Intent Beyond Keywords [0.0]
This paper addresses the limitations of traditional keyword-based search in understanding user intent.
It introduces a novel hybrid search approach that leverages the strengths of non-semantic search engines, Large Language Models (LLMs), and embedding models.
arXiv Detail & Related papers (2024-08-17T16:04:31Z) - STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases [93.96463520716759]
We develop STARK, a large-scale semi-structured retrieval benchmark on textual and relational knowledge bases.
Our benchmark covers three domains: product search, academic paper search, and queries in precision medicine.
We design a novel pipeline to synthesize realistic user queries that integrate diverse relational information and complex textual properties.
arXiv Detail & Related papers (2024-04-19T22:54:54Z) - Evaluating Generative Ad Hoc Information Retrieval [58.800799175084286]
Generative retrieval systems often directly return grounded generated text as a response to a query.
Quantifying the utility of the textual responses is essential for appropriately evaluating such generative ad hoc retrieval.
arXiv Detail & Related papers (2023-11-08T14:05:00Z) - Large Search Model: Redefining Search Stack in the Era of LLMs [63.503320030117145]
We introduce a novel conceptual framework called large search model, which redefines the conventional search stack by unifying search tasks with one large language model (LLM).
All tasks are formulated as autoregressive text generation problems, allowing for the customization of tasks through the use of natural language prompts.
This proposed framework capitalizes on the strong language understanding and reasoning capabilities of LLMs, offering the potential to enhance search result quality while simultaneously simplifying the existing cumbersome search stack.
arXiv Detail & Related papers (2023-10-23T05:52:09Z) - Rethinking the Evaluation for Conversational Recommendation in the Era of Large Language Models [115.7508325840751]
The recent success of large language models (LLMs) has shown great potential to develop more powerful conversational recommender systems (CRSs).
In this paper, we embark on an investigation into the utilization of ChatGPT for conversational recommendation, revealing the inadequacy of the existing evaluation protocol.
We propose an interactive Evaluation approach based on LLMs named iEvaLM that harnesses LLM-based user simulators.
arXiv Detail & Related papers (2023-05-22T15:12:43Z) - Synergistic Interplay between Search and Large Language Models for Information Retrieval [141.18083677333848]
InteR allows retrieval models (RMs) to expand knowledge in queries using LLM-generated knowledge collections.
InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z)
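Several papers above, like the main paper's blended pipeline, fuse rankings from heterogeneous retrievers. One standard technique for this (not necessarily the one any of these systems uses) is reciprocal rank fusion (RRF), which needs only ranks rather than comparable scores:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked doc-id lists: each doc scores the sum of 1 / (k + rank) over all lists.

    k=60 is the constant from the original RRF paper; it damps the
    influence of a single retriever ranking a document first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Because it ignores raw scores, RRF sidesteps the score-calibration problem between keyword and embedding retrievers: a document ranked moderately well by both lists can beat one ranked first in only a single list.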
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.