Controlling Output Rankings in Generative Engines for LLM-based Search
- URL: http://arxiv.org/abs/2602.03608v1
- Date: Tue, 03 Feb 2026 14:59:48 GMT
- Title: Controlling Output Rankings in Generative Engines for LLM-based Search
- Authors: Haibo Jin, Ruoxi Chen, Peiyan Zhang, Yifeng Luo, Huimin Zeng, Man Luo, Haohan Wang
- Abstract summary: CORE is an optimization method that Controls Output Rankings in gEnerative Engines. CORE targets the content returned by search engines as the primary means of influencing output rankings. CORE achieves an average Promotion Success Rate of 91.4% @Top-5, 86.6% @Top-3, and 80.3% @Top-1 across 15 product categories.
- Score: 32.00097545087518
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The way customers search for and choose products is changing with the rise of large language models (LLMs). LLM-based search, or generative engines, provides direct product recommendations to users, rather than traditional online search results that require users to explore options themselves. However, these recommendations are strongly influenced by the initial retrieval order of LLMs, which disadvantages small businesses and independent creators by limiting their visibility. In this work, we propose CORE, an optimization method that Controls Output Rankings in gEnerative Engines for LLM-based search. Since the LLM's interactions with the search engine are black-box, CORE targets the content returned by search engines as the primary means of influencing output rankings. Specifically, CORE optimizes retrieved content by appending strategically designed optimization content to steer the ranking of outputs. We introduce three types of optimization content: string-based, reasoning-based, and review-based, demonstrating their effectiveness in shaping output rankings. To evaluate CORE in realistic settings, we introduce ProductBench, a large-scale benchmark with 15 product categories and 200 products per category, where each product is associated with its top-10 recommendations collected from Amazon's search interface. Extensive experiments on four LLMs with search capabilities (GPT-4o, Gemini-2.5, Claude-4, and Grok-3) demonstrate that CORE achieves an average Promotion Success Rate of 91.4% @Top-5, 86.6% @Top-3, and 80.3% @Top-1, across 15 product categories, outperforming existing ranking manipulation methods while preserving the fluency of optimized content.
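The abstract reports a Promotion Success Rate at several cutoffs but this page does not define the metric. The following is a minimal sketch, assuming the rate is the fraction of trials in which the promoted product appears within the first K positions of the engine's output ranking. The function name `promotion_success_rate` and the toy rankings are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def promotion_success_rate(
    rankings: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int,
) -> float:
    """Fraction of trials whose target product is ranked within the top k.

    Assumed definition (not confirmed by the source): a trial counts as a
    success when the target appears in the first k positions of the output.
    """
    if len(rankings) != len(targets):
        raise ValueError("rankings and targets must have the same length")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not rankings:
        return 0.0
    hits = sum(
        1 for ranking, target in zip(rankings, targets) if target in ranking[:k]
    )
    return hits / len(rankings)


# Toy data: four trials, each promoting hypothetical product "A".
toy_rankings = [
    ["A", "B", "C", "D", "E"],  # target at rank 1
    ["B", "A", "C", "D", "E"],  # target at rank 2
    ["C", "D", "E", "F", "A"],  # target at rank 5
    ["B", "C", "D", "E", "F"],  # target missing from the output
]
toy_targets = ["A", "A", "A", "A"]

for cutoff in (1, 3, 5):
    rate = promotion_success_rate(toy_rankings, toy_targets, cutoff)
    print(f"PSR@Top-{cutoff}: {rate:.2f}")
```

Under this assumed definition the rate can only stay the same or grow as K increases, which matches the ordering of the reported @Top-1, @Top-3, and @Top-5 figures.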
Related papers
- OneSearch: A Preliminary Exploration of the Unified End-to-End Generative Framework for E-commerce Search [43.94443394870866]
OneSearch is the first industrial-deployed end-to-end generative framework for e-commerce search. OneSearch reduces operational expenditure by 75.40% and improves Model FLOPs Utilization from 3.26% to 27.32%. The system has been successfully deployed across multiple search scenarios in Kuaishou, serving millions of users.
arXiv Detail & Related papers (2025-09-03T11:50:04Z)
- Role-Augmented Intent-Driven Generative Search Engine Optimization [9.876307656819039]
We propose a Role-Augmented Intent-Driven Generative Search Engine Optimization (G-SEO) method. Our method models search intent through reflective refinement across diverse informational roles, enabling targeted content enhancement. Experimental results demonstrate that search intent serves as an effective signal for guiding content optimization.
arXiv Detail & Related papers (2025-08-15T02:08:55Z)
- Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers [74.17516978246152]
Large language models (LLMs) have been widely integrated into information retrieval to advance traditional techniques. We propose EXSEARCH, an agentic search framework, where the LLM learns to retrieve useful information as the reasoning unfolds. Experiments on four knowledge-intensive benchmarks show that EXSEARCH substantially outperforms baselines.
arXiv Detail & Related papers (2025-05-26T15:27:55Z)
- DeepRec: Towards a Deep Dive Into the Item Space with Large Language Model Based Recommendation [83.21140655248624]
Large language models (LLMs) have been introduced into recommender systems (RSs). We propose DeepRec, a novel LLM-based RS that enables autonomous multi-turn interactions between LLMs and TRMs for deep exploration of the item space. Experiments on public datasets demonstrate that DeepRec significantly outperforms both traditional and LLM-based baselines.
arXiv Detail & Related papers (2025-05-22T15:49:38Z)
- Generative Product Recommendations for Implicit Superlative Queries [21.750990820244983]
In Recommender Systems, users often seek the best products through indirect, vague, or under-specified queries, such as "best shoes for trail running". We investigate how Large Language Models can generate implicit attributes for ranking as well as reason over them to improve product recommendations for such queries.
arXiv Detail & Related papers (2025-04-26T00:05:47Z)
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning [50.419872452397684]
Search-R1 is an extension of reinforcement learning for reasoning frameworks. It generates search queries during step-by-step reasoning with real-time retrieval. It improves performance by 41% (Qwen2.5-7B) and 20% (Qwen2.5-3B) over various RAG baselines.
arXiv Detail & Related papers (2025-03-12T16:26:39Z)
- Manipulating Large Language Models to Increase Product Visibility [27.494854085799076]
Large language models (LLMs) are increasingly being integrated into search engines to provide natural language responses tailored to user queries.
We investigate whether recommendations from LLMs can be manipulated to enhance a product's visibility.
arXiv Detail & Related papers (2024-04-11T17:57:32Z)
- GenSERP: Large Language Models for Whole Page Presentation [22.354349023665538]
GenSERP is a framework that leverages large language models with vision in a few-shot setting to dynamically organize intermediate search results.
Our approach has three main stages: information gathering, answer generation, and scoring.
arXiv Detail & Related papers (2024-02-22T05:41:24Z)
- Large Language Models are Zero-Shot Rankers for Recommender Systems [76.02500186203929]
This work aims to investigate the capacity of large language models (LLMs) to act as the ranking model for recommender systems.
We show that LLMs have promising zero-shot ranking abilities but struggle to perceive the order of historical interactions.
We demonstrate that these issues can be alleviated using specially designed prompting and bootstrapping strategies.
arXiv Detail & Related papers (2023-05-15T17:57:39Z)
- ProphetNet-Ads: A Looking Ahead Strategy for Generative Retrieval Models in Sponsored Search Engine [123.65646903493614]
Generative retrieval models generate outputs token by token along a path of the target library's prefix tree (Trie).
We analyze these problems and propose a looking ahead strategy for generative retrieval models named ProphetNet-Ads.
Compared with the recently proposed Trie-based LSTM generative retrieval model, our single-model and integrated results improve recall by 15.58% and 18.8%, respectively, with beam size 5.
arXiv Detail & Related papers (2020-10-21T07:03:20Z)