Ad Insertion in LLM-Generated Responses
- URL: http://arxiv.org/abs/2601.19435v1
- Date: Tue, 27 Jan 2026 10:16:03 GMT
- Title: Ad Insertion in LLM-Generated Responses
- Authors: Shengwei Xu, Zhaohua Chen, Xiaotie Deng, Zhiyi Huang, Grant Schoenebeck,
- Abstract summary: We propose a practical framework that resolves these tensions through two decoupling strategies. First, we decouple ad insertion from response generation to ensure safety and explicit disclosure. Second, we decouple bidding from specific user queries by using "genres" (high-level semantic clusters) as a proxy.
- Score: 16.434649348378706
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sustainable monetization of Large Language Models (LLMs) remains a critical open challenge. Traditional search advertising, which relies on static keywords, fails to capture the fleeting, context-dependent user intents--the specific information, goods, or services a user seeks--embedded in conversational flows. Beyond the standard goal of social welfare maximization, effective LLM advertising imposes additional requirements on contextual coherence (ensuring ads align semantically with transient user intents) and computational efficiency (avoiding user interaction latency), as well as adherence to ethical and regulatory standards, including preserving privacy and ensuring explicit ad disclosure. Although various recent solutions have explored bidding at the token level and at the query level, both categories of approaches generally fail to holistically satisfy this multifaceted set of constraints. We propose a practical framework that resolves these tensions through two decoupling strategies. First, we decouple ad insertion from response generation to ensure safety and explicit disclosure. Second, we decouple bidding from specific user queries by using "genres" (high-level semantic clusters) as a proxy. This allows advertisers to bid on stable categories rather than on sensitive real-time responses, reducing computational burden and privacy risks. We demonstrate that applying the VCG auction mechanism to this genre-based framework yields approximately dominant strategy incentive compatibility (DSIC) and individual rationality (IR), as well as approximately optimal social welfare, while maintaining high computational efficiency. Finally, we introduce an "LLM-as-a-Judge" metric to estimate contextual coherence. Our experiments show that this metric correlates strongly with human ratings (Spearman's $\rho \approx 0.66$), outperforming 80% of individual human evaluators.
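Two parts of the abstract lend themselves to small illustrations. The first is the genre-based auction. The Python sketch below is not the paper's implementation: it assumes a single ad slot per response, in which case VCG reduces to a second-price (Vickrey) auction, and it assumes the user's query has already been mapped to one of the precomputed genres.

```python
"""Minimal sketch of a genre-based, single-slot VCG auction.

Assumptions (not from the paper): one ad slot per response, so VCG
reduces to a second-price auction; bids are placed offline per genre,
so serving a query needs only a dictionary lookup -- no per-query
bidding round, hence no added user-facing latency.
"""
from dataclasses import dataclass


@dataclass
class Bid:
    advertiser: str
    amount: float  # advertiser's bid for this genre


def run_genre_auction(bids_by_genre: dict[str, list[Bid]], genre: str):
    """Pick the highest bidder for the query's genre; charge the
    second-highest bid (the externality imposed on others), which makes
    truthful bidding a dominant strategy in the single-item case."""
    bids = sorted(bids_by_genre.get(genre, []),
                  key=lambda b: b.amount, reverse=True)
    if not bids:
        return None  # no advertiser covers this genre; show no ad
    payment = bids[1].amount if len(bids) > 1 else 0.0
    return bids[0].advertiser, payment


# Hypothetical example: advertisers bid on stable genres ahead of time.
bids = {"travel": [Bid("AirlineA", 2.50), Bid("HotelB", 1.75)]}
print(run_genre_auction(bids, "travel"))  # -> ('AirlineA', 1.75)
```

The second is the reported agreement between the LLM-as-a-Judge metric and human raters, which is an ordinary rank correlation and can be checked as below. The scores here are made-up placeholders; the paper reports $\rho \approx 0.66$ on its actual data.

```python
from scipy.stats import spearmanr

judge = [4.0, 2.5, 3.0, 5.0, 1.5, 4.5]   # hypothetical LLM-judge coherence scores
humans = [4.5, 2.0, 3.5, 4.0, 1.0, 5.0]  # hypothetical human ratings of the same ads

rho, p = spearmanr(judge, humans)
print(f"Spearman's rho = {rho:.2f} (p = {p:.3f})")
```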
Related papers
- NeuroFilter: Privacy Guardrails for Conversational LLM Agents [50.75206727081996]
This work addresses the computational challenge of enforcing privacy for agentic Large Language Models (LLMs). NeuroFilter is a guardrail framework that operationalizes contextual integrity by mapping norm violations to simple directions in the model's activation space. A comprehensive evaluation across over 150,000 interactions, covering models from 7B to 70B parameters, illustrates the strong performance of NeuroFilter.
arXiv Detail & Related papers (2026-01-21T05:16:50Z)
- Look Twice before You Leap: A Rational Agent Framework for Localized Adversarial Anonymization [5.308328605042682]
We propose Rational Localized Adversarial Anonymization (RLAA), a fully localized and training-free framework featuring an Attacker-Arbitrator-Anonymizer architecture. Our code and datasets will be released upon acceptance.
arXiv Detail & Related papers (2025-12-07T08:03:43Z)
- Towards Context-aware Reasoning-enhanced Generative Searching in E-commerce [61.03081096959132]
We propose a context-aware reasoning-enhanced generative search framework for better understanding of the complicated context. Our approach achieves superior performance compared with strong baselines, validating its effectiveness for search-based recommendation.
arXiv Detail & Related papers (2025-10-19T16:46:11Z)
- The Hidden Cost of Modeling P(X): Vulnerability to Membership Inference Attacks in Generative Text Classifiers [6.542294761666199]
Membership Inference Attacks (MIAs) pose a critical privacy threat by enabling adversaries to determine whether a specific sample was included in a model's training dataset. We show that fully generative classifiers which explicitly model the joint likelihood $P(X,Y)$ are most vulnerable to membership leakage.
arXiv Detail & Related papers (2025-10-17T18:09:33Z)
- Semantic Caching for Low-Cost LLM Serving: From Offline Learning to Online Adaptation [54.61034867177997]
Caching inference responses allows them to be retrieved without another forward pass through the Large Language Model. Traditional exact-match caching overlooks the semantic similarity between queries, leading to unnecessary recomputation. We present a principled, learning-based framework for semantic cache eviction under unknown query and cost distributions. (A minimal cache-lookup sketch appears after this list.)
arXiv Detail & Related papers (2025-08-11T06:53:27Z)
- Beyond Jailbreaking: Auditing Contextual Privacy in LLM Agents [43.303548143175256]
This study proposes an auditing framework for conversational privacy that quantifies an agent's susceptibility to risks. The proposed Conversational Manipulation for Privacy Leakage (CMPL) framework is designed to stress-test agents that enforce strict privacy directives.
arXiv Detail & Related papers (2025-06-11T20:47:37Z)
- Urania: Differentially Private Insights into AI Use [102.27238986985698]
Urania provides end-to-end privacy protection by leveraging DP tools such as clustering, partition selection, and histogram-based summarization. Results show the framework's ability to extract meaningful conversational insights while maintaining stringent user privacy.
arXiv Detail & Related papers (2025-06-05T07:00:31Z)
- Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time [52.230936493691985]
We propose SITAlign, an inference framework that addresses the multifaceted nature of alignment by maximizing a primary objective while satisfying threshold-based constraints on secondary criteria. We provide theoretical insights by deriving sub-optimality bounds of our satisficing-based inference alignment approach.
arXiv Detail & Related papers (2025-05-29T17:56:05Z)
- Multi-agents based User Values Mining for Recommendation [52.26100802380767]
We propose a zero-shot multi-LLM collaborative framework for effective and accurate user value extraction. We apply text summarization techniques to condense item content while preserving essential meaning. To mitigate hallucinations, we introduce two specialized agent roles: evaluators and supervisors.
arXiv Detail & Related papers (2025-05-02T04:01:31Z)
- Generating Privacy-Preserving Personalized Advice with Zero-Knowledge Proofs and LLMs [0.6906005491572401]
We propose a framework that integrates zero-knowledge proof technology, specifically zkVM, with large language models (LLMs). This integration enables privacy-preserving data sharing by verifying user traits without disclosing sensitive information.
arXiv Detail & Related papers (2025-02-10T13:02:00Z)
- DePrompt: Desensitization and Evaluation of Personal Identifiable Information in Large Language Model Prompts [11.883785681042593]
DePrompt is a desensitization protection and effectiveness evaluation framework for prompts.
We integrate contextual attributes to define privacy types, achieving high-precision PII entity identification.
Our framework is adaptable to prompts and can be extended to text usability-dependent scenarios.
arXiv Detail & Related papers (2024-08-16T02:38:25Z)
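As noted in the semantic-caching entry above, the idea of retrieving a cached response by embedding similarity can be sketched briefly. Everything below is an assumption for illustration: the `embed` placeholder stands in for a real sentence encoder, the cosine-similarity threshold is arbitrary, and eviction is plain FIFO, whereas the cited paper learns its eviction policy under unknown query and cost distributions.

```python
"""Minimal sketch of a semantic cache for LLM responses."""
from collections import OrderedDict

import numpy as np


def embed(text: str) -> np.ndarray:
    # Placeholder embedding: a real system would call a sentence encoder.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(64)
    return v / np.linalg.norm(v)


class SemanticCache:
    def __init__(self, capacity: int = 1000, threshold: float = 0.9):
        self.capacity, self.threshold = capacity, threshold
        self.entries: OrderedDict[str, tuple[np.ndarray, str]] = OrderedDict()

    def get(self, query: str):
        """Return the cached response of the most similar past query,
        if its cosine similarity clears the threshold; else None."""
        q = embed(query)
        best = max(self.entries.values(),
                   key=lambda e: float(q @ e[0]), default=None)
        if best is not None and float(q @ best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, query: str, response: str) -> None:
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # FIFO eviction (simplification)
        self.entries[query] = (embed(query), response)
```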
This list is automatically generated from the titles and abstracts of the papers on this site.