MaxShapley: Towards Incentive-compatible Generative Search with Fair Context Attribution
- URL: http://arxiv.org/abs/2512.05958v1
- Date: Fri, 05 Dec 2025 18:54:21 GMT
- Title: MaxShapley: Towards Incentive-compatible Generative Search with Fair Context Attribution
- Authors: Sara Patel, Mingxun Zhou, Giulia Fanti,
- Abstract summary: We introduce MaxShapley, an efficient algorithm for fair attribution in generative search pipelines that use retrieval-augmented generation (RAG)<n>We evaluate MaxShapley on three multi-hop QA datasets (HotPotQA, MuSiQUE, MS MARCO)
- Score: 17.58298150582672
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative search engines based on large language models (LLMs) are replacing traditional search, fundamentally changing how information providers are compensated. To sustain this ecosystem, we need fair mechanisms to attribute and compensate content providers based on their contributions to generated answers. We introduce MaxShapley, an efficient algorithm for fair attribution in generative search pipelines that use retrieval-augmented generation (RAG). MaxShapley is a special case of the celebrated Shapley value; it leverages a decomposable max-sum utility function to compute attributions with linear computation in the number of documents, as opposed to the exponential cost of Shapley values. We evaluate MaxShapley on three multi-hop QA datasets (HotPotQA, MuSiQUE, MS MARCO); MaxShapley achieves comparable attribution quality to exact Shapley computation, while consuming a fraction of its tokens--for instance, it gives up to an 8x reduction in resource consumption over prior state-of-the-art methods at the same attribution accuracy.
Related papers
- Shapley-Inspired Feature Weighting in $k$-means with No Additional Hyperparameters [2.3940819037450987]
Clustering algorithms often assume all features contribute equally to the data structure.<n>We propose SHARK (Shapley Reweighted $k$-means), a feature-weighted clustering algorithm motivated by the use of Shapley values.<n>Experiments on synthetic and real-world data sets show that SHARK consistently matches or outperforms existing methods.
arXiv Detail & Related papers (2025-08-11T13:07:21Z) - Fair Document Valuation in LLM Summaries via Shapley Values [0.0]
Large Language Models (LLMs) are increasingly used in systems that retrieve and summarize content from multiple sources.<n>These systems obscure the individual contributions of original content creators, raising concerns about credit attribution and compensation.<n>We propose a Shapley value-based framework for fair document valuation.
arXiv Detail & Related papers (2025-05-28T15:14:21Z) - SIM-Shapley: A Stable and Computationally Efficient Approach to Shapley Value Approximation [10.009607907227293]
Shapley value (SV) methods provide a principled framework for feature attribution in complex models but incur high computational costs.<n>We propose Iterative Momentum for Shapley Value Approximation (SIM-Shapley), a stable and efficient approximation method inspired by optimization.<n>In our numerical experiments, SIM-Shapley reduces computation time by up to 85% relative to state-of-the-art baselines.
arXiv Detail & Related papers (2025-05-13T03:23:10Z) - Value-Based Deep RL Scales Predictably [100.21834069400023]
We show that value-based off-policy RL methods are predictable despite community lore regarding their pathological behavior.<n>We validate our approach using three algorithms: SAC, BRO, and PQL on DeepMind Control, OpenAI gym, and IsaacGym.
arXiv Detail & Related papers (2025-02-06T18:59:47Z) - Adaptive Sampled Softmax with Inverted Multi-Index: Methods, Theory and Applications [79.53938312089308]
The MIDX-Sampler is a novel adaptive sampling strategy based on an inverted multi-index approach.<n>Our method is backed by rigorous theoretical analysis, addressing key concerns such as sampling bias, gradient bias, convergence rates, and generalization error bounds.
arXiv Detail & Related papers (2025-01-15T04:09:21Z) - Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
We propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques.
In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets.
Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
arXiv Detail & Related papers (2023-11-02T06:09:24Z) - An Efficient Shapley Value Computation for the Naive Bayes Classifier [0.0]
This article proposes an exact analytic expression of Shapley values in the case of the naive Bayes classifier.
Results show that our Shapley proposal for the naive Bayes provides informative results with low algorithmic complexity.
arXiv Detail & Related papers (2023-07-31T14:39:10Z) - Efficient Shapley Values Estimation by Amortization for Text
Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations.
Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z) - Exact Shapley Values for Local and Model-True Explanations of Decision
Tree Ensembles [0.0]
We consider the application of Shapley values for explaining decision tree ensembles.
We present a novel approach to Shapley value-based feature attribution that can be applied to random forests and boosted decision trees.
arXiv Detail & Related papers (2021-12-16T20:16:02Z) - Fast Hierarchical Games for Image Explanations [78.16853337149871]
We present a model-agnostic explanation method for image classification based on a hierarchical extension of Shapley coefficients.
Unlike other Shapley-based explanation methods, h-Shap is scalable and can be computed without the need of approximation.
We compare our hierarchical approach with popular Shapley-based and non-Shapley-based methods on a synthetic dataset, a medical imaging scenario, and a general computer vision problem.
arXiv Detail & Related papers (2021-04-13T13:11:02Z) - Towards Efficient Data Valuation Based on the Shapley Value [65.4167993220998]
We study the problem of data valuation by utilizing the Shapley value.
The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value.
We propose a repertoire of efficient algorithms for approximating the Shapley value.
arXiv Detail & Related papers (2019-02-27T00:22:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.