Incentivizing Quality Text Generation via Statistical Contracts
- URL: http://arxiv.org/abs/2406.11118v1
- Date: Mon, 17 Jun 2024 00:30:58 GMT
- Title: Incentivizing Quality Text Generation via Statistical Contracts
- Authors: Eden Saig, Ohad Einav, Inbal Talgam-Cohen
- Abstract summary: We propose a pay-for-performance, contract-based framework for incentivizing quality.
We study a principal-agent game where the agent generates text using costly inference.
We find that cost-robust contracts sacrifice only a marginal increase in objective value compared to their cost-aware counterparts.
- Score: 7.303977308530667
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While the success of large language models (LLMs) increases demand for machine-generated text, current pay-per-token pricing schemes create a misalignment of incentives known in economics as moral hazard: Text-generating agents have strong incentive to cut costs by preferring a cheaper model over the cutting-edge one, and this can be done "behind the scenes" since the agent performs inference internally. In this work, we approach this issue from an economic perspective, by proposing a pay-for-performance, contract-based framework for incentivizing quality. We study a principal-agent game where the agent generates text using costly inference, and the contract determines the principal's payment for the text according to an automated quality evaluation. Since standard contract theory is inapplicable when internal inference costs are unknown, we introduce cost-robust contracts. As our main theoretical contribution, we characterize optimal cost-robust contracts through a direct correspondence to optimal composite hypothesis tests from statistics, generalizing a result of Saig et al. (NeurIPS'23). We evaluate our framework empirically by deriving contracts for a range of objectives and LLM evaluation benchmarks, and find that cost-robust contracts sacrifice only a marginal increase in objective value compared to their cost-aware counterparts.
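For intuition on the contract framework described above, the sketch below works through the classical cost-aware baseline in a minimal version of the setting: one binary automated quality check and two agent actions, running the cheap model or the cutting-edge one. The function name, the limited-liability assumption, and all numbers are illustrative and not taken from the paper, which instead studies cost-robust contracts where the inference costs are unknown to the principal.

```python
# Illustrative sketch only (not the paper's construction): the classical
# cost-AWARE contract for a two-action moral-hazard setting with a binary
# automated quality evaluation and limited liability (no negative payments).
# p_cheap / p_costly: assumed pass rates of the cheap vs. cutting-edge model.
# c_cheap / c_costly: the agent's per-generation inference costs (known here;
# the paper's cost-robust contracts drop exactly this assumption).

def min_pass_bonus(p_cheap: float, p_costly: float,
                   c_cheap: float, c_costly: float) -> float:
    """Smallest payment t on a 'pass' outcome (0 on 'fail') under which the
    costly model is the agent's best response:
        p_costly * t - c_costly >= p_cheap * t - c_cheap.
    """
    assert p_costly > p_cheap, "the better model must pass the evaluation more often"
    assert c_costly >= c_cheap, "the better model should not be cheaper to run"
    return (c_costly - c_cheap) / (p_costly - p_cheap)

# Made-up example: pass rates 0.8 vs. 0.5, inference costs 0.03 vs. 0.01 per query.
t = min_pass_bonus(p_cheap=0.5, p_costly=0.8, c_cheap=0.01, c_costly=0.03)
print(f"pay {t:.4f} per passing generation, 0 otherwise")  # ~0.0667
```

In the paper's cost-robust setting the principal does not know c_cheap or c_costly; the main result characterizes optimal contracts through a correspondence to optimal composite hypothesis tests, and the experiments report that these contracts lose only a marginal amount of objective value relative to cost-aware ones like the sketch above.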
Related papers
- Contractual Reinforcement Learning: Pulling Arms with Invisible Hands [68.77645200579181]
We propose a theoretical framework for aligning economic interests of different stakeholders in the online learning problems through contract design.
For the planning problem, we design an efficient dynamic programming algorithm to determine the optimal contracts against the far-sighted agent.
For the learning problem, we introduce a generic design of no-regret learning algorithms to untangle the challenges from robust design of contracts to the balance of exploration and exploitation.
arXiv Detail & Related papers (2024-07-01T16:53:00Z) - New Perspectives in Online Contract Design [2.296475290901356]
This work studies the repeated principal-agent problem from an online learning perspective.
The principal's goal is to learn the optimal contract that maximizes her utility through repeated interactions.
arXiv Detail & Related papers (2024-03-11T20:28:23Z) - $\texttt{COSMIC}$: Mutual Information for Task-Agnostic Summarization Evaluation [39.287235598507294]
We propose a novel task-oriented evaluation approach that assesses summarizers based on their capacity to produce summaries that are useful for downstream tasks, while preserving task outcomes.
We introduce $\texttt{COSMIC}$ as a practical implementation of this metric, demonstrating its strong correlation with human judgment-based metrics and its effectiveness in predicting downstream task performance.
arXiv Detail & Related papers (2024-02-29T18:51:23Z) - Incentivized Truthful Communication for Federated Bandits [61.759855777522255]
We propose an incentive compatible (i.e., truthful) communication protocol, named Truth-FedBan.
We show that Truth-FedBan still guarantees the sub-linear regret and communication cost without any overheads.
arXiv Detail & Related papers (2024-02-07T00:23:20Z) - Assistive Large Language Model Agents for Socially-Aware Negotiation Dialogues [47.977032883078664]
We develop assistive agents based on Large Language Models (LLMs).
We simulate business negotiations by letting two LLM-based agents engage in role play.
A third LLM acts as a remediator agent to rewrite utterances violating norms for improving negotiation outcomes.
arXiv Detail & Related papers (2024-01-29T09:07:40Z) - Sentiment Analysis through LLM Negotiations [58.67939611291001]
A standard paradigm for sentiment analysis is to rely on a single LLM and make the decision in a single round.
This paper introduces a multi-LLM negotiation framework for sentiment analysis.
arXiv Detail & Related papers (2023-11-03T12:35:29Z) - Estimating and Incentivizing Imperfect-Knowledge Agents with Hidden Rewards [4.742123770879715]
In practice, incentive providers often cannot observe the reward realizations of incentivized agents.
This paper explores a repeated adverse selection game between a self-interested learning agent and a learning principal.
We introduce an estimator whose only input is the history of principal's incentives and agent's choices.
arXiv Detail & Related papers (2023-08-13T08:12:01Z) - Delegated Classification [21.384062337682185]
We propose a theoretical framework for incentive-aware delegation of machine learning tasks.
We define budget-optimal contracts and prove they take a simple threshold form under reasonable assumptions.
Empirically, we demonstrate that budget-optimal contracts can be constructed using small-scale data.
arXiv Detail & Related papers (2023-06-20T11:59:03Z) - ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts [39.75232199445175]
We propose "document-level natural language inference (NLI) for contracts"
A system is given a set of hypotheses and a contract, and it is asked to classify whether each hypothesis is "entailed by", "contradicting to" or "not mentioned by" (neutral to) the contract.
We release the largest corpus to date consisting of 607 annotated contracts.
arXiv Detail & Related papers (2021-10-05T03:22:31Z) - Measuring Association Between Labels and Free-Text Rationales [60.58672852655487]
In interpretable NLP, we require faithful rationales that reflect the model's decision-making process for an explained instance.
We demonstrate that pipelines, existing models for faithful extractive rationalization on information-extraction style tasks, do not extend as reliably to "reasoning" tasks requiring free-text rationales.
We turn to models that jointly predict and rationalize, a class of widely used high-performance models for free-text rationalization whose faithfulness is not yet established.
arXiv Detail & Related papers (2020-10-24T03:40:56Z) - Cost-Sensitive Portfolio Selection via Deep Reinforcement Learning [100.73223416589596]
We propose a cost-sensitive portfolio selection method with deep reinforcement learning.
Specifically, a novel two-stream portfolio policy network is devised to extract both price series patterns and asset correlations.
A new cost-sensitive reward function is developed to maximize the accumulated return and constrain both costs via reinforcement learning; a minimal illustrative sketch of such a reward follows after this list.
arXiv Detail & Related papers (2020-03-06T06:28:17Z)
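For the cost-sensitive reward mentioned in the last entry, here is a hedged, self-contained sketch under assumed conventions (log-return reward, proportional transaction cost on turnover, flat holding cost); the weights, cost rates, and price relatives are illustrative and not taken from the paper.

```python
# Illustrative sketch of a cost-sensitive portfolio reward: per-step log return
# penalized by a proportional transaction cost on rebalanced weight and a flat
# holding cost. All symbols and rates are assumptions for this sketch, not the
# paper's exact formulation.
import numpy as np

def cost_sensitive_reward(w_prev: np.ndarray, w_new: np.ndarray,
                          price_relatives: np.ndarray,
                          tc_rate: float = 0.0025, hc_rate: float = 0.0001) -> float:
    gross = float(np.dot(w_new, price_relatives))   # portfolio growth factor this step
    turnover = float(np.abs(w_new - w_prev).sum())  # fraction of wealth rebalanced
    net = gross - tc_rate * turnover - hc_rate      # subtract transaction + holding costs
    return float(np.log(max(net, 1e-8)))            # log return, floored for safety

# Example with three assets: shift 10% of weight toward the asset that rose 2%.
r = cost_sensitive_reward(np.array([0.4, 0.3, 0.3]),
                          np.array([0.5, 0.3, 0.2]),
                          np.array([1.02, 1.00, 0.99]))
print(f"cost-sensitive reward: {r:.5f}")
```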