Incentivizing Quality Text Generation via Statistical Contracts
- URL: http://arxiv.org/abs/2406.11118v1
- Date: Mon, 17 Jun 2024 00:30:58 GMT
- Title: Incentivizing Quality Text Generation via Statistical Contracts
- Authors: Eden Saig, Ohad Einav, Inbal Talgam-Cohen
- Abstract summary: We propose a pay-for-performance, contract-based framework for incentivizing quality.
We study a principal-agent game where the agent generates text using costly inference.
We find that cost-robust contracts sacrifice only a marginal increase in objective value compared to their cost-aware counterparts.
- Score: 7.303977308530667
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While the success of large language models (LLMs) increases demand for machine-generated text, current pay-per-token pricing schemes create a misalignment of incentives known in economics as moral hazard: Text-generating agents have strong incentive to cut costs by preferring a cheaper model over the cutting-edge one, and this can be done "behind the scenes" since the agent performs inference internally. In this work, we approach this issue from an economic perspective, by proposing a pay-for-performance, contract-based framework for incentivizing quality. We study a principal-agent game where the agent generates text using costly inference, and the contract determines the principal's payment for the text according to an automated quality evaluation. Since standard contract theory is inapplicable when internal inference costs are unknown, we introduce cost-robust contracts. As our main theoretical contribution, we characterize optimal cost-robust contracts through a direct correspondence to optimal composite hypothesis tests from statistics, generalizing a result of Saig et al. (NeurIPS'23). We evaluate our framework empirically by deriving contracts for a range of objectives and LLM evaluation benchmarks, and find that cost-robust contracts sacrifice only a marginal increase in objective value compared to their cost-aware counterparts.
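For intuition on the contract framework described above, the sketch below works through the classical cost-aware baseline in a minimal version of the setting: one binary automated quality check and two agent actions, running the cheap model or the cutting-edge one. The function name, the limited-liability assumption, and all numbers are illustrative and not taken from the paper, which instead studies cost-robust contracts where the inference costs are unknown to the principal.

```python
# Illustrative sketch only (not the paper's construction): the classical
# cost-AWARE contract for a two-action moral-hazard setting with a binary
# automated quality evaluation and limited liability (no negative payments).
# p_cheap / p_costly: assumed pass rates of the cheap vs. cutting-edge model.
# c_cheap / c_costly: the agent's per-generation inference costs (known here;
# the paper's cost-robust contracts drop exactly this assumption).

def min_pass_bonus(p_cheap: float, p_costly: float,
                   c_cheap: float, c_costly: float) -> float:
    """Smallest payment t on a 'pass' outcome (0 on 'fail') under which the
    costly model is the agent's best response:
        p_costly * t - c_costly >= p_cheap * t - c_cheap.
    """
    assert p_costly > p_cheap, "the better model must pass the evaluation more often"
    assert c_costly >= c_cheap, "the better model should not be cheaper to run"
    return (c_costly - c_cheap) / (p_costly - p_cheap)

# Made-up example: pass rates 0.8 vs. 0.5, inference costs 0.03 vs. 0.01 per query.
t = min_pass_bonus(p_cheap=0.5, p_costly=0.8, c_cheap=0.01, c_costly=0.03)
print(f"pay {t:.4f} per passing generation, 0 otherwise")  # ~0.0667
```

In the paper's cost-robust setting the principal does not know c_cheap or c_costly; the main result characterizes optimal contracts through a correspondence to optimal composite hypothesis tests, and the experiments report that these contracts lose only a marginal amount of objective value relative to cost-aware ones like the sketch above.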
Related papers
- Contractual Reinforcement Learning: Pulling Arms with Invisible Hands [68.77645200579181]
We propose a theoretical framework for aligning economic interests of different stakeholders in the online learning problems through contract design.
For the planning problem, we design an efficient dynamic programming algorithm to determine the optimal contracts against the far-sighted agent.
For the learning problem, we introduce a generic design of no-regret learning algorithms to untangle the challenges from robust design of contracts to the balance of exploration and exploitation.
arXiv Detail & Related papers (2024-07-01T16:53:00Z) - New Perspectives in Online Contract Design [2.296475290901356]
This work studies the repeated principal-agent problem from an online learning perspective.
The principal's goal is to learn the optimal contract that maximizes her utility through repeated interactions.
arXiv Detail & Related papers (2024-03-11T20:28:23Z) - $\texttt{COSMIC}$: Mutual Information for Task-Agnostic Summarization Evaluation [39.287235598507294]
We propose a novel task-oriented evaluation approach that assesses summarizers based on their capacity to produce summaries that are useful for downstream tasks, while preserving task outcomes.
We introduce $\texttt{COSMIC}$ as a practical implementation of this metric, demonstrating its strong correlation with human judgment-based metrics and its effectiveness in predicting downstream task performance.
arXiv Detail & Related papers (2024-02-29T18:51:23Z) - Incentivized Truthful Communication for Federated Bandits [61.759855777522255]
We propose an incentive compatible (i.e., truthful) communication protocol, named Truth-FedBan.
We show that Truth-FedBan still guarantees the sub-linear regret and communication cost without any overheads.
arXiv Detail & Related papers (2024-02-07T00:23:20Z) - Assistive Large Language Model Agents for Socially-Aware Negotiation Dialogues [47.977032883078664]
We develop assistive agents based on Large Language Models (LLMs).
We simulate business negotiations by letting two LLM-based agents engage in role play.
A third LLM acts as a remediator agent to rewrite utterances violating norms for improving negotiation outcomes.
arXiv Detail & Related papers (2024-01-29T09:07:40Z) - Sentiment Analysis through LLM Negotiations [58.67939611291001]
A standard paradigm for sentiment analysis is to rely on a single LLM and make the decision in a single round.
This paper introduces a multi-LLM negotiation framework for sentiment analysis.
arXiv Detail & Related papers (2023-11-03T12:35:29Z) - Estimating and Incentivizing Imperfect-Knowledge Agents with Hidden Rewards [4.742123770879715]
In practice, incentive providers often cannot observe the reward realizations of incentivized agents.
This paper explores a repeated adverse selection game between a self-interested learning agent and a learning principal.
We introduce an estimator whose only input is the history of principal's incentives and agent's choices.
arXiv Detail & Related papers (2023-08-13T08:12:01Z) - Delegated Classification [21.384062337682185]
We propose a theoretical framework for incentive-aware delegation of machine learning tasks.
We define budget-optimal contracts and prove they take a simple threshold form under reasonable assumptions.
Empirically, we demonstrate that budget-optimal contracts can be constructed using small-scale data.
arXiv Detail & Related papers (2023-06-20T11:59:03Z) - ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts [39.75232199445175]
We propose "document-level natural language inference (NLI) for contracts"
A system is given a set of hypotheses and a contract, and it is asked to classify whether each hypothesis is "entailed by", "contradicting to" or "not mentioned by" (neutral to) the contract.
We release the largest corpus to date consisting of 607 annotated contracts.
arXiv Detail & Related papers (2021-10-05T03:22:31Z) - Measuring Association Between Labels and Free-Text Rationales [60.58672852655487]
In interpretable NLP, we require faithful rationales that reflect the model's decision-making process for an explained instance.
We demonstrate that pipelines, existing models for faithful extractive rationalization on information-extraction style tasks, do not extend as reliably to "reasoning" tasks requiring free-text rationales.
We turn to models that jointly predict and rationalize, a class of widely used high-performance models for free-text rationalization whose faithfulness is not yet established.
arXiv Detail & Related papers (2020-10-24T03:40:56Z) - Cost-Sensitive Portfolio Selection via Deep Reinforcement Learning [100.73223416589596]
We propose a cost-sensitive portfolio selection method with deep reinforcement learning.
Specifically, a novel two-stream portfolio policy network is devised to extract both price series patterns and asset correlations.
A new cost-sensitive reward function is developed to maximize the accumulated return and constrain both costs via reinforcement learning; a minimal illustrative sketch of such a reward follows after this list.
arXiv Detail & Related papers (2020-03-06T06:28:17Z)
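For the cost-sensitive reward mentioned in the last entry, here is a hedged, self-contained sketch under assumed conventions (log-return reward, proportional transaction cost on turnover, flat holding cost); the weights, cost rates, and price relatives are illustrative and not taken from the paper.

```python
# Illustrative sketch of a cost-sensitive portfolio reward: per-step log return
# penalized by a proportional transaction cost on rebalanced weight and a flat
# holding cost. All symbols and rates are assumptions for this sketch, not the
# paper's exact formulation.
import numpy as np

def cost_sensitive_reward(w_prev: np.ndarray, w_new: np.ndarray,
                          price_relatives: np.ndarray,
                          tc_rate: float = 0.0025, hc_rate: float = 0.0001) -> float:
    gross = float(np.dot(w_new, price_relatives))   # portfolio growth factor this step
    turnover = float(np.abs(w_new - w_prev).sum())  # fraction of wealth rebalanced
    net = gross - tc_rate * turnover - hc_rate      # subtract transaction + holding costs
    return float(np.log(max(net, 1e-8)))            # log return, floored for safety

# Example with three assets: shift 10% of weight toward the asset that rose 2%.
r = cost_sensitive_reward(np.array([0.4, 0.3, 0.3]),
                          np.array([0.5, 0.3, 0.2]),
                          np.array([1.02, 1.00, 0.99]))
print(f"cost-sensitive reward: {r:.5f}")
```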