LLM Collusion
- URL: http://arxiv.org/abs/2601.01279v1
- Date: Sat, 03 Jan 2026 20:38:21 GMT
- Title: LLM Collusion
- Authors: Shengyu Cao, Ming Hu
- Abstract summary: Large language models (LLMs) can facilitate collusion in a duopoly when both sellers rely on the same pre-trained model. We show that configuring LLMs for robustness and reproducibility can induce collusion via a phase transition.
- Score: 5.363252654303049
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We study how delegating pricing to large language models (LLMs) can facilitate collusion in a duopoly when both sellers rely on the same pre-trained model. The LLM is characterized by (i) a propensity parameter capturing its internal bias toward high-price recommendations and (ii) an output-fidelity parameter measuring how tightly outputs track that bias; the propensity evolves through retraining. We show that configuring LLMs for robustness and reproducibility can induce collusion via a phase transition: there exists a critical output-fidelity threshold that pins down long-run behavior. Below it, competitive pricing is the unique long-run outcome. Above it, the system is bistable, with competitive and collusive pricing both locally stable and the realized outcome determined by the model's initial preference. The collusive regime resembles tacit collusion: prices are elevated on average, yet occasional low-price recommendations provide plausible deniability. With perfect fidelity, full collusion emerges from any interior initial condition. For finite training batches of size $b$, infrequent retraining (driven by computational costs) further amplifies collusion: conditional on starting in the collusive basin, the probability of collusion approaches one as $b$ grows, since larger batches dampen stochastic fluctuations that might otherwise tip the system toward competition. The indeterminacy region shrinks at rate $O(1/\sqrt{b})$.
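The abstract's fidelity threshold and bistability can be illustrated with a deliberately stylized mean-field sketch. The update rule below is an assumption for illustration only, not the paper's model: each retraining step pulls the high-price propensity $p$ toward a fidelity-amplified version of itself, and a bifurcation appears once the amplification gain crosses a critical value (here, gain 4).

```python
import math

def retrain_step(p, gain):
    """One stylized retraining update: the high-price propensity p is pulled
    toward a fidelity-amplified image of itself. The sigmoid map is a
    hypothetical stand-in for the paper's propensity dynamics."""
    return 1.0 / (1.0 + math.exp(-gain * (p - 0.5)))

def long_run_propensity(p0, gain, steps=500):
    """Iterate retraining from initial propensity p0 to its long-run limit."""
    p = p0
    for _ in range(steps):
        p = retrain_step(p, gain)
    return p

# Low output fidelity (gain below the critical value 4): every initial
# condition converges to the same interior fixed point -- a unique outcome.
low = [long_run_propensity(p0, gain=2.0) for p0 in (0.2, 0.8)]

# High output fidelity (gain above 4): bistable -- the initial propensity
# decides whether the system settles near 0 (competitive) or 1 (collusive).
hi_lo = long_run_propensity(0.2, gain=8.0)
hi_hi = long_run_propensity(0.8, gain=8.0)
```

The map's derivative at the symmetric fixed point is gain/4, so the fixed point destabilizes exactly at gain 4 — a pitchfork-style transition that mirrors the abstract's critical output-fidelity threshold, though the paper's actual dynamics and threshold are not reproduced here.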
Related papers
- Information Fidelity in Tool-Using LLM Agents: A Martingale Analysis of the Model Context Protocol [69.11739400975445]
We introduce the first theoretical framework for analyzing error accumulation in Model Context Protocol (MCP) agents. We show that cumulative distortion exhibits linear growth and high-probability deviations bounded by $O(\sqrt{T})$. Key findings include: semantic weighting reduces distortion by 80%, and periodic re-grounding approximately every 9 steps suffices for error control.
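The role of periodic re-grounding can be sketched with a toy simulation — a bounded-increment random walk standing in for per-step distortion, with resets every few steps. This is an illustrative assumption, not the paper's martingale model, and the step bound and reset interval are hypothetical:

```python
import random

def simulate_distortion(T, step, reground_every=None, seed=0):
    """Stylized error accumulation: each tool call adds a bounded random
    distortion; re-grounding resets the accumulated distortion to zero.
    Returns the worst-case absolute distortion seen along the trajectory."""
    rng = random.Random(seed)
    d, worst = 0.0, 0.0
    for t in range(1, T + 1):
        d += rng.uniform(-step, step)
        worst = max(worst, abs(d))
        if reground_every and t % reground_every == 0:
            d = 0.0
    return worst

free = simulate_distortion(10_000, step=1.0)                     # drifts like sqrt(T)
grounded = simulate_distortion(10_000, step=1.0, reground_every=9)
```

By construction the re-grounded trajectory can never accumulate more than 9 steps of distortion between resets, while the free walk's typical deviation grows with the horizon — the qualitative effect behind the blurb's "re-grounding every 9 steps" finding.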
arXiv Detail & Related papers (2026-02-10T21:08:53Z)
- Phase Transition for Budgeted Multi-Agent Synergy [41.486076708302456]
Multi-agent systems can improve reliability, yet under a fixed inference budget they often help, saturate, or even collapse. We develop a minimal and calibratable theory that predicts these regimes from three binding constraints of modern agent stacks.
arXiv Detail & Related papers (2026-01-24T05:32:50Z)
- Learning Shrinks the Hard Tail: Training-Dependent Inference Scaling in a Solvable Linear Model [2.7074235008521246]
We analyze neural scaling laws in a solvable model of last-layer fine-tuning where targets have intrinsic, instance-heterogeneous difficulty. We show that learning shrinks the "hard tail" of the error distribution.
arXiv Detail & Related papers (2026-01-07T10:00:17Z)
- Semantic Faithfulness and Entropy Production Measures to Tame Your LLM Demons and Manage Hallucinations [0.0]
We propose two new metrics for faithfulness evaluation using insights from information theory and thermodynamics. We model Question-Context-Answer (QCA) triplets as probability distributions over shared topics. We show that high faithfulness generally implies low entropy production.
arXiv Detail & Related papers (2025-12-04T03:47:37Z)
- ZIP-RC: Optimizing Test-Time Compute via Zero-Overhead Joint Reward-Cost Prediction [57.799425838564]
We present ZIP-RC, an adaptive inference method that equips models with zero-overhead inference-time predictions of reward and cost. ZIP-RC improves accuracy by up to 12% over majority voting at equal or lower average cost.
arXiv Detail & Related papers (2025-12-01T09:44:31Z)
- Tacit Bidder-Side Collusion: Artificial Intelligence in Dynamic Auctions [0.0]
We study whether large language models acting as autonomous bidders can tacitly collude by coordinating when to accept platform-posted payouts in repeated Dutch auctions. We present a minimal repeated-auction model that yields a simple incentive-compatibility condition and a closed-form threshold for sustainable collusion in Nash equilibrium.
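The paper's auction-specific threshold is not given in this blurb; the textbook analogue is the grim-trigger condition from repeated games, where collusion is sustainable when the discount factor $\delta$ satisfies $\delta \ge (\pi_{\text{dev}} - \pi_{\text{coll}}) / (\pi_{\text{dev}} - \pi_{\text{pun}})$. The payoff numbers below are hypothetical:

```python
def grim_trigger_threshold(pi_coll, pi_dev, pi_pun):
    """Minimal discount factor for which the per-period collusive payoff
    pi_coll beats a one-shot deviation payoff pi_dev followed by the
    punishment payoff pi_pun forever. Standard folk-theorem algebra,
    not the paper's auction-specific closed form."""
    return (pi_dev - pi_coll) / (pi_dev - pi_pun)

# Example: split profit of 10 each period, deviation grabs 18 once,
# punishment phase yields 4 per period thereafter.
delta_min = grim_trigger_threshold(pi_coll=10, pi_dev=18, pi_pun=4)
```

The condition comes from comparing the discounted collusive stream $\pi_{\text{coll}}/(1-\delta)$ with deviating once and being punished forever, $\pi_{\text{dev}} + \delta\,\pi_{\text{pun}}/(1-\delta)$.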
arXiv Detail & Related papers (2025-11-26T18:32:18Z)
- LLMs Can Get "Brain Rot"! [68.08198331505695]
Continual exposure to junk web text induces lasting cognitive decline in large language models (LLMs). We run controlled experiments on real Twitter/X corpora, constructing junk and reversely controlled datasets. Results provide significant, multi-perspective evidence that data quality is a causal driver of LLM capability decay.
arXiv Detail & Related papers (2025-10-15T13:28:49Z)
- The Alignment Bottleneck [0.0]
We model the loop as a two-stage cascade $U \to H \to Y$ given $S$, with cognitive capacity $C_{\text{cog}|S}$ and average total capacity $\bar{C}_{\text{tot}|S}$. It pairs a data-size-independent Fano lower bound proved on a separable codebook mixture with a PAC-Bayes upper bound whose KL term is controlled by the same channel via $m\,\bar{C}_{\text{tot}|S}$.
arXiv Detail & Related papers (2025-09-19T12:38:30Z)
- Decision from Suboptimal Classifiers: Excess Risk Pre- and Post-Calibration [52.70324949884702]
We quantify the excess risk incurred using approximate posterior probabilities in batch binary decision-making. We identify regimes where recalibration alone addresses most of the regret, and regimes where the regret is dominated by the grouping loss. On NLP experiments, we show that these quantities identify when the expected gain of more advanced post-training is worth the operational cost.
arXiv Detail & Related papers (2025-03-23T10:52:36Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions. We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance. Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Doubly Robust Distributionally Robust Off-Policy Evaluation and Learning [59.02006924867438]
Off-policy evaluation and learning (OPE/L) use offline observational data to make better decisions, but are sensitive to distribution shift.
Recent work proposed distributionally robust OPE/L (DROPE/L) to remedy this, but the proposal relies on inverse-propensity weighting.
We propose the first DR algorithms for DROPE/L with KL-divergence uncertainty sets.
arXiv Detail & Related papers (2022-02-19T20:00:44Z)
- On Dynamic Pricing with Covariates [6.6543199581017625]
We show that UCB and Thompson sampling-based pricing algorithms can achieve an $O(d\sqrt{T}\log T)$ regret upper bound.
Our upper bound on the regret matches the lower bound up to logarithmic factors.
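A stripped-down version of UCB-style pricing can be sketched over a discrete price grid. This is a generic UCB1 illustration under an assumed deterministic linear demand curve — the paper's algorithm handles covariates and is not reproduced here:

```python
import math

def ucb_pricing(prices, demand_fn, rounds):
    """UCB1 over a discrete price grid: pick the price maximizing empirical
    mean revenue plus an exploration bonus, then return the price with the
    best empirical mean. A simplification of covariate-based pricing."""
    n = [0] * len(prices)       # pull counts per price
    rev = [0.0] * len(prices)   # cumulative revenue per price
    for t in range(1, rounds + 1):
        if t <= len(prices):
            i = t - 1           # try each price once first
        else:
            i = max(range(len(prices)),
                    key=lambda j: rev[j] / n[j]
                    + math.sqrt(2 * math.log(t) / n[j]))
        n[i] += 1
        rev[i] += prices[i] * demand_fn(prices[i])
    means = [rev[j] / n[j] for j in range(len(prices))]
    return prices[max(range(len(prices)), key=means.__getitem__)]

# Hypothetical linear demand d(p) = 1 - p: revenue p(1-p) peaks at p = 0.5.
best = ucb_pricing([0.2, 0.5, 0.8], lambda p: 1.0 - p, rounds=300)
```

With noiseless demand the empirical means are exact, so the procedure recovers the revenue-maximizing grid price; the $O(d\sqrt{T}\log T)$ regret bound in the blurb concerns the harder stochastic, covariate-dependent setting.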
arXiv Detail & Related papers (2021-12-25T16:30:13Z)
- An Exponential Lower Bound for Linearly-Realizable MDPs with Constant Suboptimality Gap [66.75488143823337]
We show that an exponential sample complexity lower bound still holds even if a constant suboptimality gap is assumed.
Perhaps surprisingly, this implies an exponential separation between the online RL setting and the generative model setting.
arXiv Detail & Related papers (2021-03-23T17:05:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.