Human Supervision as an Information Bottleneck: A Unified Theory of Error Floors in Human-Guided Learning
- URL: http://arxiv.org/abs/2602.23446v1
- Date: Thu, 26 Feb 2026 19:11:32 GMT
- Title: Human Supervision as an Information Bottleneck: A Unified Theory of Error Floors in Human-Guided Learning
- Authors: Alejandro Rodriguez Dominguez
- Abstract summary: We argue that these limitations reflect structural properties of the supervision channel rather than model scale or optimization. We develop a unified theory showing that whenever the human supervision channel is not sufficient for a latent evaluation target, it acts as an information-reducing channel.
- Score: 51.56484100374058
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models are trained primarily on human-generated data and feedback, yet they exhibit persistent errors arising from annotation noise, subjective preferences, and the limited expressive bandwidth of natural language. We argue that these limitations reflect structural properties of the supervision channel rather than model scale or optimization. We develop a unified theory showing that whenever the human supervision channel is not sufficient for a latent evaluation target, it acts as an information-reducing channel that induces a strictly positive excess-risk floor for any learner dominated by it. We formalize this Human-Bounded Intelligence limit and show that across six complementary frameworks (operator theory, PAC-Bayes, information theory, causal inference, category theory, and game-theoretic analyses of reinforcement learning from human feedback), non-sufficiency yields strictly positive lower bounds arising from the same structural decomposition into annotation noise, preference distortion, and semantic compression. The theory explains why scaling alone cannot eliminate persistent human-aligned errors and characterizes conditions under which auxiliary non-human signals (e.g., retrieval, program execution, tools) increase effective supervision capacity and collapse the floor by restoring information about the latent target. Experiments on real preference data, synthetic known-target tasks, and externally verifiable benchmarks confirm the predicted structural signatures: human-only supervision exhibits a persistent floor, while sufficiently informative auxiliary channels strictly reduce or eliminate excess error.
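To make the claimed mechanism concrete, here is a minimal toy simulation (our illustration, not the paper's code; all names and parameters are made up). A three-valued latent target Y is supervised through a binary, noisy human channel, so any learner dominated by that channel inherits an error floor from the merged classes; an auxiliary channel that reveals the true target on a small subset restores sufficiency and collapses the floor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative only). The latent target Y takes three values and
# the feature X identifies Y almost surely.
n = 60_000
y = rng.integers(0, 3, n)
x = y + 0.1 * rng.normal(size=n)

# Human supervision channel: annotators can only express a binary judgment
# (semantic compression, H = 1{Y > 0}) and flip it with probability eps
# (annotation noise).
eps = 0.10
h = (y > 0).astype(int) ^ (rng.random(n) < eps)

# A learner dominated by the human channel: it maps each region of X to the
# majority human label there, so it can never distinguish Y=1 from Y=2.
bins = np.clip(np.round(x).astype(int), 0, 2)
pred_human = np.zeros(n, dtype=int)
for b in range(3):
    m = bins == b
    pred_human[m] = int(h[m].mean() > 0.5)      # outputs live in {0, 1}
err_floor = np.mean(pred_human != y)            # stuck near 1/3

# Auxiliary channel (e.g., program execution): reveals the exact target on a
# small subset, restoring sufficiency; the learner recalibrates each region.
aux = rng.choice(n, size=300, replace=False)
pred_aux = np.zeros(n, dtype=int)
for b in range(3):
    m_aux = bins[aux] == b
    pred_aux[bins == b] = np.bincount(y[aux][m_aux], minlength=3).argmax()
err_aux = np.mean(pred_aux != y)

print(f"human-only supervision error: {err_floor:.3f} (floor from compression+noise)")
print(f"with auxiliary channel:       {err_aux:.3f} (floor collapses)")
```

Increasing n leaves the first number pinned near 1/3, mirroring the paper's claim that scale alone cannot remove the floor; only the auxiliary channel does.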
Related papers
- Understanding Degradation with Vision Language Model [56.09241449206817]
Understanding visual degradations is a critical yet challenging problem in computer vision. We introduce DU-VLM, a multimodal chain-of-thought model trained with supervised fine-tuning and reinforcement learning. We also introduce DU-110k, a large-scale dataset comprising 110,000 clean-degraded pairs with grounded physical annotations.
arXiv Detail & Related papers (2026-02-04T13:51:15Z)
- Towards Minimal Causal Representations for Human Multimodal Language Understanding [20.44307628909198]
We introduce a Causal Multimodal Information Bottleneck (CaMIB) model that leverages causal principles rather than traditional likelihood objectives. To ensure global consistency of causal features, we incorporate an instrumental variable constraint. Experiments on multimodal sentiment analysis, humor detection, and sarcasm detection, along with OOD test sets, demonstrate the effectiveness of CaMIB.
arXiv Detail & Related papers (2025-09-26T03:04:23Z)
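For readers unfamiliar with the bottleneck objective mentioned above, here is a generic variational-IB loss sketch (a standard textbook formulation, not CaMIB's actual model; shapes and names are illustrative).

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), per sample."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def ib_loss(task_nll, mu, logvar, beta=1e-3):
    """Variational information-bottleneck objective: predict the task
    (negative log-likelihood term) while compressing the stochastic
    representation Z toward the prior (KL term). beta sets the trade-off."""
    return task_nll + beta * gaussian_kl(mu, logvar)

# Toy usage: a batch of 4 samples with an 8-dimensional representation.
rng = np.random.default_rng(0)
mu = rng.normal(size=(4, 8))
logvar = 0.1 * rng.normal(size=(4, 8))
task_nll = rng.random(4)            # stand-in for per-sample task loss
print(ib_loss(task_nll, mu, logvar))
```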
- Knowledge Collapse in LLMs: When Fluency Survives but Facts Fail under Recursive Synthetic Training [2.094557609248011]
Large language models increasingly rely on synthetic data due to the scarcity of human-written content. Recursive training on model-generated outputs leads to model collapse, a degenerative process that threatens factual reliability.
arXiv Detail & Related papers (2025-09-05T04:29:15Z)
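The recursive-synthetic-training failure mode summarized above can be reproduced in a few lines: repeatedly fitting a Gaussian to its own samples (a well-known toy for model collapse, not this paper's experiment) drives the fitted spread toward zero, i.e., rare "facts" in the tails disappear first.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "human" data from the true distribution N(0, 1).
n, generations = 50, 1000
data = rng.normal(0.0, 1.0, n)

stds = []
for g in range(generations):
    # "Train" on the current data: here, fit a Gaussian by maximum likelihood.
    mu, sigma = data.mean(), data.std()
    stds.append(sigma)
    # The next generation trains purely on model samples: recursive synthetic
    # training with no fresh human data in the loop.
    data = rng.normal(mu, sigma, n)

print(f"fitted std, generation 0:    {stds[0]:.3f}")
print(f"fitted std, generation {generations - 1}: {stds[-1]:.2e}")
```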
- The wall confronting large language models [0.0]
We show that the scaling laws which determine the performance of large language models severely limit their ability to improve the uncertainty of their predictions. We argue that the very mechanism which fuels much of the learning power of LLMs may well be at the root of their propensity to produce error pileup.
arXiv Detail & Related papers (2025-07-25T22:48:37Z)
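This "wall" is the floor in the headline paper's terms. A quick illustrative calculation (made-up constants) shows why scaling stalls once an irreducible term c > 0 enters the power law E(N) = a·N^(-b) + c.

```python
# Illustrative power-law excess-error curves, E(N) = a * N**(-b) + c, with
# made-up constants; c > 0 plays the role of the irreducible floor.
a, b = 2.0, 0.3
for c, label in [(0.00, "sufficient supervision (c=0)   "),
                 (0.05, "human-bounded floor (c=0.05)   ")]:
    errs = "  ".join(f"{a * N ** (-b) + c:.4f}" for N in (1e6, 1e9, 1e12))
    print(label, errs)
# A 1,000,000x increase in N barely moves the second curve once it nears c.
```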
- Robust Molecular Property Prediction via Densifying Scarce Labeled Data [53.24886143129006]
In drug discovery, the compounds most critical for advancing research often lie beyond the training set. We propose a novel bilevel optimization approach that leverages unlabeled data to interpolate between in-distribution (ID) and out-of-distribution (OOD) data.
arXiv Detail & Related papers (2025-06-13T15:27:40Z)
- On the Surprising Effectiveness of Large Learning Rates under Standard Width Scaling [15.769249369390884]
We show that optimal learning rates decay more slowly than theoretically predicted, and that networks exhibit both stable training and non-trivial feature learning even at very large widths. In particular, we show that under cross-entropy (CE) loss, the unstable regime comprises two distinct sub-regimes: a catastrophically unstable regime and a more benign controlled-divergence regime. Our empirical evidence suggests that width-scaling considerations are surprisingly useful for predicting empirically maximal stable learning-rate exponents.
arXiv Detail & Related papers (2025-05-28T15:40:48Z)
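As background to the entry above, the textbook stability picture it interrogates can be checked numerically: for gradient descent on a quadratic whose curvature grows with width (a crude stand-in for sharpness growth under standard parameterization; entirely our toy, not the paper's setup), the maximal stable learning rate shrinks like 2/curvature.

```python
import numpy as np

def max_stable_lr(width, lr_grid=np.logspace(-4, 1, 60), steps=500):
    """Largest GD learning rate that does not diverge on the toy quadratic
    loss 0.5 * lam * w**2, with curvature lam growing linearly in width
    (an illustrative stand-in for standard-parameterization sharpness).
    Classical theory predicts lr_max = 2 / lam."""
    lam = float(width)
    best = None
    for lr in lr_grid:                 # ascending grid
        w = 1.0
        for _ in range(steps):
            w -= lr * lam * w          # gradient of 0.5*lam*w**2 is lam*w
            if abs(w) > 1e6:           # diverged
                break
        if abs(w) <= 1e6:
            best = lr
    return best

for width in (64, 256, 1024, 4096):
    print(f"width {width:5d}: empirical lr_max ~ {max_stable_lr(width):.2e}, "
          f"2/width = {2 / width:.2e}")
```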
- SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch tokens they extract.
We propose a novel smooth regularization applied during fine-tuning that rectifies these structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z)
- Neural Network Approximation for Pessimistic Offline Reinforcement Learning [17.756108291816908]
We present a non-asymptotic estimation error bound for pessimistic offline RL using general neural network approximation.
Our result shows that the estimation error consists of two parts: the first converges to zero at a desired rate in the sample size with partially controllable concentrability, and the second becomes negligible if the residual constraint is tight.
arXiv Detail & Related papers (2023-12-19T05:17:27Z)
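The pessimism principle behind the bound above, stripped to its simplest bandit form (our sketch, not the paper's estimator): penalize each action's empirical value by an uncertainty term that shrinks with coverage, so poorly covered actions are not selected on the strength of a few lucky samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Offline bandit data: action 0 is well covered, action 1 barely covered.
true_mean = {0: 0.60, 1: 0.50}
counts = {0: 500, 1: 3}
rewards = {a: rng.normal(true_mean[a], 1.0, counts[a]) for a in (0, 1)}

beta = 1.0  # pessimism strength (illustrative)
for a in (0, 1):
    mean = rewards[a].mean()
    # Pessimistic value: subtract an uncertainty penalty that shrinks with
    # coverage, so a few lucky samples cannot make a rarely seen action win.
    lcb = mean - beta / np.sqrt(counts[a])
    print(f"action {a}: empirical mean {mean:+.3f}, pessimistic value {lcb:+.3f}")
```

Acting greedily on the pessimistic values rather than the raw means is the bandit-sized analogue of the pessimistic offline RL estimator analyzed in the paper.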
- Understanding Robust Overfitting from the Feature Generalization Perspective [61.770805867606796]
Adversarial training (AT) constructs robust neural networks by incorporating adversarial perturbations into natural data.
It is plagued by the issue of robust overfitting (RO), which severely damages the model's robustness.
In this paper, we investigate RO from a novel feature generalization perspective.
arXiv Detail & Related papers (2023-10-01T07:57:03Z)
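For context on the adversarial-training setup AT refers to, here is the classic one-step FGSM perturbation on a tiny logistic model (a generic illustration of how the perturbed training inputs are crafted, not this paper's method).

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny logistic-regression setup; w is a fixed "trained" weight vector.
d = 10
w = rng.normal(size=d)
x = rng.normal(size=d)
y = 1.0                                   # true label in {0, 1}

def loss_grad_x(w, x, y):
    """Gradient of the logistic loss with respect to the *input* x."""
    p = 1.0 / (1.0 + np.exp(-w @ x))
    return (p - y) * w

# FGSM: one signed-gradient step of size eps crafts the perturbation; AT
# then trains on x_adv in place of (or alongside) the natural x.
eps = 0.1
x_adv = x + eps * np.sign(loss_grad_x(w, x, y))

p_clean = 1.0 / (1.0 + np.exp(-w @ x))
p_adv = 1.0 / (1.0 + np.exp(-w @ x_adv))
print(f"P(y=1 | clean) = {p_clean:.3f}, P(y=1 | adversarial) = {p_adv:.3f}")
```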
- Neuro-Symbolic Entropy Regularization [78.16196949641079]
In structured prediction, the goal is to jointly predict many output variables that together encode a structured object.
One approach -- entropy regularization -- posits that decision boundaries should lie in low-probability regions.
We propose a loss, neuro-symbolic entropy regularization, that encourages the model to confidently predict a valid object.
arXiv Detail & Related papers (2022-01-25T06:23:10Z)
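The quantity this paper regularizes can be sketched by brute force (the paper computes it tractably with logical circuits; our enumeration is only viable at toy sizes): the entropy of the model's distribution restricted, and renormalized, to structurally valid outputs.

```python
import numpy as np
from itertools import product

def neuro_symbolic_entropy(probs, is_valid):
    """Entropy of the model's output distribution restricted (and
    renormalized) to valid structures. Brute-force enumeration, fine at
    toy sizes only."""
    mass = []
    for bits in product([0, 1], repeat=len(probs)):
        if is_valid(bits):
            p = np.prod([probs[i] if b else 1 - probs[i]
                         for i, b in enumerate(bits)])
            mass.append(p)
    q = np.array(mass) / np.sum(mass)
    return float(-(q * np.log(q + 1e-12)).sum())

# Toy constraint: a valid structure has exactly one variable switched on.
probs = np.array([0.7, 0.2, 0.1])      # independent per-variable marginals
H = neuro_symbolic_entropy(probs, lambda b: sum(b) == 1)
print(f"entropy over valid structures: {H:.3f} (add lambda * H to the task loss)")
```

Penalizing this entropy pushes decision boundaries into low-probability regions while keeping the confident mass on valid objects.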
- An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction [84.49035467829819]
We show that it is possible to better manage the trade-off between conciseness and end-task performance by optimizing a bound on the Information Bottleneck (IB) objective.
Our fully unsupervised approach jointly learns an explainer that predicts sparse binary masks over sentences, and an end-task predictor that considers only the extracted rationale.
arXiv Detail & Related papers (2020-05-01T23:26:41Z)
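A minimal sketch of the kind of objective described in the entry above, assuming a Bernoulli sparsity prior over per-sentence masks (a common IB-style formulation; the paper's exact bound may differ): the explainer's keep-probabilities are pulled toward a sparse prior while the end-task loss on the extracted rationale is minimized.

```python
import numpy as np

def bernoulli_kl(q, pi):
    """Elementwise KL( Bernoulli(q) || Bernoulli(pi) ) over sentence masks."""
    q = np.clip(q, 1e-6, 1 - 1e-6)
    return q * np.log(q / pi) + (1 - q) * np.log((1 - q) / (1 - pi))

def rationale_ib_objective(task_nll, mask_probs, pi=0.2, beta=1.0):
    """IB-style bound for rationale extraction: per-sentence keep
    probabilities are pulled toward a sparse Bernoulli(pi) prior -- the
    knob that controls conciseness -- while the end-task predictor is
    trained on the masked input (task_nll)."""
    return task_nll + beta * bernoulli_kl(mask_probs, pi).sum()

# Toy usage: 5 sentences; the explainer wants to keep sentences 0 and 3.
mask_probs = np.array([0.9, 0.05, 0.1, 0.8, 0.05])
print(f"objective: {rationale_ib_objective(1.2, mask_probs):.3f}")
```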
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.