A Model of Artificial Jagged Intelligence
- URL: http://arxiv.org/abs/2601.07573v1
- Date: Mon, 12 Jan 2026 14:27:30 GMT
- Title: A Model of Artificial Jagged Intelligence
- Authors: Joshua Gans
- Abstract summary: Generative AI systems often display highly uneven performance across tasks that appear "nearby". We call this phenomenon Artificial Jagged Intelligence (AJI). This paper develops a tractable economic model of AJI that treats adoption as an information problem.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative AI systems often display highly uneven performance across tasks that appear "nearby": they can be excellent on one prompt and confidently wrong on another with only small changes in wording or context. We call this phenomenon Artificial Jagged Intelligence (AJI). This paper develops a tractable economic model of AJI that treats adoption as an information problem: users care about local reliability, but typically observe only coarse, global quality signals. In a baseline one-dimensional landscape, truth is a rough Brownian process, and the model "knows" scattered points drawn from a Poisson process. The model interpolates optimally, and the local error is measured by posterior variance. We derive an adoption threshold for a blind user, show that experienced errors are amplified by the inspection paradox, and interpret scaling laws as denser coverage that improves average quality without eliminating jaggedness. We then study mastery and calibration: a calibrated user who can condition on local uncertainty enjoys positive expected value even in domains that fail the blind adoption test. Modelling mastery as learning a reliability map via Gaussian process regression yields a learning-rate bound driven by information gain, clarifying when discovering "where the model works" is slow. Finally, we study how scaling interacts with discoverability: when calibrated signals and user mastery accelerate the harvesting of scale improvements, and when opacity can make gains from scaling effectively invisible.
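The baseline landscape lends itself to a short simulation. The sketch below is a rough illustration of the setup described in the abstract, not the paper's code: it assumes the truth is standard Brownian motion on [0, 1], the model "knows" the truth exactly at points of a Poisson process with rate lam, interpolation uses the Brownian-bridge posterior, and the local error at a query x between neighbouring known points s < t is the posterior variance (x - s)(t - x)/(t - s). The intensity lam, the cost threshold, and all function names are illustrative assumptions.

```python
# Minimal sketch of the AJI landscape model under the assumptions stated above.
import numpy as np

rng = np.random.default_rng(0)

def posterior_variance(x, known, sigma2=1.0):
    """Brownian-bridge posterior variance at query x given sorted known points."""
    i = np.searchsorted(known, x)
    if i == 0 or i == len(known):              # query falls outside the covered range
        d = abs(x - known[0]) if i == 0 else abs(x - known[-1])
        return sigma2 * d                       # plain Brownian extrapolation variance
    s, t = known[i - 1], known[i]
    return sigma2 * (x - s) * (t - x) / (t - s)

def simulate(lam, n_queries=50_000, cost_threshold=0.02):
    """Average experienced error and a blind adoption decision at coverage rate lam."""
    known = np.sort(rng.uniform(0.0, 1.0, rng.poisson(lam)))
    if len(known) < 2:
        return None
    queries = rng.uniform(0.0, 1.0, n_queries)
    errors = np.array([posterior_variance(x, known) for x in queries])
    gaps = np.diff(known)
    return {
        "mean_error": errors.mean(),                     # what a blind user experiences on average
        "adopt": bool(errors.mean() < cost_threshold),   # blind adoption test
        "mean_gap": gaps.mean(),                         # roughly 1 / lam
        "experienced_gap": np.average(gaps, weights=gaps),  # length-biased, roughly 2 / lam
    }

# Denser coverage (larger lam) stands in for scale: average error falls,
# but individual gaps -- local pockets of jaggedness -- never disappear.
for lam in (20, 80, 320):
    print(lam, simulate(lam))
```

Under these assumptions the average error falls roughly like 1/lam as coverage gets denser, while the gap containing a uniformly random query stays about twice the mean gap: a small-scale version of the inspection-paradox amplification mentioned in the abstract.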
Related papers
- Your AI-Generated Image Detector Can Secretly Achieve SOTA Accuracy, If Calibrated [40.02006384527024]
Despite being trained on balanced datasets, existing AI-generated image detectors often exhibit systematic bias at test time.
We propose a theoretically grounded post-hoc calibration framework based on Bayesian decision theory.
Our approach significantly improves robustness without retraining, offering a lightweight and principled solution for reliable and adaptive AI-generated image detection.
arXiv Detail & Related papers (2026-02-02T11:26:37Z)
- Beyond the Loss Curve: Scaling Laws, Active Learning, and the Limits of Learning from Exact Posteriors [8.410613979416203]
We use class-conditional normalizing flows as oracles that make exact posteriors tractable on realistic images.
Our framework reveals that standard metrics hide ongoing learning, mask architectural differences, and cannot diagnose the nature of distribution shift.
arXiv Detail & Related papers (2026-01-30T21:08:55Z)
- Sample Margin-Aware Recalibration of Temperature Scaling [20.87493013833571]
Recent advances in deep learning have significantly improved predictive accuracy.
Modern neural networks remain systematically overconfident, posing risks for deployment in safety-critical scenarios.
We propose a lightweight, data-efficient recalibration method that precisely scales logits based on the margin between the top two logits (a minimal sketch of this idea appears at the end of this list).
arXiv Detail & Related papers (2025-06-30T03:35:05Z)
- Bayesian generative models can flag performance loss, bias, and out-of-distribution image content [15.835055687646507]
Generative models are popular for medical imaging tasks such as anomaly detection, feature extraction, data visualization, or image generation.
Since they are parameterized by deep learning models, they are often sensitive to distribution shifts and unreliable when applied to out-of-distribution data.
We show how pixel-wise uncertainty can detect out-of-distribution image content such as ink, rulers, and patches.
arXiv Detail & Related papers (2025-03-21T18:45:28Z)
- Calibrating Deep Neural Network using Euclidean Distance [5.3612053942581275]
In machine learning, Focal Loss is commonly used to reduce misclassification rates by emphasizing hard-to-classify samples.
High calibration error indicates a misalignment between predicted probabilities and actual outcomes, affecting model reliability.
This research introduces a novel loss function, Focal Calibration Loss (FCL), designed to improve probability calibration while retaining the advantages of Focal Loss in handling difficult samples.
arXiv Detail & Related papers (2024-10-23T23:06:50Z)
- Proximity-Informed Calibration for Deep Neural Networks [49.330703634912915]
ProCal is a plug-and-play algorithm with a theoretical guarantee to adjust sample confidence based on proximity.
We show that ProCal is effective in addressing proximity bias and improving calibration on balanced, long-tail, and distribution-shift settings.
arXiv Detail & Related papers (2023-06-07T16:40:51Z)
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- Certifying Model Accuracy under Distribution Shifts [151.67113334248464]
We present provable robustness guarantees on the accuracy of a model under bounded Wasserstein shifts of the data distribution.
We show that a simple procedure that randomizes the input of the model within a transformation space is provably robust to distributional shifts under the transformation.
arXiv Detail & Related papers (2022-01-28T22:03:50Z)
- Monitoring Model Deterioration with Explainable Uncertainty Estimation via Non-parametric Bootstrap [0.0]
Monitoring machine learning models once they are deployed is challenging.
It is even more challenging to decide when to retrain models in real-case scenarios when labeled data is beyond reach.
In this work, we use non-parametric bootstrapped uncertainty estimates and SHAP values to provide explainable uncertainty estimation.
arXiv Detail & Related papers (2022-01-27T17:23:04Z)
- Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- Efficient remedies for outlier detection with variational autoencoders [8.80692072928023]
Likelihoods computed by deep generative models are a candidate metric for outlier detection with unlabeled data.
We show that a theoretically-grounded correction readily ameliorates a key bias with VAE likelihood estimates.
We also show that the variance of the likelihoods computed over an ensemble of VAEs enables robust outlier detection.
arXiv Detail & Related papers (2021-08-19T16:00:58Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
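As a concrete illustration of the calibration theme running through several of the entries above, here is a minimal sketch in the spirit of the margin-aware recalibration entry: each sample gets its own temperature chosen as a simple function of the margin between its top two logits, fitted on held-out data by minimising negative log-likelihood. The affine form a + b * margin, the Nelder-Mead fitting step, and all names below are illustrative assumptions, not the paper's actual method.

```python
# Illustrative per-sample temperature scaling conditioned on the top-two-logit margin.
import numpy as np
from scipy.optimize import minimize

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def per_sample_temperature(logits, a, b):
    top2 = np.sort(logits, axis=1)[:, -2:]        # two largest logits per sample
    margin = top2[:, 1] - top2[:, 0]              # top-1 minus top-2 logit
    return np.maximum(a + b * margin, 1e-3)       # temperature as a function of the margin

def nll(params, logits, labels):
    a, b = params
    t = per_sample_temperature(logits, a, b)[:, None]
    probs = softmax(logits / t)
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()

def fit(logits_val, labels_val):
    """Fit (a, b) on held-out validation logits by minimising negative log-likelihood."""
    res = minimize(nll, x0=[1.0, 0.0], args=(logits_val, labels_val), method="Nelder-Mead")
    return res.x

# Usage sketch with synthetic logits standing in for a real validation set.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, 2000)
logits = rng.normal(0.0, 1.0, (2000, 10))
logits[np.arange(2000), labels] += 2.0            # make the true class more likely
a, b = fit(logits, labels)
calibrated = softmax(logits / per_sample_temperature(logits, a, b)[:, None])
```

Any monotone map from margin to temperature would serve the same illustrative purpose; the point is only that the temperature is conditioned on a per-sample confidence signal rather than shared globally, which is the contrast with plain temperature scaling.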