Omitted Variable Bias in Language Models Under Distribution Shift
- URL: http://arxiv.org/abs/2602.16784v1
- Date: Wed, 18 Feb 2026 19:00:05 GMT
- Title: Omitted Variable Bias in Language Models Under Distribution Shift
- Authors: Victoria Lin, Louis-Philippe Morency, Eli Ben-Michael
- Abstract summary: We show how distribution shifts in language models can be separated into observable and unobservable components. We introduce a framework that maps the strength of the omitted variables to bounds on the worst-case generalization performance of language models.
- Score: 22.663393629883206
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite their impressive performance on a wide variety of tasks, modern language models remain susceptible to distribution shifts, exhibiting brittle behavior when evaluated on data that differs in distribution from their training data. In this paper, we describe how distribution shifts in language models can be separated into observable and unobservable components, and we discuss how established approaches for dealing with distribution shift address only the former. Importantly, we identify that the resulting omitted variable bias from unobserved variables can compromise both evaluation and optimization in language models. To address this challenge, we introduce a framework that maps the strength of the omitted variables to bounds on the worst-case generalization performance of language models under distribution shift. In empirical experiments, we show that using these bounds directly in language model evaluation and optimization provides more principled measures of out-of-distribution performance, improves true out-of-distribution performance relative to standard distribution shift adjustment methods, and further enables inference about the strength of the omitted variables when target distribution labels are available.
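To make the bound concrete, here is a minimal sketch assuming the omitted variable's strength enters as a multiplicative factor gamma on importance weights that correct for the observed shift (a marginal-sensitivity-style assumption; the function name and the threshold scan are illustrative, not the paper's exact estimator):

```python
import numpy as np

def worst_case_weighted_loss(losses, weights, gamma):
    """Upper bound on the reweighted mean loss when each observed-shift
    importance weight may be scaled by an unobserved factor in [1/gamma, gamma].
    The maximizer of this linear-fractional program is a threshold rule on the
    losses, so scanning all cut points suffices."""
    order = np.argsort(-losses)                  # worst examples first
    l, w = losses[order], weights[order]
    lo, hi = w / gamma, w * gamma
    best = np.average(l, weights=w)              # gamma = 1 recovers the point estimate
    for k in range(len(l) + 1):
        wk = np.concatenate([hi[:k], lo[k:]])    # inflate weight on the k worst examples
        best = max(best, np.average(l, weights=wk))
    return best

rng = np.random.default_rng(0)
losses = rng.exponential(size=500)               # per-example losses on source data
weights = rng.lognormal(sigma=0.3, size=500)     # weights adjusting for the observed shift
print(worst_case_weighted_loss(losses, weights, gamma=1.0))  # standard reweighted estimate
print(worst_case_weighted_loss(losses, weights, gamma=1.5))  # worst case with omitted variables
```

At gamma = 1 the bound collapses to the standard reweighted estimate; larger gamma widens it, which is what makes the resulting out-of-distribution measures conservative rather than point guesses.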
Related papers
- Variational Reasoning for Language Models [93.08197299751197]
We introduce a variational reasoning framework for language models that treats thinking traces as latent variables. We show that rejection sampling finetuning and binary-reward RL, including GRPO, can be interpreted as local forward-KL objectives.
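As a toy illustration of that forward-KL reading (not the paper's setup; the categorical "trace" distribution, reward pattern, and step size below are invented for the example), rejection-sampling finetuning on a softmax model reduces to maximum likelihood on the accepted samples:

```python
import numpy as np

rng = np.random.default_rng(0)
K = 8
theta = np.zeros(K)                              # logits of the trace distribution q_theta
reward = (np.arange(K) % 3 == 0).astype(float)   # invented binary reward over traces

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for _ in range(2000):
    q = softmax(theta)
    z = rng.choice(K, size=64, p=q)              # sample candidate traces
    keep = z[reward[z] == 1]                     # rejection step: keep rewarded traces
    if keep.size == 0:
        continue
    # MLE on the accepted samples; the gradient of their mean log q_theta(z)
    # is (empirical distribution) - q, a stochastic forward-KL update.
    emp = np.bincount(keep, minlength=K) / keep.size
    theta += 0.5 * (emp - q)

print(np.round(softmax(theta), 3))               # mass concentrates on rewarded traces
```

The accepted samples are draws from the reward-filtered posterior, so the MLE step is a stochastic update toward it in forward KL.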
arXiv Detail & Related papers (2025-09-26T17:58:10Z)
- Cross-modality debiasing: using language to mitigate sub-population shifts in imaging [28.88097536026781]
Sub-population shift is a significant source of algorithmic bias and calls for distributional robustness.
Recent studies have found inherent distributional robustness in multimodal foundation models such as the vision-language model CLIP.
We propose leveraging natural language inputs to debias image feature representations and improve worst-case performance on sub-populations.
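One generic way to realize this in numpy, assuming the debiasing amounts to projecting image features off text-derived spurious directions (a common recipe in this literature; the paper's exact transform may differ, and the random vectors below only stand in for CLIP embeddings):

```python
import numpy as np

def project_out(image_feats, text_dirs):
    """Remove the span of text-derived spurious directions from image
    features by orthogonal projection (a generic recipe, not necessarily
    the paper's exact transform)."""
    Q, _ = np.linalg.qr(text_dirs.T)             # orthonormal basis of the spurious subspace
    return image_feats - (image_feats @ Q) @ Q.T

# text_dirs could be, e.g., differences of CLIP text embeddings describing
# sub-populations; random vectors stand in for them here.
rng = np.random.default_rng(0)
feats = rng.normal(size=(10, 512))
dirs = rng.normal(size=(2, 512))
debiased = project_out(feats, dirs)
print(np.abs(debiased @ dirs.T).max())           # ~0: spurious components removed
```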
arXiv Detail & Related papers (2024-02-02T18:54:48Z)
- "Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts [17.381178048938068]
We introduce the problem of attributing performance differences between environments to distribution shifts in the underlying data generating mechanisms.
We derive an importance weighting method for computing the value of an arbitrary set of distributions.
We demonstrate the correctness and utility of our method on synthetic, semi-synthetic, and real-world case studies.
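The reweighting building block can be sketched in a toy two-mechanism example where only the X marginal shifts and the density ratio is known by construction (in practice it must be estimated; the attribution machinery over sets of mechanisms is omitted):

```python
import numpy as np
from scipy.stats import norm

# Env 0 and env 1 share the mechanism Y | X; only the X marginal shifts
# (N(0,1) vs N(1,1)). Reweighting env-0 samples by the density ratio
# p1(x) / p0(x) evaluates the model "as if" X came from env 1.
rng = np.random.default_rng(0)
x0 = rng.normal(0.0, 1.0, 50_000)
y0 = x0 ** 2 + rng.normal(0.0, 0.5, x0.size)     # shared outcome mechanism
loss = (y0 - x0) ** 2                            # loss of a misspecified model y_hat = x

w = norm.pdf(x0, loc=1.0, scale=1.0) / norm.pdf(x0, loc=0.0, scale=1.0)
print(loss.mean())                               # value under env 0's X marginal
print(np.average(loss, weights=w))               # estimated value under env 1's X marginal
```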
arXiv Detail & Related papers (2022-10-19T17:58:09Z)
- Predicting Out-of-Domain Generalization with Neighborhood Invariance [59.05399533508682]
We propose a measure of a classifier's output invariance in a local transformation neighborhood.
Our measure is simple to calculate, does not depend on the test point's true label, and can be applied even in out-of-domain (OOD) settings.
In experiments on benchmarks in image classification, sentiment analysis, and natural language inference, we demonstrate a strong and robust correlation between our measure and actual OOD generalization.
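A minimal sketch of such a measure with an invented API (`predict`, a list of `transforms`): the invariance score is just the fraction of transformed neighbors whose predicted label agrees with the prediction on the point itself, so no true label is needed.

```python
import numpy as np

def neighborhood_invariance(predict, x, transforms, rng):
    """Fraction of transformed copies of x whose predicted label matches the
    prediction on x itself; no true label is needed."""
    base = predict(x[None])[0]
    neighbors = np.stack([t(x, rng) for t in transforms])
    return float(np.mean(predict(neighbors) == base))

# Toy usage: a linear classifier and additive-noise "transformations".
rng = np.random.default_rng(0)
w = rng.normal(size=5)
predict = lambda X: (X @ w > 0).astype(int)
transforms = [lambda x, r: x + 0.1 * r.normal(size=x.shape)] * 20
x = rng.normal(size=5)
print(neighborhood_invariance(predict, x, transforms, rng))
```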
arXiv Detail & Related papers (2022-07-05T14:55:16Z)
- Certifying Model Accuracy under Distribution Shifts [151.67113334248464]
We present provable robustness guarantees on the accuracy of a model under bounded Wasserstein shifts of the data distribution.
We show that a simple procedure that randomizes the input of the model within a transformation space is provably robust to distributional shifts under the transformation.
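A sketch of the randomized procedure (the certification math itself is omitted; the rotation transform and vote count below are illustrative):

```python
import numpy as np

def smoothed_predict(predict, x, sample_transform, n, rng):
    """Majority vote over n randomly transformed copies of x; randomizing
    within the transformation space is what the certificate is built on."""
    votes = predict(np.stack([sample_transform(x, rng) for _ in range(n)]))
    return int(np.bincount(votes).argmax())

# Toy usage: random small rotations of a 2-D input.
def rotate(x, rng):
    a = rng.uniform(-0.2, 0.2)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s], [s, c]]) @ x

rng = np.random.default_rng(0)
predict = lambda X: (X[:, 0] > 0).astype(int)
print(smoothed_predict(predict, np.array([1.0, 0.5]), rotate, n=200, rng=rng))
```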
arXiv Detail & Related papers (2022-01-28T22:03:50Z)
- Predicting with Confidence on Unseen Distributions [90.68414180153897]
We connect domain adaptation and predictive uncertainty literature to predict model accuracy on challenging unseen distributions.
We find that the difference of confidences (DoC) of a classifier's predictions successfully estimates the classifier's performance change over a variety of shifts.
We specifically investigate the distinction between synthetic and natural distribution shifts and observe that, despite its simplicity, DoC consistently outperforms other quantifications of distributional difference.
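DoC itself is a one-liner; here is a sketch with synthetic softmax outputs (the Dirichlet draws only stand in for real model confidences):

```python
import numpy as np

def difference_of_confidences(source_probs, target_probs):
    """DoC: drop in mean max-softmax confidence from source to target."""
    return source_probs.max(axis=1).mean() - target_probs.max(axis=1).mean()

rng = np.random.default_rng(0)
src = rng.dirichlet(np.full(3, 0.3), size=1000)  # peaked: confident in-distribution
tgt = rng.dirichlet(np.full(3, 3.0), size=1000)  # flatter: less confident under shift
print(difference_of_confidences(src, tgt))
```

The simple rule then estimates target accuracy as source accuracy minus DoC.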
arXiv Detail & Related papers (2021-07-07T15:50:18Z)
- The Case for Translation-Invariant Self-Attention in Transformer-Based Language Models [11.148662334602639]
We analyze the position embeddings of existing language models and find strong evidence of translation invariance.
We propose translation-invariant self-attention (TISA), which accounts for the relative position between tokens in an interpretable fashion.
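The invariance property can be sketched with a generic relative-position bias on the attention logits, where position enters only through the offset i - j (TISA's actual parameterization uses interpretable kernels; the unstructured bias vector below is a stand-in):

```python
import numpy as np

def attention_with_relative_bias(Q, K, V, rel_bias):
    """Single-head self-attention whose position information enters only
    through a bias b[i - j] on the logits, so the scores depend on relative
    offsets alone; rel_bias has length 2T - 1."""
    T, d = Q.shape
    logits = Q @ K.T / np.sqrt(d)
    logits = logits + rel_bias[np.arange(T)[:, None] - np.arange(T)[None, :] + T - 1]
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
T, d = 6, 8
Q, K, V = rng.normal(size=(3, T, d))             # toy queries, keys, values
out = attention_with_relative_bias(Q, K, V, rng.normal(size=2 * T - 1))
print(out.shape)                                 # (6, 8)
```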
arXiv Detail & Related papers (2021-06-03T15:56:26Z)
- Estimating Generalization under Distribution Shifts via Domain-Invariant Representations [75.74928159249225]
We use a set of domain-invariant predictors as a proxy for the unknown, true target labels.
The error of the resulting risk estimate depends on the target risk of the proxy model.
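A sketch of the estimator under 0-1 loss, with illustrative names: disagreement with the proxy on unlabeled target data stands in for error against the unknown labels.

```python
import numpy as np

def proxy_target_risk(model_preds, proxy_preds):
    """Disagreement with a domain-invariant proxy predictor on unlabeled
    target data, standing in for the 0-1 error against the unknown labels."""
    return float(np.mean(np.asarray(model_preds) != np.asarray(proxy_preds)))
```

By the triangle inequality for 0-1 loss, the gap between this disagreement and the true target risk is at most the proxy's own target risk, which is the dependence the summary above refers to.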
arXiv Detail & Related papers (2020-07-06T17:21:24Z)
- Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance [70.31427277842239]
We introduce a novel debiasing method called confidence regularization.
It discourages models from exploiting biases while still giving them enough incentive to learn from all the training examples.
We evaluate our method on three NLU tasks and show that, in contrast to its predecessors, it improves the performance on out-of-distribution datasets.
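A sketch of one plausible reading of the scaling step (treat the exact exponent form as an assumption): the teacher distribution is pushed toward uniform in proportion to the bias-only model's confidence on the gold label, so biased examples contribute a weaker but nonzero training signal to the student.

```python
import numpy as np

def soften_teacher(teacher_probs, bias_conf):
    """Push the teacher distribution toward uniform in proportion to the
    bias-only model's confidence on the gold label; at bias_conf = 1 the
    target is fully uniform, at 0 the teacher is kept as-is."""
    scaled = teacher_probs ** (1.0 - bias_conf[:, None])
    return scaled / scaled.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
teacher = rng.dirichlet(np.full(3, 0.5), size=4)   # teacher softmax outputs
bias_conf = np.array([0.0, 0.5, 0.9, 1.0])         # bias model's p(gold) per example
print(np.round(soften_teacher(teacher, bias_conf), 3))
```

A student trained with cross-entropy against these softened targets keeps an incentive to fit every example while gaining little from the bias.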
arXiv Detail & Related papers (2020-05-01T11:22:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.