Fugu-MT Paper Translation (Abstract): Aligning Validation with Deployment: Target-Weighted Cross-Validation for Spatial Prediction

Paper overview: Aligning Validation with Deployment: Target-Weighted Cross-Validation for Spatial Prediction

arxiv url: http://arxiv.org/abs/2603.29981v1
Date: Tue, 31 Mar 2026 16:44:07 GMT
Status: Translation complete
System update date: 2026-04-01 15:25:03.868581
Title: Aligning Validation with Deployment: Target-Weighted Cross-Validation for Spatial Prediction
Authors: Alexander Brenning, Thomas Suesse
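Abstract summary: Cross-validation (CV) is commonly used to estimate predictive risk when independent test data are unavailable. In spatial prediction and other settings with structured data, the assumption that validation tasks follow the same distribution as deployment tasks is frequently violated, leading to biased estimates of deployment risk. The paper proposes Target-Weighted CV (TWCV), an estimator of deployment risk that accounts for discrepancies between validation and deployment task distributions.
Reference score (independently computed attention score): 45.94145742195786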
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Cross-validation (CV) is commonly used to estimate predictive risk when independent test data are unavailable. Its validity depends on the assumption that validation tasks are sampled from the same distribution as prediction tasks encountered during deployment. In spatial prediction and other settings with structured data, this assumption is frequently violated, leading to biased estimates of deployment risk. We propose Target-Weighted CV (TWCV), an estimator of deployment risk that accounts for discrepancies between validation and deployment task distributions, thus accounting for (1) covariate shift and (2) task-difficulty shift. We characterize prediction tasks by descriptors such as covariates and spatial configuration. TWCV assigns weights to validation losses such that the weighted empirical distribution of validation tasks matches the corresponding distribution over a target domain. The weights are obtained via calibration weighting, yielding an importance-weighted estimator that targets deployment risk. Since TWCV requires adequate coverage of the deployment distribution's support, we combine it with spatially buffered resampling that diversifies the task difficulty distribution. In a simulation study, conventional as well as spatial estimators exhibit substantial bias depending on sampling, whereas buffered TWCV remains approximately unbiased across scenarios. A case study in environmental pollution mapping further confirms that discrepancies between validation and deployment task distributions can affect performance assessment, and that buffered TWCV better reflects the prediction task over the target domain. These results establish task distribution mismatch as a primary source of CV bias in spatial prediction and show that calibration weighting combined with a suitable validation task generator provides a viable approach to estimating predictive risk under dataset shift.
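The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the calibration method or the descriptors, so this example assumes a single task descriptor (distance to the nearest training point, as a stand-in for task difficulty), matches only its mean over the target domain, and uses exponential tilting as one possible form of calibration weighting. The names `tilted_weights`, `calibrate`, and `weighted_risk` and all numbers are hypothetical.

```python
import math

def tilted_weights(d, lam):
    """Normalised weights w_i proportional to exp(lam * d_i)."""
    m = max(lam * x for x in d)               # subtract the max for numerical stability
    raw = [math.exp(lam * x - m) for x in d]
    total = sum(raw)
    return [r / total for r in raw]

def calibrate(d_val, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Bisect on lam so that the weighted mean of d_val equals target_mean.

    The weighted mean is non-decreasing in lam, so bisection applies.
    The target must lie strictly inside the range of d_val, which is a
    one-dimensional version of the coverage requirement. The search
    interval [lo, hi] is an assumption suited to descriptors of order 1.
    """
    if not (min(d_val) < target_mean < max(d_val)):
        raise ValueError("target mean lies outside the validation descriptor range")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        w = tilted_weights(d_val, mid)
        mean = sum(wi * di for wi, di in zip(w, d_val))
        if mean < target_mean:
            lo = mid
        else:
            hi = mid
    return tilted_weights(d_val, 0.5 * (lo + hi))

def weighted_risk(losses, weights):
    """Importance-weighted estimate of risk from per-task validation losses."""
    return sum(w * l for w, l in zip(weights, losses))

# Toy data (made up): validation tasks are mostly "easy" (short distances),
# while the target domain has a larger mean distance.
d_val = [0.5, 1.0, 1.5, 2.0, 4.0, 6.0]        # descriptor of each validation task
losses = [0.1, 0.2, 0.3, 0.4, 0.8, 1.2]       # validation loss of each task
target_mean = 3.5                              # mean descriptor over the target domain

weights = calibrate(d_val, target_mean)
naive = sum(losses) / len(losses)              # unweighted CV estimate
twcv = weighted_risk(losses, weights)          # target-weighted estimate
print(naive, twcv)
```

In this toy setup the loss grows linearly with distance, so the unweighted estimate comes out at about 0.5 while the weighted estimate comes out at about 0.7, reflecting the harder tasks in the target domain. If the target mean fell outside the range of validation descriptors, no positive weights could match it, which is the coverage problem the abstract addresses with spatially buffered resampling.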


Related papers