CAAL: Confidence-Aware Active Learning for Heteroscedastic Atmospheric Regression
- URL: http://arxiv.org/abs/2602.11825v1
- Date: Thu, 12 Feb 2026 11:09:58 GMT
- Title: CAAL: Confidence-Aware Active Learning for Heteroscedastic Atmospheric Regression
- Authors: Fei Jiang, Jiyang Xia, Junjie Yu, Mingfei Sun, Hugh Coe, David Topping, Dantong Liu, Zhenhui Jessie Li, Zhonghua Zheng
- Abstract summary: Quantifying the impacts of air pollution on health and climate relies on key atmospheric particle properties such as toxicity and hygroscopicity. These properties typically require complex observational techniques or expensive particle-resolved numerical simulations. We propose a confidence-aware active learning framework (CAAL) for efficient and robust sample selection.
- Score: 7.951744148676244
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Quantifying the impacts of air pollution on health and climate relies on key atmospheric particle properties such as toxicity and hygroscopicity. However, these properties typically require complex observational techniques or expensive particle-resolved numerical simulations, limiting the availability of labeled data. We therefore estimate these hard-to-measure particle properties from routinely available observations (e.g., air pollutant concentrations and meteorological conditions). Because routine observations only indirectly reflect particle composition and structure, the mapping from routine observations to particle properties is noisy and input-dependent, yielding a heteroscedastic regression setting. With a limited and costly labeling budget, the central challenge is to select which samples to measure or simulate. While active learning is a natural approach, most acquisition strategies rely on predictive uncertainty. Under heteroscedastic noise, this signal conflates reducible epistemic uncertainty with irreducible aleatoric uncertainty, causing limited budgets to be wasted in noise-dominated regions. To address this challenge, we propose a confidence-aware active learning framework (CAAL) for efficient and robust sample selection in heteroscedastic settings. CAAL consists of two components: a decoupled uncertainty-aware training objective that separately optimises the predictive mean and noise level to stabilise uncertainty estimation, and a confidence-aware acquisition function that dynamically weights epistemic uncertainty using predicted aleatoric uncertainty as a reliability signal. Experiments on particle-resolved numerical simulations and real atmospheric observations show that CAAL consistently outperforms standard AL baselines. The proposed framework provides a practical and general solution for the efficient expansion of high-cost atmospheric particle property databases.
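The abstract describes the acquisition idea only in words, so the following is a minimal sketch of what weighting epistemic uncertainty by predicted aleatoric uncertainty could look like, not the authors' implementation. It assumes an ensemble of models that each output a predictive mean and a noise variance per candidate. The function names, the `temperature` parameter, and the specific weighting `1 / (1 + aleatoric / temperature)` are all hypothetical choices made for illustration; the paper's actual acquisition function may differ.

```python
import numpy as np


def decompose_uncertainty(means, variances):
    """Split ensemble outputs into epistemic and aleatoric parts.

    means, variances: arrays of shape (n_members, n_candidates), holding each
    ensemble member's predicted mean and predicted noise variance.
    """
    epistemic = means.var(axis=0)       # disagreement between ensemble members
    aleatoric = variances.mean(axis=0)  # average predicted noise variance
    return epistemic, aleatoric


def confidence_weighted_score(epistemic, aleatoric, temperature=1.0):
    """Down-weight epistemic uncertainty where predicted noise is high.

    The weighting below is an assumption for this sketch: confidence lies in
    (0, 1] and shrinks as the predicted aleatoric variance grows.
    """
    confidence = 1.0 / (1.0 + aleatoric / temperature)
    return confidence * epistemic


def select_batch(means, variances, batch_size, temperature=1.0):
    """Return indices of the highest-scoring candidates, best first."""
    epistemic, aleatoric = decompose_uncertainty(means, variances)
    scores = confidence_weighted_score(epistemic, aleatoric, temperature)
    return np.argsort(-scores)[:batch_size]


# Toy pool: two ensemble members, three candidates.
# Candidate 0 has the largest disagreement but sits in a very noisy region.
means = np.array([[0.0, 0.0, 1.0],
                  [2.0, 1.0, 1.0]])
variances = np.array([[9.0, 0.0, 0.5],
                      [9.0, 0.0, 0.5]])

picked = select_batch(means, variances, batch_size=2)
print(picked.tolist())  # → [1, 0]
```

In this toy pool, ranking by epistemic variance alone would pick candidate 0 first. The weighted score prefers candidate 1, whose smaller disagreement lies in a low-noise region, which mirrors the abstract's concern about spending budget in noise-dominated regions.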
Related papers
- Learning Complex Physical Regimes via Coverage-oriented Uncertainty Quantification: An application to the Critical Heat Flux [0.0]
Uncertainty quantification (UQ) should not be viewed as a safety assessment, but as a support to the learning task itself. We focus on the Critical Heat Flux benchmark and dataset presented by the OECD/NEA Expert Group on Reactor Systems Multi-Physics. We show that while post-hoc methods ensure statistical calibration, coverage-oriented learning effectively reshapes the model's representation to match the complex physical regimes.
arXiv Detail & Related papers (2026-02-25T09:04:15Z)
- Equivariant Evidential Deep Learning for Interatomic Potentials [55.6997213490859]
Uncertainty quantification is critical for assessing the reliability of machine learning interatomic potentials in molecular dynamics simulations. Existing UQ approaches for MLIPs are often limited by high computational cost or suboptimal performance. We propose Equivariant Evidential Deep Learning for Interatomic Potentials ($\text{e}^2$IP), a backbone-agnostic framework that models atomic forces and their uncertainty jointly.
arXiv Detail & Related papers (2026-02-11T02:00:25Z)
- Noise-Robust Tiny Object Localization with Flows [63.60972031108944]
We propose a noise-robust localization framework leveraging normalizing flows for flexible error modeling and uncertainty-guided optimization. Our method captures complex, non-Gaussian prediction distributions through flow-based error modeling, enabling robust learning under noisy supervision. An uncertainty-aware gradient modulation mechanism further suppresses learning from high-uncertainty, noise-prone samples, mitigating overfitting while stabilizing training.
arXiv Detail & Related papers (2026-01-02T09:16:55Z)
- Deep classifier kriging for probabilistic spatial prediction of air quality index [16.289713160499385]
Deep classifier kriging (DCK) is a flexible, distribution-free deep learning framework for estimating full predictive distribution functions. We show that DCK consistently outperforms conventional approaches in predictive accuracy and uncertainty quantification.
arXiv Detail & Related papers (2025-12-29T13:58:34Z)
- Calibrating Geophysical Predictions under Constrained Probabilistic Distributions [4.760743517243988]
We introduce a calibration algorithm based on normalization and the Kernelized Stein Discrepancy (KSD) to enhance machine learning predictions. This not only sharpens pointwise predictions but also enforces consistency with non-local statistical structures rooted in physical principles.
arXiv Detail & Related papers (2025-11-28T07:15:40Z)
- Penalized Empirical Likelihood for Doubly Robust Causal Inference under Contamination in High Dimensions [0.720409153108429]
We propose a doubly robust estimator for the average treatment effect in low sample size equations. We show that the proposed confidence intervals remain efficient compared to those of competing estimators.
arXiv Detail & Related papers (2025-07-23T11:58:54Z)
- Discovering Governing Equations in the Presence of Uncertainty [11.752763800308276]
In this work, we theorize that accounting for system variability together with measurement noise is the key to consistently discovering the governing equations underlying dynamical systems. We show that SIP consistently identifies the correct equations by an average of 82% relative to the Sparse Identification of Nonlinear Dynamics (SINDy) approach and its variant.
arXiv Detail & Related papers (2025-07-13T18:31:25Z)
- Smooth Sailing: Lipschitz-Driven Uncertainty Quantification for Spatial Association [15.706262708190643]
We show that existing methods for constructing confidence intervals for associations can fail to provide nominal coverage in the face of model misspecification and nonrandom locations. We introduce a method that constructs valid frequentist confidence intervals for associations in spatial settings. Our approach is the first to guarantee nominal coverage in this setting and outperforms existing techniques in both real and simulated experiments.
arXiv Detail & Related papers (2025-02-09T23:20:03Z)
- Regulating Model Reliance on Non-Robust Features by Smoothing Input Marginal Density [93.32594873253534]
Trustworthy machine learning requires meticulous regulation of model reliance on non-robust features.
We propose a framework to delineate and regulate such features by attributing model predictions to the input.
arXiv Detail & Related papers (2024-07-05T09:16:56Z)
- Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process with Uncertainty Quantification [59.81904428056924]
We introduce SMASH: a Score MAtching estimator for learning marked spatio-temporal point processes (STPPs) with uncertainty quantification.
Specifically, our framework adopts a normalization-free objective by estimating the pseudolikelihood of marked STPPs through score-matching.
The superior performance of our proposed framework is demonstrated through extensive experiments in both event prediction and uncertainty quantification.
arXiv Detail & Related papers (2023-10-25T02:37:51Z)
- Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z)
- Localization Uncertainty Estimation for Anchor-Free Object Detection [48.931731695431374]
There are several limitations of the existing uncertainty estimation methods for anchor-based object detection.
We propose a new localization uncertainty estimation method called UAD for anchor-free object detection.
Our method captures the uncertainty in four directions of box offsets that are homogeneous, so that it can tell which direction is uncertain.
arXiv Detail & Related papers (2020-06-28T13:49:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.