Reasoning-Enhanced Rare-Event Prediction with Balanced Outcome Correction
- URL: http://arxiv.org/abs/2601.16406v1
- Date: Fri, 23 Jan 2026 02:34:29 GMT
- Title: Reasoning-Enhanced Rare-Event Prediction with Balanced Outcome Correction
- Authors: Vitaly Bulgakov, Alexander Turchin
- Abstract summary: We propose LPCORP (Low-Prevalence CORrector for Prediction)*, a two-stage framework that combines reasoning-enhanced prediction with confidence-based outcome correction. We evaluate LPCORP on real-world datasets from medical and consumer service domains.
- Score: 45.88028371034407
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rare-event prediction is critical in domains such as healthcare, finance, reliability engineering, customer support, and aviation safety, where positive outcomes are infrequent yet potentially catastrophic. Extreme class imbalance biases conventional models toward majority-class predictions, limiting recall, calibration, and operational usefulness. We propose LPCORP (Low-Prevalence CORrector for Prediction)*, a two-stage framework that combines reasoning-enhanced prediction with confidence-based outcome correction. A reasoning model first produces enriched predictions from narrative inputs, after which a lightweight logistic-regression classifier evaluates and selectively corrects these outputs to mitigate prevalence-driven bias. We evaluate LPCORP on real-world datasets from medical and consumer service domains. The results show that this method transforms a highly imbalanced setting into a well-balanced one while preserving the original number of samples and without applying any resampling strategies. Test-set evaluation demonstrates substantially improved performance, particularly in precision, which is a known weakness in low-prevalence data. We further provide a cost-reduction analysis that compares the expenses of rare-event damage control without preventive measures to those incurred when low-cost, prediction-based preventive interventions are applied; the analysis shows a reduction of more than 50% in some cases. * Patent pending: U.S. Provisional 63/933,518, filed 8 December 2025.
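The cost-reduction comparison mentioned in the abstract can be illustrated with a small calculation. The sketch below is not the authors' cost model, which the abstract does not specify. It assumes a hypothetical flat damage cost per rare event, a flat cost per preventive intervention, and an `effectiveness` fraction of correctly flagged events that the intervention averts. All names and numbers are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts for a binary rare-event predictor."""
    tp: int  # rare events correctly flagged
    fp: int  # non-events flagged by mistake
    fn: int  # rare events missed
    tn: int  # non-events correctly left alone


def cost_without_prevention(c: Confusion, damage_cost: float) -> float:
    """Baseline: every rare event (tp + fn) incurs the full damage cost."""
    return (c.tp + c.fn) * damage_cost


def cost_with_prevention(c: Confusion, damage_cost: float,
                         intervention_cost: float,
                         effectiveness: float = 1.0) -> float:
    """Every flagged case (tp + fp) pays for an intervention; only the
    fraction `effectiveness` of true positives is actually averted."""
    flagged = c.tp + c.fp
    averted = c.tp * effectiveness
    remaining_events = (c.tp + c.fn) - averted
    return flagged * intervention_cost + remaining_events * damage_cost


def relative_reduction(c: Confusion, damage_cost: float,
                       intervention_cost: float,
                       effectiveness: float = 1.0) -> float:
    """Fraction of the baseline cost saved by prediction-based prevention."""
    baseline = cost_without_prevention(c, damage_cost)
    if baseline == 0:
        return 0.0
    with_prevention = cost_with_prevention(
        c, damage_cost, intervention_cost, effectiveness)
    return 1.0 - with_prevention / baseline


# Illustrative numbers only: 1,000 cases with 5% prevalence.
example = Confusion(tp=40, fp=20, fn=10, tn=930)
reduction = relative_reduction(example, damage_cost=1000.0,
                               intervention_cost=50.0, effectiveness=0.8)
print(f"{reduction:.0%}")  # prints 58%
```

Under this assumed model, false positives only add intervention cost, so higher precision directly lowers the cost of prevention, while missed events still incur the full damage cost.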
Related papers
- Observationally Informed Adaptive Causal Experimental Design [55.998153710215654]
We propose Active Residual Learning, a new paradigm that leverages the observational model as a foundational prior. This approach shifts the experimental focus from learning target causal quantities from scratch to efficiently estimating the residuals required to correct observational bias. Experiments on synthetic and semi-synthetic benchmarks demonstrate that R-Design significantly outperforms baselines.
arXiv Detail & Related papers (2026-03-04T06:52:37Z) - Adaptive-CaRe: Adaptive Causal Regularization for Robust Outcome Prediction [16.391352325575763]
Supervised machine learning algorithms are commonly used for outcome prediction in the medical domain. We propose a novel model-agnostic regularization strategy, Adaptive-CaRe, for generalized outcome prediction in the medical domain.
arXiv Detail & Related papers (2026-02-06T11:14:03Z) - Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning [53.42244686183879]
Conformal prediction provides model-agnostic and distribution-free uncertainty quantification. Yet, conformal prediction is not reliable under poisoning attacks where adversaries manipulate both training and calibration data. We propose reliable prediction sets (RPS): the first efficient method for constructing conformal prediction sets with provable reliability guarantees under poisoning.
arXiv Detail & Related papers (2024-10-13T15:37:11Z) - Achieving Fairness in Predictive Process Analytics via Adversarial Learning [50.31323204077591]
This paper addresses the challenge of integrating a debiasing phase into predictive business process analytics.
Our framework, which leverages adversarial debiasing, is evaluated on four case studies, showing a significant reduction in the contribution of biased variables to the predicted value.
arXiv Detail & Related papers (2024-10-03T15:56:03Z) - Imputation for prediction: beware of diminishing returns [12.424671213282256]
Missing values are prevalent across various fields, posing challenges for training and deploying predictive models. Recent theoretical and empirical studies indicate that simple constant imputation can be consistent and competitive. This study aims at clarifying if and when investing in advanced imputation methods yields significantly better predictions.
arXiv Detail & Related papers (2024-07-29T09:01:06Z) - Uncertainty Calibration for Counterfactual Propensity Estimation in Recommendation [22.67361489565711]
Post-click conversion rate (CVR) is a reliable indicator of online customers' preferences. We introduce a model-agnostic calibration framework for propensity-based debiasing of CVR predictions.
arXiv Detail & Related papers (2023-03-23T00:42:48Z) - Learning to Predict Trustworthiness with Steep Slope Loss [69.40817968905495]
We study the problem of predicting trustworthiness on real-world large-scale datasets.
We observe that trustworthiness predictors trained with prior-art loss functions are prone to regard both correct and incorrect predictions as trustworthy.
We propose a novel steep slope loss to separate the features w.r.t. correct predictions from the ones w.r.t. incorrect predictions by two slide-like curves that oppose each other.
arXiv Detail & Related papers (2021-09-30T19:19:09Z) - Uncertainty-Aware Training for Cardiac Resynchronisation Therapy Response Prediction [3.090173647095682]
Quantifying uncertainty of a prediction is one way to provide such interpretability and promote trust.
We quantify the data (aleatoric) and model (epistemic) uncertainty of a DL model for Cardiac Resynchronisation Therapy response prediction from cardiac magnetic resonance images.
We perform a preliminary investigation of an uncertainty-aware loss function that can be used to retrain an existing DL image-based classification model to encourage confidence in correct predictions.
arXiv Detail & Related papers (2021-09-22T10:37:50Z) - Prediction-Coherent LSTM-based Recurrent Neural Network for Safer Glucose Predictions in Diabetic People [4.692400531340393]
We propose an LSTM-based recurrent neural network architecture and loss function that enhance the stability of predictions.
The study is conducted on people with type 1 and type 2 diabetes, with a focus on predictions made 30 minutes ahead of time.
arXiv Detail & Related papers (2020-09-08T13:14:08Z) - Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, e.g., drug development, risk profiling, and clinical trials.
We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
arXiv Detail & Related papers (2020-06-14T01:15:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.