Related papers: Future-as-Label: Scalable Supervision from Real-World Outcomes

URL: http://arxiv.org/abs/2601.06336v2
Date: Wed, 14 Jan 2026 22:14:08 GMT
Title: Future-as-Label: Scalable Supervision from Real-World Outcomes
Authors: Benjamin Turtel, Paul Wilczewski, Danny Franklin, Kris Skothiem,
Abstract summary: Time creates free supervision: forecasts about real-world events resolve to verifiable outcomes. We extend reinforcement learning with verifiable rewards to real-world prediction over time. We train language models to make probabilistic forecasts from causally masked information.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Time creates free supervision: forecasts about real-world events resolve to verifiable outcomes. The passage of time provides labels that require no annotation. To exploit this structure, we extend reinforcement learning with verifiable rewards to real-world prediction over time. We train language models to make probabilistic forecasts from causally masked information, using proper scoring rules as the reward function once events resolve. Learning is driven entirely by realized outcomes, enabling scalable outcome-based supervision in open-world prediction. On real-world forecasting benchmarks, Qwen3-32B trained using Foresight Learning improves Brier score by 27% and halves calibration error relative to its pretrained baseline, and outperforms Qwen3-235B on both constructed future-event prediction tasks and the Metaculus benchmark despite a 7x parameter disadvantage.
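The abstract describes using proper scoring rules as the reward once events resolve, and reports results in terms of Brier score and calibration error. The sketch below illustrates those quantities for binary events only. It is not the paper's implementation: the function names, the `1 - Brier` reward shaping, and the ten-bin calibration estimate are all assumptions made for illustration.

```python
from typing import Sequence


def brier_score(prob: float, outcome: int) -> float:
    """Squared error between a forecast probability and a resolved 0/1 outcome."""
    return (prob - outcome) ** 2


def brier_reward(prob: float, outcome: int) -> float:
    """Hypothetical reward shaping (an assumption): higher is better, 1.0 is perfect."""
    return 1.0 - brier_score(prob, outcome)


def expected_brier(report: float, true_prob: float) -> float:
    """Expected Brier score of reporting `report` when the event occurs with `true_prob`."""
    return true_prob * (report - 1.0) ** 2 + (1.0 - true_prob) * report ** 2


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Binned calibration error: weighted gap between mean forecast and observed frequency."""
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, o))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        frequency = sum(o for _, o in members) / len(members)
        ece += len(members) / total * abs(mean_prob - frequency)
    return ece


if __name__ == "__main__":
    # Under a proper scoring rule, the honest report minimizes the expected score.
    candidates = [i / 10 for i in range(11)]
    best = min(candidates, key=lambda q: expected_brier(q, 0.7))
    print("best report when the true probability is 0.7:", best)
    print("reward for a 0.5 forecast on an event that occurred:", brier_reward(0.5, 1))
```

Because the Brier score is a strictly proper scoring rule, a forecaster minimizes its expected score by reporting its true belief, which is what makes it suitable as an outcome-based reward in this kind of setup.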

Related papers

Scaling Open-Ended Reasoning to Predict the Future [56.672065928345525]
We train language models to make predictions on open-ended forecasting questions. To scale up training data, we synthesize novel forecasting questions from global events reported in daily news. We find calibration improvements from forecasting training generalize across popular benchmarks.
arXiv Detail & Related papers (2025-12-31T18:59:51Z)
Neural CDEs as Correctors for Learned Time Series Models [0.0]
We propose a Predictor-Corrector mechanism where the Predictor is any learned time-series model and the Corrector is a neural controlled differential equation. The proposed Corrector works with irregularly sampled time series and continuous- and discrete-time Predictors. We evaluate our Corrector with diverse Predictors on synthetic, physics simulation, and real-world forecasting datasets.
arXiv Detail & Related papers (2025-12-13T01:17:05Z)
Improving Prediction Certainty Estimation for Reliable Early Exiting via Null Space Projection [16.838728310658105]
We propose a novel early exiting method based on the Certainty-Aware Probability (CAP) score. We show that our method can achieve an average speed-up ratio of 2.19x across all tasks with negligible performance degradation.
arXiv Detail & Related papers (2025-06-08T05:08:34Z)
Consistency Checks for Language Model Forecasters [54.62507816753479]
We measure the performance of forecasters in terms of the consistency of their predictions on different logically-related questions. We build an automated evaluation system that generates a set of base questions, instantiates consistency checks from these questions, elicits predictions of the forecaster, and measures the consistency of the predictions.
arXiv Detail & Related papers (2024-12-24T16:51:35Z)
STEMO: Early Spatio-temporal Forecasting with Multi-Objective Reinforcement Learning [11.324029387605888]
We propose an early spatio-temporal forecasting model based on multi-objective reinforcement learning. Our method demonstrates superior performance on three large-scale real-world datasets.
arXiv Detail & Related papers (2024-06-06T13:03:51Z)
Loss Shaping Constraints for Long-Term Time Series Forecasting [79.3533114027664]
We present a Constrained Learning approach for long-term time series forecasting that respects a user-defined upper bound on the loss at each time-step. We propose a practical Primal-Dual algorithm to tackle it, and demonstrate that it exhibits competitive average performance in time series benchmarks, while shaping the errors across the predicted window.
arXiv Detail & Related papers (2024-02-14T18:20:44Z)
Performative Time-Series Forecasting [64.03865043422597]
We formalize performative time-series forecasting (PeTS) from a machine-learning perspective. We propose a novel approach, Feature Performative-Shifting (FPS), which leverages the concept of delayed response to anticipate distribution shifts. We conduct comprehensive experiments using multiple time-series models on COVID-19 and traffic forecasting tasks.
arXiv Detail & Related papers (2023-10-09T18:34:29Z)
ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain. Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples. In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores. We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
arXiv Detail & Related papers (2023-02-23T18:57:14Z)
What Should I Know? Using Meta-gradient Descent for Predictive Feature Discovery in a Single Stream of Experience [63.75363908696257]
Computational reinforcement learning seeks to construct an agent's perception of the world through predictions of future sensations. An open challenge in this line of work is determining, from the infinitely many predictions that the agent could possibly make, which predictions might best support decision-making. We introduce a meta-gradient descent process by which an agent learns 1) what predictions to make, 2) the estimates for its chosen predictions, and 3) how to use those estimates to generate policies that maximize future reward.
arXiv Detail & Related papers (2022-06-13T21:31:06Z)
Learning to Predict Trustworthiness with Steep Slope Loss [69.40817968905495]
We study the problem of predicting trustworthiness on real-world large-scale datasets. We observe that the trustworthiness predictors trained with prior-art loss functions are prone to view both correct predictions and incorrect predictions as trustworthy. We propose a novel steep slope loss to separate the features w.r.t. correct predictions from the ones w.r.t. incorrect predictions by two slide-like curves that oppose each other.
arXiv Detail & Related papers (2021-09-30T19:19:09Z)
All-Clear Flare Prediction Using Interval-based Time Series Classifiers [0.21028463367241026]
An all-clear flare prediction is a type of solar flare forecasting that puts more emphasis on predicting non-flaring instances. Finding the right balance between avoiding false negatives (misses) and reducing the false positives (false alarms) is often challenging.
arXiv Detail & Related papers (2021-05-03T22:40:05Z)
Learning Prediction Intervals for Model Performance [1.433758865948252]
We propose a method to compute prediction intervals for model performance. We evaluate our approach across a wide range of drift conditions and show substantial improvement over competitive baselines.
arXiv Detail & Related papers (2020-12-15T21:32:03Z)
A framework for predicting, interpreting, and improving Learning Outcomes [0.0]
We develop an Embibe Score Quotient model (ESQ) to predict test scores based on observed academic, behavioral and test-taking features of a student. ESQ can be used to predict the future scoring potential of a student as well as offer personalized learning nudges.
arXiv Detail & Related papers (2020-10-06T11:22:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.