Related papers: LLMs Can Teach Themselves to Better Predict the Future

URL: http://arxiv.org/abs/2502.05253v1
Date: Fri, 07 Feb 2025 17:21:16 GMT
Title: LLMs Can Teach Themselves to Better Predict the Future
Authors: Benjamin Turtel, Danny Franklin, Philipp Schoenegger
Abstract summary: We present an outcome-driven fine-tuning framework that enhances the forecasting capabilities of large language models. We generate pairs of diverse reasoning trajectories and probabilistic forecasts for a set of diverse questions. We then rank pairs of these reasoning traces by their distance to the actual outcomes before fine-tuning the model.
Score: 1.0923877073891446
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present an outcome-driven fine-tuning framework that enhances the forecasting capabilities of large language models (LLMs) without relying on human-curated reasoning samples. Our method leverages model self-play to generate pairs of diverse reasoning trajectories and probabilistic forecasts for a set of diverse questions that resolve after the models' knowledge cutoff date. We then rank pairs of these reasoning traces by their distance to the actual outcomes before fine-tuning the model via Direct Preference Optimization (DPO). On a separate test set, our approach increases prediction accuracy of Phi-4 14B and DeepSeek-R1 14B by 7-10% over a base model and a DPO fine-tuned control model with randomized labels, bringing them on par with forecasting capabilities of much larger frontier models like GPT-4o.
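
The ranking step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Forecast` class, the `build_preference_pair` and `dpo_loss` helpers, the use of absolute error as the "distance to the actual outcome", and the rule of dropping ties are all assumptions made for illustration. The loss function is the standard per-pair DPO objective, written for scalar sequence log-probabilities.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Forecast:
    """One self-play sample (hypothetical container, not from the paper)."""
    reasoning: str      # the model's reasoning trace
    probability: float  # stated probability that the question resolves "yes"


def build_preference_pair(
    a: Forecast, b: Forecast, outcome: int
) -> Optional[Tuple[Forecast, Forecast]]:
    """Return (chosen, rejected), ranked by absolute distance to the outcome.

    `outcome` is 1 if the question resolved "yes", else 0.
    Ties carry no preference signal, so they return None (an assumption).
    """
    err_a = abs(a.probability - outcome)
    err_b = abs(b.probability - outcome)
    if err_a == err_b:
        return None
    return (a, b) if err_a < err_b else (b, a)


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin))."""
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    x = beta * margin
    # Numerically stable softplus(-x), which equals -log(sigmoid(x)).
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))
```

For a question that resolved "yes" (`outcome=1`), a forecast of 0.8 would be chosen over a forecast of 0.3; the resulting pairs are what a DPO trainer would consume as chosen and rejected completions.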

Related papers

Uncertainty-Guided Enhancement on Driving Perception System via Foundation Models [37.35848849961951]
We develop a method that leverages foundation models to refine predictions from existing driving perception models. The method demonstrates a 10 to 15 percent improvement in prediction accuracy and reduces the number of queries to the foundation model by 50 percent.
arXiv Detail & Related papers (2024-10-02T00:46:19Z)
Optimal starting point for time series forecasting [1.9937737230710553]
We introduce a novel approach called Optimal Starting Point Time Series Forecast (OSP-TSP) for optimal forecasting. The proposed approach can determine the optimal starting point (OSP) of the time series and then enhance the prediction performances of the base forecasting models. Empirical results indicate that predictions based on the OSP-TSP approach consistently outperform those using the complete time series dataset.
arXiv Detail & Related papers (2024-09-25T11:51:00Z)
Self-Play Preference Optimization for Language Model Alignment [75.83359213697854]
Recent advancements suggest that directly working with preference probabilities can yield a more accurate reflection of human preferences. We propose a self-play-based method for language model alignment, which treats the problem as a constant-sum two-player game. Our approach, dubbed Self-Play Preference Optimization (SPPO), utilizes iterative policy updates to provably approximate the Nash equilibrium.
arXiv Detail & Related papers (2024-05-01T17:59:20Z)
Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models. We present theoretical results on the expected churn between models within the Rashomon set. We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
Human Trajectory Forecasting with Explainable Behavioral Uncertainty [63.62824628085961]
Human trajectory forecasting helps to understand and predict human behaviors, enabling applications from social robots to self-driving cars. Model-free methods offer superior prediction accuracy but lack explainability, while model-based methods provide explainability but cannot predict well. We show that BNSP-SFM achieves up to a 50% improvement in prediction accuracy, compared with 11 state-of-the-art methods.
arXiv Detail & Related papers (2023-07-04T16:45:21Z)
A positive feedback method based on F-measure value for Salient Object Detection [1.9249287163937976]
This paper proposes a positive feedback method based on F-measure value for salient object detection (SOD). Our proposed method takes an image to be detected and inputs it into several existing models to obtain their respective prediction maps. Experimental results on five publicly available datasets show that our proposed positive feedback method outperforms the latest 12 methods in five evaluation metrics for saliency map prediction.
arXiv Detail & Related papers (2023-04-28T04:05:13Z)
Prediction-Oriented Bayesian Active Learning [51.426960808684655]
Expected predictive information gain (EPIG) is an acquisition function that measures information gain in the space of predictions rather than parameters. EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models.
arXiv Detail & Related papers (2023-04-17T10:59:57Z)
Predictable MDP Abstraction for Unsupervised Model-Based RL [93.91375268580806]
We propose predictable MDP abstraction (PMA). Instead of training a predictive model on the original MDP, we train a model on a transformed MDP with a learned action space. We theoretically analyze PMA and empirically demonstrate that PMA leads to significant improvements over prior unsupervised model-based RL approaches.
arXiv Detail & Related papers (2023-02-08T07:37:51Z)
Uncertainty estimation of pedestrian future trajectory using Bayesian approximation [137.00426219455116]
Under dynamic traffic scenarios, planning based on deterministic predictions is not trustworthy. The authors propose using approximation to quantify the uncertainty during forecasting that deterministic approaches fail to capture. The effect of dropout weights and long-term prediction on future state uncertainty has been studied.
arXiv Detail & Related papers (2022-05-04T04:23:38Z)
Feature-weighted Stacking for Nonseasonal Time Series Forecasts: A Case Study of the COVID-19 Epidemic Curves [0.0]
We investigate ensembling techniques in forecasting and examine their potential for use in nonseasonal time-series. We propose using late data fusion, using a stacked ensemble of two forecasting models and two meta-features that prove their predictive power during a preliminary forecasting stage.
arXiv Detail & Related papers (2021-08-19T14:44:46Z)
Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task. The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature. We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z)
Explainable boosted linear regression for time series forecasting [0.1876920697241348]
Time series forecasting involves collecting and analyzing past observations to develop a model to extrapolate such observations into the future. We propose the explainable boosted linear regression (EBLR) algorithm for time series forecasting.
arXiv Detail & Related papers (2020-09-18T22:31:42Z)
Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions. Motivated by these theoretical results, we propose learning several approximate proposals for the best model. In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.