AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval
- URL: http://arxiv.org/abs/2310.01880v2
- Date: Thu, 18 Apr 2024 19:41:23 GMT
- Title: AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval
- Authors: Qi Yan, Raihan Seraj, Jiawei He, Lili Meng, Tristan Sylvain
- Abstract summary: We introduce AutoCast++, a zero-shot ranking-based context retrieval system.
Our approach first re-ranks articles based on zero-shot question-passage relevance, homing in on semantically pertinent news.
We conduct both the relevance evaluation and article summarization without needing domain-specific training.
- Score: 9.357912396498142
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine-based prediction of real-world events is garnering attention due to its potential for informed decision-making. Whereas traditional forecasting predominantly hinges on structured data like time-series, recent breakthroughs in language models enable predictions using unstructured text. In particular, Zou et al. (2022) unveil AutoCast, a new benchmark that employs news articles for answering forecasting queries. Nevertheless, existing methods still trail behind human performance. The cornerstone of accurate forecasting, we argue, lies in identifying a concise yet rich subset of news snippets from a vast corpus. With this motivation, we introduce AutoCast++, a zero-shot ranking-based context retrieval system, tailored to sift through expansive news document collections for event forecasting. Our approach first re-ranks articles based on zero-shot question-passage relevance, homing in on semantically pertinent news. Following this, the chosen articles are subjected to zero-shot summarization to attain succinct context. Leveraging a pre-trained language model, we conduct both the relevance evaluation and article summarization without needing domain-specific training. Notably, recent articles can sometimes be at odds with preceding ones due to new facts or unanticipated incidents, leading to fluctuating temporal dynamics. To tackle this, our re-ranking mechanism gives preference to more recent articles, and we further regularize the multi-passage representation learning to align with human forecaster responses made on different dates. Empirical results underscore marked improvements across multiple metrics, improving the performance for multiple-choice questions (MCQ) by 48% and true/false (TF) questions by up to 8%. Code is available at https://github.com/BorealisAI/Autocast-plus-plus.
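The re-ranking idea in the abstract (score each article for relevance to the question, then favor more recent articles) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Article` class, the precomputed `relevance` field (standing in for a zero-shot score from a pre-trained language model), the exponential half-life decay, and the multiplicative combination of the two scores are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    title: str
    published: date
    # Hypothetical stand-in for a zero-shot question-passage relevance
    # score in [0, 1]; the paper obtains relevance from a pre-trained LM.
    relevance: float


def recency_weight(published: date, reference: date, half_life_days: float = 30.0) -> float:
    """Assumed exponential decay: the weight halves every `half_life_days`."""
    age_days = max((reference - published).days, 0)
    return 0.5 ** (age_days / half_life_days)


def rerank(articles: list[Article], reference: date, top_k: int = 2) -> list[Article]:
    """Sort by relevance times recency weight and keep the top_k articles."""
    ranked = sorted(
        articles,
        key=lambda a: a.relevance * recency_weight(a.published, reference),
        reverse=True,
    )
    return ranked[:top_k]


# Toy data: A is highly relevant but 60 days old, B is fairly relevant
# and one day old, C is barely relevant and published on the reference date.
articles = [
    Article("A", date(2022, 1, 1), 0.9),
    Article("B", date(2022, 3, 1), 0.6),
    Article("C", date(2022, 3, 2), 0.2),
]
top = rerank(articles, reference=date(2022, 3, 2), top_k=2)
print([a.title for a in top])
```

With these toy numbers the fresher article B overtakes the older but more relevant article A, which is the kind of trade-off a recency preference introduces. The half-life is a free parameter in this sketch and would need tuning in any real use.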
Related papers
- Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a time series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context.
We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters.
Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
- From News to Forecast: Integrating Event Analysis in LLM-Based Time Series Forecasting with Reflection [16.47323362700347]
We introduce a novel approach to enhance time series forecasting by reasoning across both text and time series data.
With language as a medium, our method adaptively integrates social events into forecasting models, aligning news content with time series fluctuations to provide richer insights.
Specifically, we utilize LLM-based agents to iteratively filter out irrelevant news and employ human-like reasoning to evaluate predictions.
arXiv Detail & Related papers (2024-09-26T03:50:22Z)
- Posterior Sampling via Autoregressive Generation [11.713451719120707]
We propose a new framework for learning bandit algorithms from massive historical data.
We use historical data to pretrain an autoregressive model to predict a sequence of repeated feedback/rewards.
At decision-time, we autoregressively sample (impute) an imagined sequence of rewards for each action, and choose the action with the largest average imputed reward.
arXiv Detail & Related papers (2024-05-29T19:24:44Z)
- Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse [73.65112477688353]
Recent studies have highlighted a phenomenon in large language models known as "the reversal curse".
We contend that the reversal curse is partially a result of specific model training objectives.
We propose a novel training method, BI Casual language modeling Optimization (BICO), designed to mitigate the reversal curse.
arXiv Detail & Related papers (2023-11-13T17:01:12Z)
- A Generative Approach for Script Event Prediction via Contrastive Fine-tuning [35.87615178251874]
Script event prediction aims to predict the subsequent event given the context.
Recent works have attempted to improve event correlation reasoning by using pretrained language models and incorporating external knowledge.
We propose a novel generative approach for this task, in which a pretrained language model is fine-tuned with an event-centric pretraining objective.
arXiv Detail & Related papers (2022-12-07T07:32:47Z)
- PromptCast: A New Prompt-based Learning Paradigm for Time Series Forecasting [11.670324826998968]
In existing time series forecasting methods, the models take a sequence of numerical values as input and yield numerical values as output.
Inspired by the successes of pre-trained language foundation models, we propose a new forecasting paradigm: prompt-based time series forecasting.
In this novel task, the numerical input and output are transformed into prompts and the forecasting task is framed in a sentence-to-sentence manner.
arXiv Detail & Related papers (2022-09-20T10:15:35Z)
- Forecasting Future World Events with Neural Networks [68.43460909545063]
Autocast is a dataset containing thousands of forecasting questions and an accompanying news corpus.
The news corpus is organized by date, allowing us to precisely simulate the conditions under which humans made past forecasts.
We test language models on our forecasting task and find that performance is far below a human expert baseline.
arXiv Detail & Related papers (2022-06-30T17:59:14Z)
- A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting aspect terms and categories and predicting their corresponding polarities.
We propose to reformulate the extraction and prediction tasks into the sequence generation task, using a generative language model with unidirectional attention.
Our approach outperforms the previous state-of-the-art (based on BERT) on average performance by a large margin in few-shot and full-shot settings.
arXiv Detail & Related papers (2022-04-11T18:31:53Z)
- A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video.
Recent studies have found that current benchmark datasets may have obvious moment annotation biases.
We introduce a new evaluation metric "dR@n,IoU@m" that discounts the basic recall scores to alleviate the inflated evaluation caused by biased datasets.
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
- Complex Event Forecasting with Prediction Suffix Trees: Extended Technical Report [70.7321040534471]
Complex Event Recognition (CER) systems have become popular in the past two decades due to their ability to "instantly" detect patterns on real-time streams of events.
There is a lack of methods for forecasting when a pattern might occur before such an occurrence is actually detected by a CER engine.
We present a formal framework that attempts to address the issue of Complex Event Forecasting.
arXiv Detail & Related papers (2021-09-01T09:52:31Z)