ForecastQA: A Question Answering Challenge for Event Forecasting with
Temporal Text Data
- URL: http://arxiv.org/abs/2005.00792v4
- Date: Tue, 8 Jun 2021 02:54:15 GMT
- Title: ForecastQA: A Question Answering Challenge for Event Forecasting with
Temporal Text Data
- Authors: Woojeong Jin, Rahul Khanna, Suji Kim, Dong-Ho Lee, Fred Morstatter,
Aram Galstyan, Xiang Ren
- Abstract summary: Event forecasting is a challenging, yet important task, as humans seek to constantly plan for the future.
We formulate a task, construct a dataset, and provide benchmarks for developing methods for event forecasting with large volumes of unstructured text data.
We present our experiments on ForecastQA using BERT-based models and find that our best model achieves 60.1% accuracy on the dataset, which still lags behind human performance by about 19%.
- Score: 43.400630267599084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Event forecasting is a challenging, yet important task, as humans seek to
constantly plan for the future. Existing automated forecasting studies rely
mostly on structured data, such as time-series or event-based knowledge graphs,
to help predict future events. In this work, we aim to formulate a task,
construct a dataset, and provide benchmarks for developing methods for event
forecasting with large volumes of unstructured text data. To simulate the
forecasting scenario on temporal news documents, we formulate the problem as a
restricted-domain, multiple-choice, question-answering (QA) task. Unlike
existing QA tasks, our task limits accessible information, and thus a model has
to make a forecasting judgement. To showcase the usefulness of this task
formulation, we introduce ForecastQA, a question-answering dataset consisting
of 10,392 event forecasting questions, which have been collected and verified
via crowdsourcing efforts. We present our experiments on ForecastQA using
BERT-based models and find that our best model achieves 60.1% accuracy on the
dataset, which still lags behind human performance by about 19%. We hope
ForecastQA will support future research efforts in bridging this gap.
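To make the multiple-choice formulation concrete, here is a minimal sketch, assuming the Hugging Face transformers library, of how a BERT-based model could score one forecasting question. The example question, choices, and field names are hypothetical and not drawn from ForecastQA.

```python
# Minimal sketch: scoring a hypothetical ForecastQA-style question with a
# BERT-based multiple-choice model (Hugging Face transformers). The instance
# below and its field names are illustrative only, not the dataset's schema.
import torch
from transformers import BertTokenizer, BertForMultipleChoice

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMultipleChoice.from_pretrained("bert-base-uncased")
model.eval()

# Each question carries a timestamp; per the task formulation, a model may only
# consult news published before that date, which turns QA into forecasting.
example = {
    "question": "Will country X sign the trade agreement by the end of 2019?",
    "timestamp": "2019-06-01",
    "choices": ["yes", "no", "only a partial agreement", "talks will be postponed"],
}

# BertForMultipleChoice expects inputs of shape (batch_size, num_choices, seq_len),
# so each (question, choice) pair is encoded and stacked along a choices dimension.
encoding = tokenizer(
    [example["question"]] * len(example["choices"]),
    example["choices"],
    padding=True,
    return_tensors="pt",
)
inputs = {k: v.unsqueeze(0) for k, v in encoding.items()}

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, num_choices)

print(example["choices"][logits.argmax(dim=-1).item()])
```

In practice the multiple-choice head above would first be fine-tuned on the training split, and any retrieved evidence would be restricted to articles dated before the question's timestamp.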
Related papers
- One VLM to Keep it Learning: Generation and Balancing for Data-free Continual Visual Question Answering [31.025439143093585]
Vision-Language Models (VLMs) have shown significant promise in Visual Question Answering (VQA) tasks by leveraging web-scale multimodal datasets.
These models often struggle with continual learning due to catastrophic forgetting when adapting to new tasks.
We propose the first data-free method that leverages the language generation capability of a VLM, instead of relying on external models.
arXiv Detail & Related papers (2024-11-04T16:04:59Z)
- GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation [90.53485251837235]
GIFT-Eval is a pioneering benchmark aimed at promoting evaluation across diverse datasets.
GIFT-Eval encompasses 28 datasets over 144,000 time series and 177 million data points.
We also provide a non-leaking pretraining dataset containing approximately 230 billion data points.
arXiv Detail & Related papers (2024-10-14T11:29:38Z)
- F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improves generalization, and the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z)
- SCTc-TE: A Comprehensive Formulation and Benchmark for Temporal Event Forecasting [63.01035584154509]
We develop a fully automated pipeline and construct a large-scale dataset named MidEast-TE from about 0.6 million news articles.
This dataset focuses on cooperation and conflict events among countries, mainly in the Middle East region, from 2015 to 2022.
We propose a novel method, LoGo, that takes advantage of both Local and Global contexts for SCTc-TE forecasting.
arXiv Detail & Related papers (2023-12-02T07:40:21Z)
- Forecasting Future World Events with Neural Networks [68.43460909545063]
Autocast is a dataset containing thousands of forecasting questions and an accompanying news corpus.
The news corpus is organized by date, allowing us to precisely simulate the conditions under which humans made past forecasts.
We test language models on our forecasting task and find that performance is far below a human expert baseline.
arXiv Detail & Related papers (2022-06-30T17:59:14Z)
- CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training [21.07506671340319]
In this paper, we propose a novel question-answering dataset based on the Common Crawl project.
We extract around 130 million multilingual question-answer pairs, including about 60 million English data points.
With this unprecedented number of natural QA pairs, we pre-train popular language models to show the potential of large-scale in-domain pre-training for question answering.
arXiv Detail & Related papers (2021-10-14T21:23:01Z)
- Few-shot Learning for Time-series Forecasting [40.58524521473793]
We propose a few-shot learning method that forecasts a future value of a time series in a target task given only a few time series from that task.
Our model is trained on time-series data from multiple training tasks that differ from the target tasks.
arXiv Detail & Related papers (2020-09-30T01:32:22Z)
- Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering [98.48363619128108]
We propose an unsupervised approach to training QA models with generated pseudo-training data.
We show that generating questions for QA training by applying a simple template to a related, retrieved sentence, rather than to the original context sentence, improves downstream QA performance (a toy sketch of the template idea follows below).
arXiv Detail & Related papers (2020-04-24T17:57:45Z)
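As a rough, hypothetical illustration of the template idea in the last entry above, the sketch below turns a retrieved sentence into a pseudo question-answer pair using one hand-written template; the span-picking heuristic and the template itself are stand-ins, not the paper's actual procedure.

```python
import re

def make_pseudo_qa(retrieved_sentence: str):
    """Create a pseudo (question, answer) pair from a retrieved sentence by
    masking one candidate answer span and applying a fixed template.
    Both the span heuristic and the template are hypothetical stand-ins."""
    # Naive answer candidate: the first capitalized multi-word span (e.g. a proper name).
    match = re.search(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b", retrieved_sentence)
    if match is None:
        return None
    answer = match.group(0)
    cloze = retrieved_sentence.replace(answer, "___").rstrip(".")
    question = f"Fill in the blank: {cloze}."
    return question, answer

# The generated pair would then be matched with a related passage that also
# mentions the answer, rather than with the sentence it was derived from.
print(make_pseudo_qa("New Horizons flew past Pluto in July 2015."))
# -> ('Fill in the blank: ___ flew past Pluto in July 2015.', 'New Horizons')
```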