A Comprehensive Survey on Rare Event Prediction
- URL: http://arxiv.org/abs/2309.11356v2
- Date: Sat, 05 Oct 2024 13:12:26 GMT
- Title: A Comprehensive Survey on Rare Event Prediction
- Authors: Chathurangi Shyalika, Ruwan Wickramarachchi, Amit Sheth
- Abstract summary: Rare event prediction involves identifying and forecasting events with a low probability using machine learning (ML) and data analysis.
This paper aims to identify gaps in the current literature and highlight the challenges of predicting rare events.
- Score: 1.6385815610837167
- Abstract: Rare event prediction involves identifying and forecasting events with a low probability using machine learning (ML) and data analysis. Due to the imbalanced data distributions, where the frequency of common events vastly outweighs that of rare events, it requires using specialized methods within each step of the ML pipeline, i.e., from data processing to algorithms to evaluation protocols. Predicting the occurrences of rare events is important for real-world applications, such as Industry 4.0, and is an active research area in statistics and ML. This paper comprehensively reviews the current approaches for rare event prediction along four dimensions: rare event data, data processing, algorithmic approaches, and evaluation approaches. Specifically, we consider 73 datasets from different modalities (i.e., numerical, image, text, and audio), four major categories of data processing, five major algorithmic groupings, and two broader evaluation approaches. This paper aims to identify gaps in the current literature and highlight the challenges of predicting rare events. It also suggests potential research directions, which can help guide practitioners and researchers.
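The abstract's point that imbalanced data calls for specialized evaluation protocols can be illustrated with a minimal sketch. It is not taken from the survey: the data, the 1% event rate, and the helper names `confusion_counts` and `metrics` are hypothetical and chosen only for illustration. It compares plain accuracy with precision, recall, and F1 on a synthetic imbalanced label set.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives; label 1 is the rare event."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for the rare class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical data: 1000 samples, of which 10 (1%) are rare events.
y_true = [1] * 10 + [0] * 990

# Baseline that always predicts the common class.
always_common = [0] * 1000
baseline = metrics(y_true, always_common)

# Hypothetical detector: finds 8 of the 10 rare events, raises 20 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 970
detector_metrics = metrics(y_true, detector)

print("baseline:", baseline)
print("detector:", detector_metrics)
```

Under these assumed numbers, the always-common baseline scores higher on accuracy than the detector while finding none of the rare events, which is why recall, precision, and F1 on the rare class are the more informative quantities here.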
Related papers
- Evaluating the Role of Data Enrichment Approaches Towards Rare Event Analysis in Manufacturing [1.3980986259786223]
Rare events are occurrences that take place with a significantly lower frequency than more common regular events.
In manufacturing, predicting such events is particularly important, as they lead to unplanned downtime, shortened equipment lifespan, and high energy consumption.
This paper evaluates the role of data enrichment techniques combined with supervised machine-learning techniques for rare event detection and prediction.
arXiv Detail & Related papers (2024-07-01T00:05:56Z)
- Event Detection in Time Series: Universal Deep Learning Approach [0.0]
Event detection in time series is a challenging task due to the prevalence of imbalanced datasets, rare events, and time interval-defined events.
We propose a novel supervised regression-based deep learning approach that offers several advantages over classification-based methods.
Our approach can effectively handle various types of events within a unified framework, including rare events and imbalanced datasets.
arXiv Detail & Related papers (2023-11-27T09:33:56Z)
- Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment [72.50906475214457]
The goal of sequential event prediction is to estimate the next event based on a sequence of historical events.
In practice, the next-event prediction models are trained with sequential data collected at one time.
We propose a framework with hierarchical branching structures for learning context-specific representations.
arXiv Detail & Related papers (2022-10-24T07:54:13Z)
- Systematic Evaluation of Predictive Fairness [60.0947291284978]
Mitigating bias in training on biased datasets is an important open problem.
We examine the performance of various debiasing methods across multiple tasks.
We find that data conditions have a strong influence on relative model performance.
arXiv Detail & Related papers (2022-10-17T05:40:13Z)
- Predicting Seriousness of Injury in a Traffic Accident: A New Imbalanced Dataset and Benchmark [62.997667081978825]
The paper introduces a new dataset to assess the performance of machine learning algorithms in the prediction of the seriousness of injury in a traffic accident.
The dataset is created by aggregating publicly available datasets from the UK Department for Transport.
arXiv Detail & Related papers (2022-05-20T21:15:26Z)
- Robust Event Classification Using Imperfect Real-world PMU Data [58.26737360525643]
We study robust event classification using imperfect real-world phasor measurement unit (PMU) data.
We develop a novel machine learning framework for training robust event classifiers.
arXiv Detail & Related papers (2021-10-19T17:41:43Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- Unsupervised Event Detection, Clustering, and Use Case Exposition in Micro-PMU Measurements [0.0]
We develop an unsupervised event detection method based on the concept of Generative Adversarial Networks (GAN).
We also propose a two-step unsupervised clustering method, based on a novel linear mixed integer programming formulation.
Results show that they can outperform the prevalent methods in the literature.
arXiv Detail & Related papers (2020-07-30T05:20:29Z)
- Event Prediction in the Big Data Era: A Systematic Survey [7.3810864598379755]
Event prediction is becoming a viable option in the big data era.
This paper aims to provide a systematic and comprehensive survey of the technologies, applications, and evaluations of event prediction.
arXiv Detail & Related papers (2020-07-19T23:24:52Z)
- MAVEN: A Massive General Domain Event Detection Dataset [56.00401399384715]
Event detection (ED) is the first and most fundamental step for extracting event knowledge from plain text.
Existing datasets exhibit issues that limit further development of ED.
We present a MAssive eVENt detection dataset (MAVEN), which contains 4,480 Wikipedia documents, 118,732 event mention instances, and 168 event types.
arXiv Detail & Related papers (2020-04-28T15:25:19Z)
- Multi-label Prediction in Time Series Data using Deep Neural Networks [19.950094635430048]
This paper addresses a multi-label predictive fault classification problem for multidimensional time-series data.
The proposed algorithm is tested on two public benchmark datasets.
arXiv Detail & Related papers (2020-01-27T21:35:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.