A Synthetic Benchmark to Explore Limitations of Localized Drift Detections
- URL: http://arxiv.org/abs/2408.14687v1
- Date: Mon, 26 Aug 2024 23:24:31 GMT
- Title: A Synthetic Benchmark to Explore Limitations of Localized Drift Detections
- Authors: Flavio Giobergia, Eliana Pastor, Luca de Alfaro, Elena Baralis
- Abstract summary: Concept drift is a common phenomenon in data streams where the statistical properties of the target variable change over time.
This paper explores the concept of localized drift and evaluates the performance of several drift detection techniques in identifying such localized changes.
- Score: 13.916984628784766
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Concept drift is a common phenomenon in data streams where the statistical properties of the target variable change over time. Traditionally, drift is assumed to occur globally, affecting the entire dataset uniformly. However, this assumption does not always hold true in real-world scenarios where only specific subpopulations within the data may experience drift. This paper explores the concept of localized drift and evaluates the performance of several drift detection techniques in identifying such localized changes. We introduce a synthetic dataset based on the Agrawal generator, where drift is induced in a randomly chosen subgroup. Our experiments demonstrate that commonly adopted drift detection methods may fail to detect drift when it is confined to a small subpopulation. We propose and test various drift detection approaches to quantify their effectiveness in this localized drift scenario. We make the source code for the generation of the synthetic benchmark available at https://github.com/fgiobergia/subgroup-agrawal-drift.
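The localized-drift setting described in the abstract can be illustrated with a minimal sketch. This is not the authors' benchmark and does not use the Agrawal generator: the two features, the salary threshold used as the labeling rule, and the `age < 25` subgroup are all hypothetical choices made for illustration. The sketch flips labels only inside the subgroup after a drift point and compares the error of a model frozen on the pre-drift concept, globally and within the subgroup.

```python
import random


def make_stream(n, drift_at, seed=0):
    """Yield (t, age, salary, label); labels flip in one subgroup after drift_at."""
    rng = random.Random(seed)
    for t in range(n):
        age = rng.randint(20, 80)
        salary = rng.uniform(20_000, 150_000)
        label = int(salary > 75_000)      # toy concept, not an Agrawal function
        if t >= drift_at and age < 25:    # hypothetical drifting subgroup
            label = 1 - label
        yield t, age, salary, label


def error_rates(n=20_000, drift_at=10_000, seed=0):
    """Error of a model frozen on the pre-drift concept, before and after drift."""
    # Each entry holds [errors, samples].
    counts = {"pre_global": [0, 0], "post_global": [0, 0], "post_subgroup": [0, 0]}
    for t, age, salary, label in make_stream(n, drift_at, seed):
        # The frozen model always predicts with the pre-drift rule.
        wrong = int(int(salary > 75_000) != label)
        keys = ["pre_global"] if t < drift_at else ["post_global"]
        if t >= drift_at and age < 25:
            keys.append("post_subgroup")
        for key in keys:
            counts[key][0] += wrong
            counts[key][1] += 1
    return {key: errs / total for key, (errs, total) in counts.items()}


if __name__ == "__main__":
    for name, rate in error_rates().items():
        print(f"{name}: {rate:.3f}")
```

By construction, the frozen model is wrong on every post-drift record in the subgroup, while the global post-drift error is only as large as the subgroup's share of the stream. A detector that monitors the global error rate alone therefore sees a much weaker signal than one that monitors the subgroup.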
Related papers
- Unsupervised Concept Drift Detection from Deep Learning Representations in Real-time [5.999777817331315]
Concept Drift is a phenomenon in which the underlying data distribution and statistical properties of a target domain change over time.
We propose DriftLens, an unsupervised real-time concept drift detection framework.
It works on unstructured data by exploiting the distribution distances of deep learning representations.
arXiv Detail & Related papers (2024-06-24T23:41:46Z)
- Drift Detection: Introducing Gaussian Split Detector [1.9430846345184412]
We introduce Gaussian Split Detector (GSD), a novel drift detector that works in batch mode.
GSD is designed to work when the data follow a normal distribution and makes use of Gaussian mixture models to monitor changes in the decision boundary.
We show that our detector outperforms the state of the art in detecting real drift and in ignoring virtual drift, which is key to avoiding false alarms.
arXiv Detail & Related papers (2024-05-14T14:15:31Z)
- Methods for Generating Drift in Text Streams [49.3179290313959]
Concept drift is a frequent phenomenon in real-world datasets and corresponds to changes in data distribution over time.
This paper provides four textual drift generation methods to ease the production of datasets with labeled drifts.
Results show that all methods have their performance degraded right after the drifts, and the incremental SVM is the fastest to run and recover the previous performance levels.
arXiv Detail & Related papers (2024-03-18T23:48:33Z)
- A comprehensive analysis of concept drift locality in data streams [3.5897534810405403]
Concept drift must be detected for effective model adaptation to evolving data properties.
We present a novel categorization of concept drift based on its locality and scale.
We conduct a comparative assessment of 9 state-of-the-art drift detectors across diverse difficulties.
arXiv Detail & Related papers (2023-11-10T20:57:43Z)
- MomentDiff: Generative Video Moment Retrieval from Random to Real [71.40038773943638]
We provide a generative diffusion-based framework called MomentDiff.
MomentDiff simulates a typical human retrieval process from random browsing to gradual localization.
We show that MomentDiff consistently outperforms state-of-the-art methods on three public benchmarks.
arXiv Detail & Related papers (2023-07-06T09:12:13Z)
- CADM: Confusion Model-based Detection Method for Real-drift in Chunk Data Stream [3.0885191226198785]
Concept drift detection has attracted considerable attention due to its importance in many real-world applications such as health monitoring and fault diagnosis.
We propose a new approach to detect real-drift in the chunk data stream with limited annotations based on concept confusion.
arXiv Detail & Related papers (2023-03-25T08:59:27Z)
- Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch.
arXiv Detail & Related papers (2022-11-12T09:21:49Z)
- Change Detection for Local Explainability in Evolving Data Streams [72.4816340552763]
Local feature attribution methods have become a popular technique for post-hoc and model-agnostic explanations.
It is often unclear how local attributions behave in realistic, constantly evolving settings such as streaming and online applications.
We present CDLEEDS, a flexible and model-agnostic framework for detecting local change and concept drift.
arXiv Detail & Related papers (2022-09-06T18:38:34Z)
- Delving into Sequential Patches for Deepfake Detection [64.19468088546743]
Recent advances in face forgery techniques produce nearly untraceable deepfake videos, which could be leveraged with malicious intentions.
Previous studies have identified the importance of local low-level cues and temporal information for generalizing well across deepfake methods.
We propose the Local- & Temporal-aware Transformer-based Deepfake Detection framework, which adopts a local-to-global learning protocol.
arXiv Detail & Related papers (2022-07-06T16:46:30Z)
- Task-Sensitive Concept Drift Detector with Metric Learning [7.706795195017394]
We propose a novel task-sensitive drift detection framework, which is able to detect drifts without access to true labels during inference.
It is able to detect real drift, where the drift affects the classification performance, while it properly ignores virtual drift.
We evaluate the performance of the proposed framework with a novel metric, which accumulates the standard metrics of detection accuracy, false positive rate and detection delay into one value.
arXiv Detail & Related papers (2021-08-16T09:10:52Z)
- Bayesian Autoencoders for Drift Detection in Industrial Environments [69.93875748095574]
Autoencoders are unsupervised models which have been used for detecting anomalies in multi-sensor environments.
Anomalies can come either from real changes in the environment (real drift) or from faulty sensory devices (virtual drift).
arXiv Detail & Related papers (2021-07-28T10:19:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.