A Remark on Concept Drift for Dependent Data
- URL: http://arxiv.org/abs/2312.10212v1
- Date: Fri, 15 Dec 2023 21:11:46 GMT
- Title: A Remark on Concept Drift for Dependent Data
- Authors: Fabian Hinder, Valerie Vaquet, Barbara Hammer
- Abstract summary: We show that the temporal dependencies are strongly influencing the sampling process.
In particular, we show that the notion of stationarity is not suited for this setup and discuss alternatives.
- Score: 7.0072935721154614
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Concept drift, i.e., the change of the data generating distribution, can
render machine learning models inaccurate. Several works address the phenomenon
of concept drift in the streaming context usually assuming that consecutive
data points are independent of each other. To generalize to dependent data,
many authors link the notion of concept drift to time series. In this work, we
show that the temporal dependencies are strongly influencing the sampling
process. Thus, the used definitions need major modifications. In particular, we
show that the notion of stationarity is not suited for this setup and discuss
alternatives. We demonstrate that these alternative formal notions describe the
observable learning behavior in numerical experiments.
Related papers
- Online Drift Detection with Maximum Concept Discrepancy [13.48123472458282]
We propose MCD-DD, a novel concept drift detection method based on maximum concept discrepancy.
Our method can adaptively identify varying forms of concept drift by contrastive learning of concept embeddings.
arXiv Detail & Related papers (2024-07-07T13:57:50Z)
- Concept Drift Visualization of SVM with Shifting Window [0.0]
In machine learning, concept drift is an evolution of information that invalidates the current data model.
We propose a novel visualization model based on parallel coordinates, denoted as parallel histograms through time.
We show how these diagrams can be used to explain the decision made by the machine learning model in choosing the drift point.
arXiv Detail & Related papers (2024-06-19T18:12:02Z)
- Methods for Generating Drift in Text Streams [49.3179290313959]
Concept drift is a frequent phenomenon in real-world datasets and corresponds to changes in data distribution over time.
This paper provides four textual drift generation methods to ease the production of datasets with labeled drifts.
Results show that all methods have their performance degraded right after the drifts, and the incremental SVM is the fastest to run and recover the previous performance levels.
arXiv Detail & Related papers (2024-03-18T23:48:33Z)
- Explaining Drift using Shapley Values [0.0]
Machine learning models often deteriorate in their performance when they are used to predict the outcomes over data on which they were not trained.
There is no framework to identify the drivers behind the drift in model performance.
We propose a novel framework - DBShap that uses principled Shapley values to identify the main contributors of the drift.
arXiv Detail & Related papers (2024-01-18T07:07:42Z)
- ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce a powerful model-class namely "Denoising Diffusion Probabilistic Models" or DDPMs for chirographic data.
Our model named "ChiroDiff", being non-autoregressive, learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rate.
arXiv Detail & Related papers (2023-04-07T15:17:48Z)
- Consistent Diffusion Models: Mitigating Sampling Drift by Learning to be Consistent [97.64313409741614]
We propose to enforce a consistency property which states that predictions of the model on its own generated data are consistent across time.
We show that our novel training objective yields state-of-the-art results for conditional and unconditional generation in CIFAR-10 and baseline improvements in AFHQ and FFHQ.
arXiv Detail & Related papers (2023-02-17T18:45:04Z)
- Change Detection for Local Explainability in Evolving Data Streams [72.4816340552763]
Local feature attribution methods have become a popular technique for post-hoc and model-agnostic explanations.
It is often unclear how local attributions behave in realistic, constantly evolving settings such as streaming and online applications.
We present CDLEEDS, a flexible and model-agnostic framework for detecting local change and concept drift.
arXiv Detail & Related papers (2022-09-06T18:38:34Z)
- Deep learning model solves change point detection for multiple change types [69.77452691994712]
Change point detection aims to catch an abrupt disorder in the data distribution.
We propose an approach that works in the multiple-distributions scenario.
arXiv Detail & Related papers (2022-04-15T09:44:21Z)
- TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
Estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z)
- Learning Parameter Distributions to Detect Concept Drift in Data Streams [13.20231558027132]
We propose a novel framework for the detection of real concept drift, called ERICS.
By treating the parameters of a predictive model as random variables, we show that concept drift corresponds to a change in the distribution of optimal parameters.
ERICS is also capable of detecting concept drift at the input level, which is a significant advantage over existing approaches.
arXiv Detail & Related papers (2020-10-19T11:19:16Z)
- Counterfactual Explanations of Concept Drift [11.53362411363005]
Concept drift refers to the phenomenon that the distribution underlying the observed data changes over time.
We present a novel technology, which characterizes concept drift in terms of the characteristic change of spatial features represented by typical examples.
arXiv Detail & Related papers (2020-06-23T08:27:57Z)