Diagnosing Concept Drift with Visual Analytics
- URL: http://arxiv.org/abs/2007.14372v3
- Date: Tue, 15 Sep 2020 04:12:44 GMT
- Title: Diagnosing Concept Drift with Visual Analytics
- Authors: Weikai Yang, Zhen Li, Mengchen Liu, Yafeng Lu, Kelei Cao, Ross Maciejewski, Shixia Liu
- Abstract summary: Concept drift is a phenomenon in which the distribution of a data stream changes over time in unforeseen ways, causing prediction models to become inaccurate.
We present a visual analytics method, DriftVis, to support model builders and analysts in the identification and correction of concept drift in streaming data.
- Score: 27.836419202828303
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Concept drift is a phenomenon in which the distribution of a data stream
changes over time in unforeseen ways, causing prediction models built on
historical data to become inaccurate. While a variety of automated methods have
been developed to identify when concept drift occurs, there is limited support
for analysts who need to understand and correct their models when drift is
detected. In this paper, we present a visual analytics method, DriftVis, to
support model builders and analysts in the identification and correction of
concept drift in streaming data. DriftVis combines a distribution-based drift
detection method with a streaming scatterplot to support the analysis of drift
caused by the distribution changes of data streams and to explore the impact of
these changes on the model's accuracy. A quantitative experiment and two case
studies on weather prediction and text classification have been conducted to
demonstrate our proposed tool and illustrate how visual analytics can be used
to support the detection, examination, and correction of concept drift.
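As a rough illustration of what a distribution-based drift check can look like, the Python sketch below scores consecutive windows of a one-dimensional stream against a reference window using the energy distance. The choice of energy distance, the window sizes, the synthetic stream, and the helper names (`mean_abs_diff`, `energy_distance`, `drift_scores`) are all assumptions made for this example; the abstract does not say which statistic DriftVis uses, so this should not be read as the authors' method.

```python
import random
from itertools import product


def mean_abs_diff(xs, ys):
    """Average |x - y| over all pairs drawn from xs and ys."""
    return sum(abs(x - y) for x, y in product(xs, ys)) / (len(xs) * len(ys))


def energy_distance(reference, recent):
    """Squared energy distance between two 1-D samples (0 for identical samples)."""
    return (2 * mean_abs_diff(reference, recent)
            - mean_abs_diff(reference, reference)
            - mean_abs_diff(recent, recent))


def drift_scores(stream, ref_size, window):
    """Score each consecutive window against the first ref_size points."""
    reference = stream[:ref_size]
    return [energy_distance(reference, stream[start:start + window])
            for start in range(ref_size, len(stream) - window + 1, window)]


rng = random.Random(0)
# Hypothetical stream: 300 points centred on 0, then 200 points centred on 3.
stream = ([rng.gauss(0, 1) for _ in range(300)]
          + [rng.gauss(3, 1) for _ in range(200)])
scores = drift_scores(stream, ref_size=100, window=100)
```

In this synthetic setup the first two windows come from the same distribution as the reference and should score near zero, while the last two windows, drawn after the shift, should score much higher. A practical detector would also need a threshold or significance test, which is omitted here.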
Related papers
- Physics-guided Active Sample Reweighting for Urban Flow Prediction [75.24539704456791]
Urban flow prediction is a spatio-temporal modeling task that estimates the throughput of transportation services such as buses, taxis, and ride-sharing.
Some recent prediction solutions bring remedies with the notion of physics-guided machine learning (PGML).
We develop a discretized physics-guided network (PN) and propose a data-aware framework, Physics-guided Active Sample Reweighting (P-GASR).
arXiv Detail & Related papers (2024-07-18T15:44:23Z)
- Online Drift Detection with Maximum Concept Discrepancy [13.48123472458282]
We propose MCD-DD, a novel concept drift detection method based on maximum concept discrepancy.
Our method can adaptively identify varying forms of concept drift by contrastive learning of concept embeddings.
arXiv Detail & Related papers (2024-07-07T13:57:50Z)
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
- Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Diffusion models operate over a sequence of timesteps instead of instantaneous input-output relationships in previous contexts.
We present Diffusion-TracIn, which incorporates these temporal dynamics, and observe that samples' loss gradient norms are highly dependent on timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
arXiv Detail & Related papers (2024-01-17T07:58:18Z)
- On the Change of Decision Boundaries and Loss in Learning with Concept Drift [8.686667049158476]
Concept drift refers to the phenomenon that the distribution generating the observed data changes over time.
Many technologies for learning with drift rely on the interleaved test-train error (ITTE) as a quantity which approximates the model generalization error.
arXiv Detail & Related papers (2022-12-02T14:58:13Z)
- Autoregressive based Drift Detection Method [0.0]
We propose a new concept drift detection method based on autoregressive models called ADDM.
Our results show that this new concept drift detection method outperforms the state-of-the-art drift detection methods.
arXiv Detail & Related papers (2022-03-09T14:36:16Z)
- Detecting Concept Drift With Neural Network Model Uncertainty [0.0]
Uncertainty Drift Detection (UDD) is able to detect drifts without access to true labels.
In contrast to input data-based drift detection, our approach considers the effects of the current input data on the properties of the prediction model.
We show that UDD outperforms other state-of-the-art strategies on two synthetic as well as ten real-world data sets for both regression and classification tasks.
arXiv Detail & Related papers (2021-07-05T08:56:36Z)
- Automatic Learning to Detect Concept Drift [40.69280758487987]
We propose Meta-ADD, a novel framework that learns to classify concept drift by tracking the changed pattern of error rates.
Specifically, in the training phase, we extract meta-features based on the error rates of various concept drifts, after which a meta-detector is developed via a prototypical neural network.
In the detection phase, the learned meta-detector is fine-tuned to adapt to the corresponding data stream via stream-based active learning.
arXiv Detail & Related papers (2021-05-04T11:10:39Z)
- Predicting traffic signals on transportation networks using spatio-temporal correlations on graphs [56.48498624951417]
This paper proposes a traffic propagation model that merges multiple heat diffusion kernels into a data-driven prediction model to forecast traffic signals.
We optimize the model parameters using Bayesian inference to minimize the prediction errors and, consequently, determine the mixing ratio of the two approaches.
The proposed model demonstrates prediction accuracy comparable to that of the state-of-the-art deep neural networks with lower computational effort.
arXiv Detail & Related papers (2021-04-27T18:17:42Z)
- A Graph Convolutional Network with Signal Phasing Information for Arterial Traffic Prediction [63.470149585093665]
Arterial traffic prediction plays a crucial role in the development of modern intelligent transportation systems.
Many existing studies on arterial traffic prediction only consider temporal measurements of flow and occupancy from loop sensors and neglect the rich spatial relationships between upstream and downstream detectors.
We fill this gap by enhancing a deep learning approach, Diffusion Convolutional Recurrent Neural Network, with spatial information generated from signal timing plans at targeted intersections.
arXiv Detail & Related papers (2020-12-25T01:40:29Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass.
We scale training in RBF networks with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)