Open-Source Drift Detection Tools in Action: Insights from Two Use Cases
- URL: http://arxiv.org/abs/2404.18673v2
- Date: Fri, 10 May 2024 11:20:47 GMT
- Title: Open-Source Drift Detection Tools in Action: Insights from Two Use Cases
- Authors: Rieke Müller, Mohamed Abdelaal, Davor Stjelja
- Abstract summary: D3Bench examines the capabilities of Evidently AI, NannyML, and Alibi-Detect, leveraging real-world data from two smart building use cases.
We consider a comprehensive set of non-functional criteria, such as the integrability with ML pipelines, the adaptability to diverse data types, user-friendliness, computational efficiency, and resource demands.
Our findings reveal that Evidently AI stands out for its general data drift detection, whereas NannyML excels at pinpointing the precise timing of shifts and evaluating their consequent effects on predictive accuracy.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data drifts pose a critical challenge in the lifecycle of machine learning (ML) models, affecting their performance and reliability. In response to this challenge, we present a microbenchmark study, called D3Bench, which evaluates the efficacy of open-source drift detection tools. D3Bench examines the capabilities of Evidently AI, NannyML, and Alibi-Detect, leveraging real-world data from two smart building use cases. We prioritize assessing the functional suitability of these tools to identify and analyze data drifts. Furthermore, we consider a comprehensive set of non-functional criteria, such as the integrability with ML pipelines, the adaptability to diverse data types, user-friendliness, computational efficiency, and resource demands. Our findings reveal that Evidently AI stands out for its general data drift detection, whereas NannyML excels at pinpointing the precise timing of shifts and evaluating their consequent effects on predictive accuracy.
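To make concrete what the benchmarked tools automate, the sketch below runs a per-feature two-sample Kolmogorov-Smirnov test between a reference window and a current window, which is the kind of univariate drift check that Evidently AI, NannyML, and Alibi-Detect wrap behind their APIs. It is a minimal illustration only: the SciPy-based helper, the smart-building feature names, and the 0.05 significance threshold are assumptions for this example, not settings or interfaces taken from the paper or the tools themselves.

```python
# Minimal sketch of a univariate data drift check (illustrative only).
# Column names, window sizes, and the alpha threshold are assumptions,
# not configurations from D3Bench or the benchmarked tools.
import numpy as np
import pandas as pd
from scipy.stats import ks_2samp


def detect_drift(reference: pd.DataFrame, current: pd.DataFrame, alpha: float = 0.05) -> pd.DataFrame:
    """Run a two-sample Kolmogorov-Smirnov test per shared column."""
    rows = []
    for col in reference.columns.intersection(current.columns):
        stat, p_value = ks_2samp(reference[col], current[col])
        rows.append({"feature": col, "ks_stat": stat, "p_value": p_value, "drift": p_value < alpha})
    return pd.DataFrame(rows)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Hypothetical smart-building sensor features: one stable, one shifted.
    reference = pd.DataFrame({
        "room_temperature": rng.normal(21.0, 1.0, 2000),
        "co2_ppm": rng.normal(600.0, 50.0, 2000),
    })
    current = pd.DataFrame({
        "room_temperature": rng.normal(21.1, 1.0, 2000),  # roughly unchanged
        "co2_ppm": rng.normal(750.0, 60.0, 2000),         # drifted upwards
    })
    print(detect_drift(reference, current))
```

A plain statistical test like this only flags that a feature's distribution has moved; the paper's comparison concerns what the tools add on top, such as NannyML's ability to pinpoint when a shift occurred and estimate its effect on predictive accuracy.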
Related papers
- Flow Exporter Impact on Intelligent Intrusion Detection Systems [0.0]
High-quality datasets are critical for training machine learning models.
Inconsistencies in feature generation can hinder the accuracy and reliability of threat detection.
This paper investigates the impact of flow exporters on the performance and reliability of machine learning models for intrusion detection.
arXiv Detail & Related papers (2024-12-18T16:38:20Z)
- Distribution Discrepancy and Feature Heterogeneity for Active 3D Object Detection [18.285299184361598]
LiDAR-based 3D object detection is a critical technology for the development of autonomous driving and robotics.
We propose a novel and effective active learning (AL) method called Distribution Discrepancy and Feature Heterogeneity (DDFH).
It simultaneously considers geometric features and model embeddings, assessing information from both the instance-level and frame-level perspectives.
arXiv Detail & Related papers (2024-09-09T08:26:11Z)
- Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets.
We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
- Clairvoyance: A Pipeline Toolkit for Medical Time Series [95.22483029602921]
Time-series learning is the bread and butter of data-driven clinical decision support.
Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a software toolkit.
Clairvoyance is the first to demonstrate viability of a comprehensive and automatable pipeline for clinical time-series ML.
arXiv Detail & Related papers (2023-10-28T12:08:03Z)
- Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatially quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
- DuEqNet: Dual-Equivariance Network in Outdoor 3D Object Detection for Autonomous Driving [4.489333751818157]
We propose DuEqNet, which first introduces the concept of equivariance into 3D object detection networks.
The dual equivariance of our model extracts equivariant features at both local and global levels.
Our model achieves higher accuracy on orientation and better prediction efficiency.
arXiv Detail & Related papers (2023-02-27T08:30:02Z)
- Resolving Class Imbalance for LiDAR-based Object Detector by Dynamic Weight Average and Contextual Ground Truth Sampling [7.096611243139798]
Real-world driving datasets often suffer from the problem of data imbalance.
We propose a method to address this data imbalance problem.
Our experiment with KITTI and nuScenes datasets confirms our proposed method's effectiveness.
arXiv Detail & Related papers (2022-10-07T05:23:25Z)
- Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection [58.81316192862618]
Two critical sensors for 3D perception in autonomous driving are the camera and the LiDAR.
Fusing these two modalities can significantly boost the performance of 3D perception models.
We benchmark the state-of-the-art fusion methods for the first time.
arXiv Detail & Related papers (2022-05-30T09:35:37Z)
- Understanding Programmatic Weak Supervision via Source-aware Influence Function [76.74549130841383]
Programmatic Weak Supervision (PWS) aggregates the source votes of multiple weak supervision sources into probabilistic training labels.
We build on Influence Function (IF) to decompose the end model's training objective and then calculate the influence associated with each (data, source, class) tuple.
These primitive influence scores can then be used to estimate the influence of individual components of PWS, such as source votes, supervision sources, and training data.
arXiv Detail & Related papers (2022-05-25T15:57:24Z)
- Feature Extraction for Machine Learning-based Intrusion Detection in IoT Networks [6.6147550436077776]
This paper aims to discover whether Feature Reduction (FR) and Machine Learning (ML) techniques can be generalised across various datasets.
The detection accuracy of three Feature Extraction (FE) algorithms is evaluated: Principal Component Analysis (PCA), Auto-encoder (AE), and Linear Discriminant Analysis (LDA).
arXiv Detail & Related papers (2021-08-28T23:52:18Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)