Standardized Evaluation of Machine Learning Methods for Evolving Data Streams
- URL: http://arxiv.org/abs/2204.13625v1
- Date: Thu, 28 Apr 2022 16:40:33 GMT
- Title: Standardized Evaluation of Machine Learning Methods for Evolving Data Streams
- Authors: Johannes Haug, Effi Tramountani, Gjergji Kasneci
- Abstract summary: We propose a comprehensive set of properties for high-quality machine learning in evolving data streams.
We discuss sensible performance measures and evaluation strategies for online predictive modelling, online feature selection and concept drift detection.
The proposed evaluation standards are provided in a new Python framework called float.
- Score: 11.17545155325116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the unspecified and dynamic nature of data streams, online machine
learning requires powerful and flexible solutions. However, evaluating online
machine learning methods under realistic conditions is difficult. Existing work
therefore often draws on different heuristics and simulations that do not
necessarily produce meaningful and reliable results. Indeed, in the absence of
common evaluation standards, it often remains unclear how online learning
methods will perform in practice or in comparison to similar work. In this
paper, we propose a comprehensive set of properties for high-quality machine
learning in evolving data streams. In particular, we discuss sensible
performance measures and evaluation strategies for online predictive modelling,
online feature selection and concept drift detection. As one of the first
works, we also look at the interpretability of online learning methods. The
proposed evaluation standards are provided in a new Python framework called
float. Float is completely modular and allows the simultaneous integration of
common libraries, such as scikit-multiflow or river, with custom code. Float is
open-sourced and can be accessed at https://github.com/haugjo/float. In this
sense, we hope that our work will contribute to more standardized, reliable and
realistic testing and comparison of online machine learning methods.
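As an illustration of what an evaluation strategy for online predictive modelling can look like, the sketch below implements test-then-train (prequential) evaluation, a strategy widely used in the data-stream literature. It is a minimal sketch in plain Python and does not use or reproduce the float API. The abstract does not state which strategies the paper adopts, and every name here (`MajorityClassifier`, `prequential_accuracy`, `drifting_stream`) is hypothetical and chosen for illustration only.

```python
import random


class MajorityClassifier:
    """Toy online learner (hypothetical): predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = {}

    def predict(self, x):
        # Before any label has been observed there is nothing to predict.
        if not self.counts:
            return None
        return max(self.counts, key=self.counts.get)

    def learn(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1


def prequential_accuracy(model, stream):
    """Test-then-train: each observation is used for testing first, then for training.

    Returns the running accuracy after every observation.
    """
    correct = 0
    total = 0
    history = []
    for x, y in stream:
        y_pred = model.predict(x)  # 1) test on the unseen observation
        correct += int(y_pred == y)
        total += 1
        history.append(correct / total)
        model.learn(x, y)  # 2) only then update the model
    return history


def drifting_stream(n=200, drift_at=100, seed=0):
    """Synthetic stream (assumption): the majority label flips at `drift_at`."""
    rng = random.Random(seed)
    for t in range(n):
        x = rng.random()
        majority = 0 if t < drift_at else 1
        y = majority if rng.random() < 0.9 else 1 - majority
        yield x, y


history = prequential_accuracy(MajorityClassifier(), drifting_stream())
print(f"final prequential accuracy: {history[-1]:.3f}")
```

Because every observation is scored before the model sees its label, the running accuracy reflects performance on unseen data only. The label flip in the synthetic stream shows how concept drift can degrade a model that adapts slowly.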
Related papers
- A Comprehensive Empirical Evaluation on Online Continual Learning [20.39495058720296]
We evaluate methods from the literature that tackle online continual learning.
We focus on the class-incremental setting in the context of image classification.
We compare these methods on the Split-CIFAR100 and Split-TinyImagenet benchmarks.
arXiv Detail & Related papers (2023-08-20T17:52:02Z)
- Real-Time Evaluation in Online Continual Learning: A New Hope [104.53052316526546]
We evaluate current Continual Learning (CL) methods with respect to their computational costs.
A simple baseline outperforms state-of-the-art CL methods under this evaluation.
This surprisingly suggests that the majority of existing CL literature is tailored to a specific class of streams that is not practical.
arXiv Detail & Related papers (2023-02-02T12:21:10Z)
- Improving Behavioural Cloning with Positive Unlabeled Learning [15.484227081812852]
We propose a novel iterative learning algorithm for identifying expert trajectories in mixed-quality robotics datasets.
Applying behavioral cloning to the resulting filtered dataset outperforms several competitive offline reinforcement learning and imitation learning baselines.
arXiv Detail & Related papers (2023-01-27T14:17:45Z)
- Towards Data-Driven Offline Simulations for Online Reinforcement Learning [30.654163861164864]
We formalize offline learner simulation (OLS) for reinforcement learning (RL).
We propose a novel evaluation protocol that measures both fidelity and efficiency of the simulation.
arXiv Detail & Related papers (2022-11-14T18:36:13Z)
- Online vs. Offline Adaptive Domain Randomization Benchmark [20.69035879843824]
We present an open benchmark for both offline and online methods (SimOpt, BayRn, DROID, DROPO) to shed light on which are most suitable for each setting and task at hand.
We found that online methods are limited by the quality of the currently learned policy for the next iteration, while offline methods may sometimes fail when replaying trajectories in simulation with open-loop commands.
arXiv Detail & Related papers (2022-06-29T14:03:53Z)
- Passive learning to address nonstationarity in virtual flow metering applications [0.0]
This paper explores how learning methods can be applied to sustain the prediction accuracy of steady-state virtual flow meters.
Two passive learning methods, periodic batch learning and online learning, are applied with varying calibration frequency to train virtual flow meters.
The results are two-fold: first, in the presence of frequently arriving measurements, frequent model updating sustains an excellent prediction performance over time; second, in the presence of intermittent and infrequently arriving measurements, frequent updating is essential to increase the performance accuracy.
arXiv Detail & Related papers (2022-02-07T14:42:00Z)
- RvS: What is Essential for Offline RL via Supervised Learning? [77.91045677562802]
Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL.
In every environment suite we consider, simply maximizing likelihood with a two-layer feedforward network is competitive.
The results also probe the limits of existing RvS methods, which are comparatively weak on random data.
arXiv Detail & Related papers (2021-12-20T18:55:16Z)
- A Workflow for Offline Model-Free Robotic Reinforcement Learning [117.07743713715291]
Offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction.
We develop a practical workflow for using offline RL, analogous to the relatively well-understood workflows for supervised learning problems.
We demonstrate the efficacy of this workflow in producing effective policies without any online tuning.
arXiv Detail & Related papers (2021-09-22T16:03:29Z)
- Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data [101.6195176510611]
"Online" continual learning enables evaluating both information retention and online learning efficacy.
In online continual learning, each incoming small batch of data is first used for testing and then added to the training set, making the problem truly online.
We introduce a new benchmark for online continual visual learning that exhibits large scale and natural distribution shifts.
arXiv Detail & Related papers (2021-08-20T06:17:20Z)
- Task-agnostic Continual Learning with Hybrid Probabilistic Models [75.01205414507243]
We propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification.
The flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting.
We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST.
arXiv Detail & Related papers (2021-06-24T05:19:26Z)
- Low-Regret Active learning [64.36270166907788]
We develop an online learning algorithm for identifying unlabeled data points that are most informative for training.
At the core of our work is an efficient algorithm for sleeping experts that is tailored to achieve low regret on predictable (easy) instances.
arXiv Detail & Related papers (2021-04-06T22:53:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.