Do Ensembling and Meta-Learning Improve Outlier Detection in Randomized
Controlled Trials?
- URL: http://arxiv.org/abs/2311.05473v1
- Date: Thu, 9 Nov 2023 16:05:38 GMT
- Authors: Walter Nelson, Jonathan Ranisau, Jeremy Petch
- Abstract summary: We evaluate 6 modern machine learning-based outlier detection algorithms on the task of identifying irregular data in 838 datasets from 7 real-world MCRCTs.
Our results reinforce key findings from prior work in the outlier detection literature on data from other domains.
We propose the Meta-learned Probabilistic Ensemble (MePE), a simple algorithm for aggregating the predictions of multiple unsupervised models.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern multi-centre randomized controlled trials (MCRCTs) collect massive
amounts of tabular data, and are monitored intensively for irregularities by
humans. We begin by empirically evaluating 6 modern machine learning-based
outlier detection algorithms on the task of identifying irregular data in 838
datasets from 7 real-world MCRCTs with a total of 77,001 patients from over 44
countries. Our results reinforce key findings from prior work in the outlier
detection literature on data from other domains. Existing algorithms often
succeed at identifying irregularities without any supervision, with at least
one algorithm exhibiting positive performance 70.6% of the time. However,
performance across datasets varies substantially with no single algorithm
performing consistently well, motivating new techniques for unsupervised model
selection or other means of aggregating potentially discordant predictions from
multiple candidate models. We propose the Meta-learned Probabilistic Ensemble
(MePE), a simple algorithm for aggregating the predictions of multiple
unsupervised models, and show that it performs favourably compared to recent
meta-learning approaches for outlier detection model selection. While
meta-learning shows promise, small ensembles outperform all forms of
meta-learning on average, a negative result that may guide the application of
current outlier detection approaches in healthcare and other real-world
domains.
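The paper does not specify MePE's internals here, but the simpler baseline it compares against, a small ensemble that aggregates the scores of several unsupervised detectors, can be sketched. The following is a minimal illustration, not the authors' method: the two toy detectors (per-feature z-score magnitude and k-nearest-neighbour distance) and all parameter choices are assumptions for the example.

```python
import numpy as np

def zscore_detector(X):
    # Outlierness as distance from the mean in per-feature standard deviations.
    z = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    return np.linalg.norm(z, axis=1)

def knn_detector(X, k=5):
    # Outlierness as distance to the k-th nearest neighbour: isolated points score high.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d.sort(axis=1)          # column 0 is the zero self-distance
    return d[:, k]

def ensemble_scores(X, detectors):
    # Z-normalise each detector's scores so their scales are comparable,
    # then average: one way to aggregate potentially discordant predictions.
    scores = []
    for det in detectors:
        s = det(X)
        s = (s - s.mean()) / (s.std() + 1e-12)
        scores.append(s)
    return np.mean(scores, axis=0)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), [[8.0, 8.0]]])  # last row: planted outlier
agg = ensemble_scores(X, [zscore_detector, knn_detector])
print(int(np.argmax(agg)))  # → 100 (index of the planted outlier)
```

Averaging normalized scores is the classic unsupervised ensemble baseline; the paper's finding is that even ensembles this simple outperform meta-learned model selection on average.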
Related papers
- A Dataset for Evaluating Online Anomaly Detection Approaches for Discrete Multivariate Time Series [0.01874930567916036]
Current publicly available datasets are too small, not diverse and feature trivial anomalies.
We propose a solution: a diverse, extensive, and non-trivial dataset generated via state-of-the-art simulation tools.
We make different versions of the dataset available, where training and test subsets are offered in contaminated and clean versions.
As expected, the baseline experimentation shows that the approaches trained on the semi-supervised version of the dataset outperform their unsupervised counterparts.
arXiv Detail & Related papers (2024-11-21T09:03:12Z)
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z)
- Boosting Out-of-Distribution Detection with Multiple Pre-trained Models [41.66566916581451]
Post hoc detection utilizing pre-trained models has shown promising performance and can be scaled to large-scale problems.
We propose a detection enhancement method by ensembling multiple detection decisions derived from a zoo of pre-trained models.
Our method substantially improves the relative performance by 65.40% and 26.96% on the CIFAR10 and ImageNet benchmarks.
arXiv Detail & Related papers (2022-12-24T12:11:38Z)
- Towards Diverse Evaluation of Class Incremental Learning: A Representation Learning Perspective [67.45111837188685]
Class incremental learning (CIL) algorithms aim to continually learn new object classes from incrementally arriving data.
We experimentally analyze neural network models trained by CIL algorithms using various evaluation protocols in representation learning.
arXiv Detail & Related papers (2022-06-16T11:44:11Z)
- Zero-shot meta-learning for small-scale data from human subjects [10.320654885121346]
We develop a framework to rapidly adapt to a new prediction task with limited training data for out-of-sample test data.
Our model learns the latent treatment effects of each intervention and, by design, can naturally handle multi-task predictions.
Our model has implications for improved generalization of small-size human studies to the wider population.
arXiv Detail & Related papers (2022-03-29T17:42:04Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
- RLAD: Time Series Anomaly Detection through Reinforcement Learning and Active Learning [17.089402177923297]
We introduce a new semi-supervised, time series anomaly detection algorithm.
It uses deep reinforcement learning and active learning to efficiently learn and adapt to anomalies in real-world time series data.
It requires no manual tuning of parameters and outperforms all state-of-the-art methods we compare with.
arXiv Detail & Related papers (2021-03-31T15:21:15Z)
- Meta-learning One-class Classifiers with Eigenvalue Solvers for Supervised Anomaly Detection [55.888835686183995]
We propose a neural network-based meta-learning method for supervised anomaly detection.
We experimentally demonstrate that the proposed method achieves better performance than existing anomaly detection and few-shot learning methods.
arXiv Detail & Related papers (2021-03-01T01:43:04Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
- CONSAC: Robust Multi-Model Fitting by Conditional Sample Consensus [62.86856923633923]
We present a robust estimator for fitting multiple parametric models of the same form to noisy measurements.
In contrast to previous works, which resorted to hand-crafted search strategies for multiple model detection, we learn the search strategy from data.
The search strategy is learned in a self-supervised manner; we evaluate the proposed algorithm on multi-homography estimation and demonstrate an accuracy that is superior to state-of-the-art methods.
arXiv Detail & Related papers (2020-01-08T17:37:01Z)