A critical look at the current train/test split in machine learning
- URL: http://arxiv.org/abs/2106.04525v1
- Date: Tue, 8 Jun 2021 17:07:20 GMT
- Title: A critical look at the current train/test split in machine learning
- Authors: Jimin Tan, Jianan Yang, Sai Wu, Gang Chen, Jake Zhao (Junbo)
- Abstract summary: We take a closer look at the split protocol itself and point out its weaknesses and limitations.
In many real-world problems, we must acknowledge that there are numerous situations where assumption (ii) does not hold.
We propose a new adaptive active learning (AAL) architecture that involves an adaptation policy.
- Score: 6.475859946760842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The randomized or cross-validated split of training and testing
sets has been adopted as the gold standard of machine learning for decades.
These split protocols rest on two assumptions: (i) the dataset is fixed and
eternally static, so that different machine learning algorithms or models can
be evaluated against it; (ii) a complete set of annotated data is available to
researchers or industrial practitioners. In this article, however, we take a
closer, critical look at the split protocol itself and point out its
weaknesses and limitations, especially for industrial applications. In many
real-world problems, there are numerous situations where assumption (ii) does
not hold. For instance, interdisciplinary applications like drug discovery
often require real lab experiments to annotate data, which incurs huge costs
in both time and money. In other words, it can be very difficult or even
impossible to satisfy assumption (ii). In this article, we address this
problem by revisiting the paradigm of active learning and investigating its
potential for solving problems under unconventional train/test split
protocols. We further propose a new adaptive active learning (AAL)
architecture that involves an adaptation policy, in contrast to traditional
active learning, which only unidirectionally adds data points to the training
pool. We primarily justify our points by extensively investigating an
interdisciplinary drug-protein binding problem. We additionally evaluate AAL
on more conventional machine learning benchmark datasets like CIFAR-10 to
demonstrate the generalizability and efficacy of the new framework.
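The abstract contrasts AAL's bidirectional pool updates with traditional active learning's add-only behavior. Below is a minimal sketch of such a loop, assuming a scikit-learn classifier and an uncertainty-based acquisition rule; the concrete adaptation policy here (returning the worst-fit labeled points to the unlabeled pool) is an illustrative assumption, since the abstract does not specify AAL's actual policy.

```python
# Hypothetical sketch of an adaptive active-learning loop. Unlike classic
# active learning, which only adds labeled points to the training pool, an
# adaptation policy may also remove points from it. The drop rule below is
# an assumed stand-in for AAL's adaptation policy, not the paper's method.
import numpy as np
from sklearn.linear_model import LogisticRegression

def adaptive_active_learning(X_pool, y_oracle, n_rounds=10, seed_k=20,
                             add_k=20, drop_k=5):
    rng = np.random.default_rng(0)
    labeled = list(rng.choice(len(X_pool), size=seed_k, replace=False))
    unlabeled = [i for i in range(len(X_pool)) if i not in labeled]
    model = LogisticRegression(max_iter=1000)
    for _ in range(n_rounds):
        # y_oracle is only ever read at labeled indices, mimicking an
        # annotation oracle (e.g., a costly lab experiment).
        model.fit(X_pool[labeled], y_oracle[labeled])
        # Adaptation policy (assumed): return the drop_k labeled points the
        # current model fits worst to the unlabeled pool, so the training
        # set does not grow monotonically.
        proba_l = model.predict_proba(X_pool[labeled])
        cols = np.searchsorted(model.classes_, y_oracle[labeled])
        fit = proba_l[np.arange(len(labeled)), cols]
        dropped = [labeled[i] for i in np.argsort(fit)[:drop_k]]
        labeled = [i for i in labeled if i not in dropped]
        unlabeled += dropped
        # Acquisition: query labels for the most uncertain unlabeled points.
        proba_u = model.predict_proba(X_pool[unlabeled])
        uncertainty = 1.0 - proba_u.max(axis=1)
        queried = [unlabeled[i] for i in np.argsort(-uncertainty)[:add_k]]
        labeled += queried
        unlabeled = [i for i in unlabeled if i not in queried]
    return model, labeled
```

The key departure from a standard active-learning loop is the drop step: the training pool is revised in both directions each round, which is the behavior the abstract attributes to the adaptation policy.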
Related papers
- Probably Approximately Precision and Recall Learning [62.912015491907994]
Precision and Recall are foundational metrics in machine learning.
One-sided feedback, where only positive examples are observed during training, is inherent in many practical problems.
We introduce a PAC learning framework where each hypothesis is represented by a graph, with edges indicating positive interactions.
arXiv Detail & Related papers (2024-11-20T04:21:07Z)
- Towards Understanding the Feasibility of Machine Unlearning [14.177012256360635]
We present a set of novel metrics for quantifying the difficulty of unlearning.
Specifically, we propose several metrics to assess the conditions necessary for a successful unlearning operation.
We also present a ranking mechanism to identify the most challenging samples to unlearn.
arXiv Detail & Related papers (2024-10-03T23:41:42Z)
- Combating Missing Modalities in Egocentric Videos at Test Time [92.38662956154256]
Real-world applications often face challenges with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues.
We propose a novel approach to address this issue at test time without requiring retraining.
MiDl represents the first self-supervised, online solution for handling missing modalities exclusively at test time.
arXiv Detail & Related papers (2024-04-23T16:01:33Z)
- TESSERACT: Eliminating Experimental Bias in Malware Classification across Space and Time (Extended Version) [18.146377453918724]
Malware detectors often experience performance decay due to constantly evolving operating systems and attack methods.
This paper argues that commonly reported results are inflated due to two pervasive sources of experimental bias in the detection task (see the split sketch below).
arXiv Detail & Related papers (2024-02-02T12:27:32Z)
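The split bias TESSERACT targets connects directly to this paper's critique of the split protocol itself. As a minimal illustration (an assumed, generic remedy rather than TESSERACT's full methodology), a temporally consistent split trains only on samples observed before a cutoff:

```python
# Minimal sketch of a temporally consistent split: train on samples observed
# before a cutoff and test on those after, rather than letting a random
# shuffle mix future samples into the training set. Inputs are hypothetical.
import numpy as np

def temporal_split(timestamps, cutoff):
    t = np.asarray(timestamps)
    return np.where(t < cutoff)[0], np.where(t >= cutoff)[0]  # train, test
```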
- Continual Learning with Pre-Trained Models: A Survey [61.97613090666247]
Continual Learning aims to overcome the catastrophic forgetting of former knowledge when learning new tasks.
This paper presents a comprehensive survey of the latest advancements in pre-trained model (PTM)-based Continual Learning (CL).
arXiv Detail & Related papers (2024-01-29T18:27:52Z)
- Multivariate Time Series Anomaly Detection: Fancy Algorithms and Flawed Evaluation Methodology [2.043517674271996]
We discuss how an otherwise sound evaluation protocol may have weaknesses in the context of multivariate time series (MVTS) anomaly detection.
We propose a simple, yet challenging, baseline based on Principal Components Analysis (PCA) that surprisingly outperforms many recent Deep Learning (DL) based approaches on popular benchmark datasets (a generic sketch of the idea follows below).
arXiv Detail & Related papers (2023-08-24T20:24:12Z)
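The summary does not spell out the PCA baseline; one generic reading, sketched below under assumptions not taken from the paper (reconstruction-error scoring, an illustrative component count and threshold), is:

```python
# Generic PCA reconstruction-error baseline for multivariate time series
# anomaly detection: fit PCA on (assumed) mostly-normal training windows
# and flag test points whose reconstruction error is unusually large.
import numpy as np
from sklearn.decomposition import PCA

def pca_anomaly_scores(X_train, X_test, n_components=5):
    pca = PCA(n_components=n_components).fit(X_train)
    recon = pca.inverse_transform(pca.transform(X_test))
    return np.linalg.norm(X_test - recon, axis=1)  # per-timestep error

# Usage: flag test points whose error exceeds a high training quantile.
# train_err = pca_anomaly_scores(X_train, X_train)
# flags = pca_anomaly_scores(X_train, X_test) > np.quantile(train_err, 0.99)
```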
- Federated Unlearning via Active Forgetting [24.060724751342047]
We propose a novel federated unlearning framework based on incremental learning.
Our framework differs from existing federated unlearning methods that rely on approximate retraining or data influence estimation.
arXiv Detail & Related papers (2023-07-07T03:07:26Z)
- NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research [96.53307645791179]
We introduce the Never-Ending VIsual-classification Stream (NEVIS'22), a benchmark consisting of a stream of over 100 visual classification tasks.
Despite being limited to classification, the resulting stream has a rich diversity of tasks, from OCR to texture analysis, scene recognition, and so forth.
Overall, NEVIS'22 poses an unprecedented challenge for current sequential learning approaches due to the scale and diversity of tasks.
arXiv Detail & Related papers (2022-11-15T18:57:46Z)
- Evaluating Machine Unlearning via Epistemic Uncertainty [78.27542864367821]
This work presents an evaluation of Machine Unlearning algorithms based on uncertainty.
To the best of our knowledge, this is the first definition of a general evaluation of machine unlearning.
arXiv Detail & Related papers (2022-08-23T09:37:31Z)
- Just Label What You Need: Fine-Grained Active Selection for Perception and Prediction through Partially Labeled Scenes [78.23907801786827]
We introduce generalizations that make our approach both cost-aware and capable of fine-grained selection of examples through partially labeled scenes.
Our experiments on a real-world, large-scale self-driving dataset suggest that fine-grained selection can improve performance across perception, prediction, and downstream planning tasks (see the selection sketch below).
arXiv Detail & Related papers (2021-04-08T17:57:41Z)
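One plausible reading of "cost-aware, fine-grained selection" is a budgeted acquisition rule that trades informativeness against per-request labeling cost; the greedy rule below is an illustrative assumption, not the paper's method:

```python
# Hypothetical cost-aware acquisition: rank candidate label requests by
# model uncertainty per unit annotation cost and greedily fill a budget,
# so cheap-but-informative requests (e.g., one object in a partially
# labeled scene) can beat expensive full-scene annotations.
import numpy as np

def cost_aware_selection(uncertainty, cost, budget):
    uncertainty, cost = np.asarray(uncertainty), np.asarray(cost)
    order = np.argsort(-(uncertainty / cost))  # best value-per-cost first
    chosen, spent = [], 0.0
    for i in order:
        if spent + cost[i] <= budget:
            chosen.append(i)
            spent += cost[i]
    return chosen  # indices of selected label requests
```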
- A survey on domain adaptation theory: learning bounds and theoretical guarantees [17.71634393160982]
The main objective of this survey is to provide an overview of the state-of-the-art theoretical results in a specific, and arguably the most popular, sub-field of transfer learning.
In this sub-field, the data distribution is assumed to change across the training and the test data, while the learning task remains the same.
We provide a first up-to-date description of existing results related to the domain adaptation problem.
arXiv Detail & Related papers (2020-04-24T16:11:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.