A critical look at the current train/test split in machine learning
- URL: http://arxiv.org/abs/2106.04525v1
- Date: Tue, 8 Jun 2021 17:07:20 GMT
- Title: A critical look at the current train/test split in machine learning
- Authors: Jimin Tan, Jianan Yang, Sai Wu, Gang Chen, Jake Zhao (Junbo)
- Abstract summary: We take a closer look at the split protocol itself and point out its weaknesses and limitations.
In many real-world problems, assumption (ii) does not hold.
We propose a new adaptive active learning (AAL) architecture which involves an adaptation policy.
- Score: 6.475859946760842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The randomized or cross-validated split of training and testing sets has been
adopted as the gold standard of machine learning for decades. These split
protocols rest on two assumptions: (i) the dataset is fixed and eternally
static, so that different machine learning algorithms or models can be
evaluated against it; (ii) a complete set of annotated data is available to
researchers or industrial practitioners. In this article, however, we take a
closer and more critical look at the split protocol itself and point out its
weaknesses and limitations, especially for industrial applications. In many
real-world problems, assumption (ii) does not hold. For instance, in
interdisciplinary applications like drug discovery, annotating data often
requires real lab experiments, which incurs substantial costs in both time and
money. In other words, it can be very difficult or even impossible to satisfy
assumption (ii). In this article, we address this problem by revisiting the
paradigm of active learning and investigating its potential for solving
problems under unconventional train/test split protocols. We further propose a
new adaptive active learning (AAL) architecture that involves an adaptation
policy, in contrast to traditional active learning, which only unidirectionally
adds data points to the training pool. We primarily justify our points by
extensively investigating an interdisciplinary drug-protein binding problem. We
additionally evaluate AAL on more conventional machine learning benchmark
datasets such as CIFAR-10 to demonstrate the generalizability and efficacy of
the new framework.
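The abstract's key contrast is a unidirectional training pool versus AAL's adaptable one. As a rough Python sketch of that contrast only, not of the authors' architecture, the loop below runs classical uncertainty-based active learning and then applies a hypothetical adaptation policy (the eviction heuristic and all names are assumptions) that can also remove points from the pool:

```python
# Minimal sketch, NOT the paper's implementation: a classical active
# learning loop extended with a hypothetical AAL-style adaptation
# policy that can also evict points from the training pool.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

labeled = list(rng.choice(len(X), size=50, replace=False))  # seed pool
unlabeled = [i for i in range(len(X)) if i not in set(labeled)]

def margin(model, idx):
    """Margin between the top-2 class probabilities: small = uncertain."""
    proba = model.predict_proba(X[idx])
    top2 = np.sort(proba, axis=1)[:, -2:]
    return top2[:, 1] - top2[:, 0]

for step in range(10):
    model = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])

    # Traditional active learning: unidirectionally ADD the most
    # uncertain unlabeled points (here we "query" their known labels).
    scores = margin(model, unlabeled)
    picked = [unlabeled[i] for i in np.argsort(scores)[:20]]
    labeled += picked
    unlabeled = [i for i in unlabeled if i not in set(picked)]

    # Hypothetical AAL-style adaptation policy: additionally REMOVE the
    # points the model is now most confident about, so the training
    # pool adapts in both directions instead of growing monotonically.
    pool_scores = margin(model, labeled)
    evicted = {labeled[i] for i in np.argsort(pool_scores)[-10:]}
    labeled = [i for i in labeled if i not in evicted]
    unlabeled += sorted(evicted)
```

The structural difference is the last step: traditional active learning only ever grows the labeled pool, whereas an adaptation policy lets the pool shrink or reshape as the model evolves.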
Related papers
- Towards Understanding the Feasibility of Machine Unlearning [14.177012256360635]
We present a set of novel metrics for quantifying the difficulty of unlearning.
Specifically, we propose several metrics to assess the conditions necessary for a successful unlearning operation.
We also present a ranking mechanism to identify the most challenging samples to unlearn.
arXiv Detail & Related papers (2024-10-03T23:41:42Z)
- Combating Missing Modalities in Egocentric Videos at Test Time [92.38662956154256]
Real-world applications often face challenges with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues.
We propose a novel approach to address this issue at test time without requiring retraining.
MiDl represents the first self-supervised, online solution for handling missing modalities exclusively at test time.
arXiv Detail & Related papers (2024-04-23T16:01:33Z)
- TESSERACT: Eliminating Experimental Bias in Malware Classification across Space and Time (Extended Version) [18.146377453918724]
Malware detectors often experience performance decay due to constantly evolving operating systems and attack methods.
This paper argues that commonly reported results are inflated due to two pervasive sources of experimental bias in the detection task.
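As a toy illustration of removing the temporal bias this paper targets (an assumed simplification, not TESSERACT's actual protocol), a split on timestamps guarantees that every training sample predates every test sample:

```python
import numpy as np

def temporal_split(X, y, timestamps, train_frac=0.8):
    """Temporally consistent split: train strictly on the past, test on
    the future, unlike a random shuffle that lets a detector 'peek' at
    samples drawn from later time periods."""
    order = np.argsort(timestamps)  # oldest first
    cut = int(len(order) * train_frac)
    return X[order[:cut]], X[order[cut:]], y[order[:cut]], y[order[cut:]]
```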
arXiv Detail & Related papers (2024-02-02T12:27:32Z)
- Continual Learning with Pre-Trained Models: A Survey [61.97613090666247]
Continual learning aims to overcome catastrophic forgetting of former knowledge when learning new tasks.
This paper presents a comprehensive survey of the latest advancements in pre-trained model (PTM) based continual learning.
arXiv Detail & Related papers (2024-01-29T18:27:52Z)
- Multivariate Time Series Anomaly Detection: Fancy Algorithms and Flawed Evaluation Methodology [2.043517674271996]
We discuss how a normally good protocol may have weaknesses in the context of MVTS anomaly detection.
We propose a simple, yet challenging, baseline based on Principal Components Analysis (PCA) that surprisingly outperforms many recent Deep Learning (DL) based approaches on popular benchmark datasets.
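A reconstruction-error reading of such a PCA baseline might look like the sketch below; the function and thresholding choice are assumptions for illustration, not the authors' code:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def pca_anomaly_scores(train, test, n_components=5):
    """Score each test timestep of a multivariate series by its
    reconstruction error under a PCA model fit on (assumed mostly
    normal) training data; higher error suggests an anomaly."""
    scaler = StandardScaler().fit(train)
    pca = PCA(n_components=n_components).fit(scaler.transform(train))
    z = scaler.transform(test)
    recon = pca.inverse_transform(pca.transform(z))
    return np.linalg.norm(z - recon, axis=1)
```

Timesteps whose score exceeds, say, a high percentile of the training-set scores would then be flagged as anomalous.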
arXiv Detail & Related papers (2023-08-24T20:24:12Z)
- Federated Unlearning via Active Forgetting [24.060724751342047]
We propose a novel federated unlearning framework based on incremental learning.
Our framework differs from existing federated unlearning methods that rely on approximate retraining or data influence estimation.
arXiv Detail & Related papers (2023-07-07T03:07:26Z)
- NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research [96.53307645791179]
We introduce the Never-Ending VIsual-classification Stream (NEVIS'22), a benchmark consisting of a stream of over 100 visual classification tasks.
Despite being limited to classification, the resulting stream has a rich diversity of tasks, ranging from OCR to texture analysis, scene recognition, and so forth.
Overall, NEVIS'22 poses an unprecedented challenge for current sequential learning approaches due to the scale and diversity of tasks.
arXiv Detail & Related papers (2022-11-15T18:57:46Z)
- Evaluating Machine Unlearning via Epistemic Uncertainty [78.27542864367821]
This work presents an evaluation of Machine Unlearning algorithms based on uncertainty.
To the best of our knowledge, this is the first definition of a general evaluation of machine unlearning.
arXiv Detail & Related papers (2022-08-23T09:37:31Z)
- On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z)
- Just Label What You Need: Fine-Grained Active Selection for Perception and Prediction through Partially Labeled Scenes [78.23907801786827]
We introduce generalizations that make our approach both cost-aware and capable of fine-grained selection of examples through partially labeled scenes.
Our experiments on a real-world, large-scale self-driving dataset suggest that fine-grained selection can improve the performance across perception, prediction, and downstream planning tasks.
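One simple way to picture cost-aware selection (the paper's method is far more fine-grained; the names here are made up) is a greedy rule that ranks candidates by informativeness per unit labeling cost:

```python
def select_within_budget(candidates, budget):
    """candidates: (example_id, info_gain, label_cost) triples.
    Greedily pick by gain-per-cost until the labeling budget runs out."""
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    chosen, spent = [], 0.0
    for example_id, gain, cost in ranked:
        if spent + cost <= budget:
            chosen.append(example_id)
            spent += cost
    return chosen
```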
arXiv Detail & Related papers (2021-04-08T17:57:41Z)
- A survey on domain adaptation theory: learning bounds and theoretical guarantees [17.71634393160982]
The main objective of this survey is to provide an overview of the state-of-the-art theoretical results in a specific, and arguably the most popular, sub-field of transfer learning.
In this sub-field, the data distribution is assumed to change across the training and the test data, while the learning task remains the same.
We provide a first up-to-date description of existing results related to the domain adaptation problem.
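The assumption described here is commonly formalized as covariate shift; in standard notation (not taken from the survey itself), with source domain $S$ and target domain $T$:

```latex
% Covariate shift: the input marginals differ across domains,
% while the labeling function (the task) is shared.
p_S(x) \neq p_T(x), \qquad p_S(y \mid x) = p_T(y \mid x)
```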
arXiv Detail & Related papers (2020-04-24T16:11:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.