A Link between Coding Theory and Cross-Validation with Applications
- URL: http://arxiv.org/abs/2103.11856v3
- Date: Fri, 9 Feb 2024 09:48:46 GMT
- Title: A Link between Coding Theory and Cross-Validation with Applications
- Authors: Tapio Pahikkala, Parisa Movahedi, Ileana Montoya, Havu Miikonen,
Stephan Foldes, Antti Airola, Laszlo Major
- Abstract summary: We show that the exact answers are given by the theory of error-detecting codes.
As an immediate practical application, we develop new LPOCV-based randomization tests for learning algorithms.
- Score: 2.287027904771153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How many different binary classification problems can a single
learning algorithm solve on a fixed dataset with exactly zero, or at most a
given number of, cross-validation errors? While the number in the former case
is known to be limited by the no-free-lunch theorem, we show that the exact
answers are given by the theory of error-detecting codes. As a case study, we
focus on the AUC performance measure and leave-pair-out cross-validation
(LPOCV), in which every possible pair of data points with different class
labels is held out at a time. We
show that the maximal number of classification problems with fixed class
proportion, for which a learning algorithm can achieve zero LPOCV error, equals
the maximal number of code words in a constant weight code (CWC), with certain
technical properties. We then generalize CWCs by introducing light CWCs, and
prove an analogous result for nonzero LPOCV errors and light CWCs. Moreover, we
prove both upper and lower bounds on the maximal numbers of code words in light
CWCs. Finally, as an immediate practical application, we develop new
LPOCV-based randomization tests for learning algorithms that generalize the
classical Wilcoxon-Mann-Whitney U test.
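The LPOCV procedure described in the abstract can be sketched in a few lines: every (positive, negative) pair is held out in turn, a model is fitted on the remaining points, and an error is counted whenever the held-out positive is not ranked above the held-out negative. This is only an illustrative sketch, not the paper's construction; the `nearest_mean` scorer and the toy one-dimensional data are hypothetical stand-ins for an arbitrary learning algorithm.

```python
from itertools import product

def lpocv_error(X, y, learner):
    """Leave-pair-out CV: hold out every (positive, negative) pair, fit the
    learner on the rest, and count pairs the model fails to rank correctly."""
    pos = [i for i, t in enumerate(y) if t == 1]
    neg = [i for i, t in enumerate(y) if t == 0]
    errors = 0
    for i, j in product(pos, neg):
        train = [k for k in range(len(y)) if k not in (i, j)]
        score = learner([X[k] for k in train], [y[k] for k in train])
        # Error: the held-out positive is not scored above the negative.
        if score(X[i]) <= score(X[j]):
            errors += 1
    return errors, len(pos) * len(neg)

def nearest_mean(X_train, y_train):
    """Toy scorer for 1-D data: score toward the positive-class mean."""
    n_pos = sum(y_train)
    mu1 = sum(x for x, t in zip(X_train, y_train) if t == 1) / n_pos
    mu0 = sum(x for x, t in zip(X_train, y_train) if t == 0) / (len(y_train) - n_pos)
    sign = 1 if mu1 > mu0 else -1
    return lambda x: sign * x

X = [0.1, 0.4, 0.35, 0.8, 0.9, 0.7]
y = [0, 0, 0, 1, 1, 1]
errs, pairs = lpocv_error(X, y, nearest_mean)
auc_estimate = 1 - errs / pairs
```

One minus the pairwise error rate is the LPOCV estimate of AUC, which is why randomization tests built on it generalize the Wilcoxon-Mann-Whitney U statistic.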
Related papers
- Simple and Provable Scaling Laws for the Test-Time Compute of Large Language Models [70.07661254213181]
We propose two principled algorithms for the test-time compute of large language models.
We prove theoretically that the failure probability of one algorithm decays to zero exponentially as its test-time compute grows.
arXiv Detail & Related papers (2024-11-29T05:29:47Z) - Limits to classification performance by relating Kullback-Leibler
divergence to Cohen's Kappa [0.0]
Theory and methods are discussed in detail and then applied to Monte Carlo data and real datasets.
In all cases this analysis shows that the algorithms could not have performed any better due to the underlying probability density functions for the two classes.
arXiv Detail & Related papers (2024-03-03T17:36:42Z) - A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation [121.0693322732454]
Contrastive Language-Image Pretraining (CLIP) has gained popularity for its remarkable zero-shot capacity.
Recent research has focused on developing efficient fine-tuning methods to enhance CLIP's performance in downstream tasks.
We revisit a classical algorithm, Gaussian Discriminant Analysis (GDA), and apply it to the downstream classification of CLIP.
arXiv Detail & Related papers (2024-02-06T15:45:27Z) - Understanding and Mitigating Classification Errors Through Interpretable
Token Patterns [58.91023283103762]
Characterizing errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors.
We propose to discover those patterns of tokens that distinguish correct and erroneous predictions.
We show that our method, Premise, performs well in practice.
arXiv Detail & Related papers (2023-11-18T00:24:26Z) - Hierarchical Semi-Supervised Contrastive Learning for
Contamination-Resistant Anomaly Detection [81.07346419422605]
Anomaly detection aims at identifying deviant samples from the normal data distribution.
Contrastive learning has provided a successful way to learn sample representations that enable effective discrimination of anomalies.
We propose a novel hierarchical semi-supervised contrastive learning framework, for contamination-resistant anomaly detection.
arXiv Detail & Related papers (2022-07-24T18:49:26Z) - Error-rate-agnostic decoding of topological stabilizer codes [0.0]
We develop a decoder that depends on the bias, i.e., the relative probability of phase-flip to bit-flip errors, but is agnostic to error rate.
Our decoder is based on counting the number and effective weight of the most likely error chains in each equivalence class of a given syndrome.
arXiv Detail & Related papers (2021-12-03T15:45:12Z) - Information-Theoretic Generalization Bounds for Iterative
Semi-Supervised Learning [81.1071978288003]
In particular, we seek to understand the behaviour of the generalization error of iterative SSL algorithms using information-theoretic principles.
Our theoretical results suggest that when the class conditional variances are not too large, the upper bound on the generalization error decreases monotonically with the number of iterations, but quickly saturates.
arXiv Detail & Related papers (2021-10-03T05:38:49Z) - Ensemble Learning using Error Correcting Output Codes: New
Classification Error Bounds [2.0242396022517752]
We present new bounds on classification error rates for the error-correcting output code (ECOC) approach in machine learning.
These bounds have exponential decay complexity with respect to codeword length and theoretically validate the effectiveness of the ECOC approach.
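The ECOC idea summarized above can be illustrated with nearest-codeword decoding: each class gets a binary codeword, one bit per base binary classifier, and a test point is assigned to the class whose codeword is closest in Hamming distance to the vector of classifier outputs. A minimal sketch with a hypothetical three-class codebook (not taken from the paper):

```python
# Hypothetical 3-class codebook with minimum pairwise Hamming distance 3,
# so any single base-classifier error is corrected.
CODEBOOK = {
    "A": (0, 0, 0, 0, 0),
    "B": (1, 1, 1, 0, 0),
    "C": (0, 0, 1, 1, 1),
}

def hamming(u, v):
    """Number of positions in which two bit vectors differ."""
    return sum(a != b for a, b in zip(u, v))

def decode(bits):
    """Assign the class whose codeword is nearest to the observed outputs."""
    return min(CODEBOOK, key=lambda c: hamming(CODEBOOK[c], bits))

# decode((0, 1, 1, 0, 0)) recovers "B" despite one flipped bit.
```

Longer codewords allow larger minimum distances, which is the mechanism behind error bounds that decay exponentially in codeword length.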
arXiv Detail & Related papers (2021-09-18T16:47:57Z) - CCMN: A General Framework for Learning with Class-Conditional
Multi-Label Noise [40.46921277898713]
Class-conditional noise commonly exists in machine learning tasks, where the class label is corrupted with a probability depending on its ground-truth value.
In this paper, we formalize this problem as a general framework of learning with Class-Conditional Multi-label Noise (CCMN for short).
We establish two unbiased estimators with error bounds for solving the CCMN problems, and prove that they are consistent with commonly used multi-label loss functions.
arXiv Detail & Related papers (2021-05-16T03:24:15Z) - Theoretical Insights Into Multiclass Classification: A High-dimensional
Asymptotic View [82.80085730891126]
We provide the first modern, precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z) - Generalized Zero-Shot Learning Via Over-Complete Distribution [79.5140590952889]
We propose to generate an Over-Complete Distribution (OCD) using Conditional Variational Autoencoder (CVAE) of both seen and unseen classes.
The effectiveness of the framework is evaluated using both Zero-Shot Learning and Generalized Zero-Shot Learning protocols.
arXiv Detail & Related papers (2020-04-01T19:05:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.