A Link between Coding Theory and Cross-Validation with Applications
- URL: http://arxiv.org/abs/2103.11856v3
- Date: Fri, 9 Feb 2024 09:48:46 GMT
- Title: A Link between Coding Theory and Cross-Validation with Applications
- Authors: Tapio Pahikkala, Parisa Movahedi, Ileana Montoya, Havu Miikonen,
Stephan Foldes, Antti Airola, Laszlo Major
- Abstract summary: We show that the exact answers are given by the theory of error-detecting codes.
As an immediate practical application, we develop new LPOCV-based randomization tests for learning algorithms.
- Score: 2.287027904771153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How many different binary classification problems can a single
learning algorithm solve on a fixed dataset with exactly zero, or at most a
given number of, cross-validation errors? While the number in the former case
is known to be limited by the no-free-lunch theorem, we show that the exact
answers are given by the theory of error-detecting codes. As a case study, we
focus on the AUC performance measure and leave-pair-out cross-validation
(LPOCV), in which every possible pair of data points with different class
labels is held out at a time. We
show that the maximal number of classification problems with fixed class
proportion, for which a learning algorithm can achieve zero LPOCV error, equals
the maximal number of code words in a constant weight code (CWC), with certain
technical properties. We then generalize CWCs by introducing light CWCs, and
prove an analogous result for nonzero LPOCV errors and light CWCs. Moreover, we
prove both upper and lower bounds on the maximal numbers of code words in light
CWCs. Finally, as an immediate practical application, we develop new
LPOCV-based randomization tests for learning algorithms that generalize the
classical Wilcoxon-Mann-Whitney U test.
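The LPOCV procedure described in the abstract can be sketched in a few lines: every (positive, negative) pair is held out in turn, a model is fitted on the remaining points, and an error is counted whenever the held-out positive is not ranked above the held-out negative. This is only an illustrative sketch, not the paper's construction; the `nearest_mean` scorer and the toy one-dimensional data are hypothetical stand-ins for an arbitrary learning algorithm.

```python
from itertools import product

def lpocv_error(X, y, learner):
    """Leave-pair-out CV: hold out every (positive, negative) pair, fit the
    learner on the rest, and count pairs the model fails to rank correctly."""
    pos = [i for i, t in enumerate(y) if t == 1]
    neg = [i for i, t in enumerate(y) if t == 0]
    errors = 0
    for i, j in product(pos, neg):
        train = [k for k in range(len(y)) if k not in (i, j)]
        score = learner([X[k] for k in train], [y[k] for k in train])
        # Error: the held-out positive is not scored above the negative.
        if score(X[i]) <= score(X[j]):
            errors += 1
    return errors, len(pos) * len(neg)

def nearest_mean(X_train, y_train):
    """Toy scorer for 1-D data: score toward the positive-class mean."""
    n_pos = sum(y_train)
    mu1 = sum(x for x, t in zip(X_train, y_train) if t == 1) / n_pos
    mu0 = sum(x for x, t in zip(X_train, y_train) if t == 0) / (len(y_train) - n_pos)
    sign = 1 if mu1 > mu0 else -1
    return lambda x: sign * x

X = [0.1, 0.4, 0.35, 0.8, 0.9, 0.7]
y = [0, 0, 0, 1, 1, 1]
errs, pairs = lpocv_error(X, y, nearest_mean)
auc_estimate = 1 - errs / pairs
```

One minus the pairwise error rate is the LPOCV estimate of AUC, which is why randomization tests built on it generalize the Wilcoxon-Mann-Whitney U statistic.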
Related papers
- Simple and Provable Scaling Laws for the Test-Time Compute of Large Language Models [70.07661254213181]
We propose two principled algorithms for the test-time compute of large language models.
We prove theoretically that the failure probability of one algorithm decays to zero exponentially as its test-time compute grows.
arXiv Detail & Related papers (2024-11-29T05:29:47Z) - Limits to classification performance by relating Kullback-Leibler
divergence to Cohen's Kappa [0.0]
Theory and methods are discussed in detail and then applied to Monte Carlo data and real datasets.
In all cases this analysis shows that the algorithms could not have performed any better due to the underlying probability density functions for the two classes.
arXiv Detail & Related papers (2024-03-03T17:36:42Z) - A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation [121.0693322732454]
Contrastive Language-Image Pretraining (CLIP) has gained popularity for its remarkable zero-shot capacity.
Recent research has focused on developing efficient fine-tuning methods to enhance CLIP's performance in downstream tasks.
We revisit a classical algorithm, Gaussian Discriminant Analysis (GDA), and apply it to the downstream classification of CLIP.
arXiv Detail & Related papers (2024-02-06T15:45:27Z) - Understanding and Mitigating Classification Errors Through Interpretable
Token Patterns [58.91023283103762]
Characterizing errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors.
We propose to discover those patterns of tokens that distinguish correct and erroneous predictions.
We show that our method, Premise, performs well in practice.
arXiv Detail & Related papers (2023-11-18T00:24:26Z) - Hierarchical Semi-Supervised Contrastive Learning for
Contamination-Resistant Anomaly Detection [81.07346419422605]
Anomaly detection aims at identifying deviant samples from the normal data distribution.
Contrastive learning has provided a successful way to learn sample representations that enable effective discrimination of anomalies.
We propose a novel hierarchical semi-supervised contrastive learning framework, for contamination-resistant anomaly detection.
arXiv Detail & Related papers (2022-07-24T18:49:26Z) - Error-rate-agnostic decoding of topological stabilizer codes [0.0]
We develop a decoder that depends on the bias, i.e., the relative probability of phase-flip to bit-flip errors, but is agnostic to error rate.
Our decoder is based on counting the number and effective weight of the most likely error chains in each equivalence class of a given syndrome.
arXiv Detail & Related papers (2021-12-03T15:45:12Z) - Information-Theoretic Generalization Bounds for Iterative
Semi-Supervised Learning [81.1071978288003]
In particular, we seek to understand the behaviour of the generalization error of iterative SSL algorithms using information-theoretic principles.
Our theoretical results suggest that when the class conditional variances are not too large, the upper bound on the generalization error decreases monotonically with the number of iterations, but quickly saturates.
arXiv Detail & Related papers (2021-10-03T05:38:49Z) - Ensemble Learning using Error Correcting Output Codes: New
Classification Error Bounds [2.0242396022517752]
We present new bounds on classification error rates for the error-correcting output code (ECOC) approach in machine learning.
These bounds have exponential decay complexity with respect to codeword length and theoretically validate the effectiveness of the ECOC approach.
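The ECOC idea summarized above can be illustrated with nearest-codeword decoding: each class gets a binary codeword, one bit per base binary classifier, and a test point is assigned to the class whose codeword is closest in Hamming distance to the vector of classifier outputs. A minimal sketch with a hypothetical three-class codebook (not taken from the paper):

```python
# Hypothetical 3-class codebook with minimum pairwise Hamming distance 3,
# so any single base-classifier error is corrected.
CODEBOOK = {
    "A": (0, 0, 0, 0, 0),
    "B": (1, 1, 1, 0, 0),
    "C": (0, 0, 1, 1, 1),
}

def hamming(u, v):
    """Number of positions in which two bit vectors differ."""
    return sum(a != b for a, b in zip(u, v))

def decode(bits):
    """Assign the class whose codeword is nearest to the observed outputs."""
    return min(CODEBOOK, key=lambda c: hamming(CODEBOOK[c], bits))

# decode((0, 1, 1, 0, 0)) recovers "B" despite one flipped bit.
```

Longer codewords allow larger minimum distances, which is the mechanism behind error bounds that decay exponentially in codeword length.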
arXiv Detail & Related papers (2021-09-18T16:47:57Z) - CCMN: A General Framework for Learning with Class-Conditional
Multi-Label Noise [40.46921277898713]
Class-conditional noise commonly exists in machine learning tasks, where the class label is corrupted with a probability depending on its ground-truth value.
In this paper, we formalize this problem as a general framework of learning with Class-Conditional Multi-label Noise (CCMN for short).
We establish two unbiased estimators with error bounds for solving the CCMN problems, and prove that they are consistent with commonly used multi-label loss functions.
arXiv Detail & Related papers (2021-05-16T03:24:15Z) - Theoretical Insights Into Multiclass Classification: A High-dimensional
Asymptotic View [82.80085730891126]
We provide the first modern, precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z) - Generalized Zero-Shot Learning Via Over-Complete Distribution [79.5140590952889]
We propose to generate an Over-Complete Distribution (OCD) using Conditional Variational Autoencoder (CVAE) of both seen and unseen classes.
The effectiveness of the framework is evaluated using both Zero-Shot Learning and Generalized Zero-Shot Learning protocols.
arXiv Detail & Related papers (2020-04-01T19:05:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.