Is Transductive Learning Equivalent to PAC Learning?
- URL: http://arxiv.org/abs/2405.05190v2
- Date: Tue, 29 Oct 2024 22:07:12 GMT
- Title: Is Transductive Learning Equivalent to PAC Learning?
- Authors: Shaddin Dughmi, Yusuf Kalayci, Grayson York
- Abstract summary: We show that the PAC and transductive models are essentially equivalent for agnostic binary classification.
We leave as an intriguing open question whether our second result can be extended beyond binary classification to show the transductive and PAC models equivalent more broadly.
- Abstract: Much of learning theory is concerned with the design and analysis of probably approximately correct (PAC) learners. The closely related transductive model of learning has recently seen more scrutiny, with its learners often used as precursors to PAC learners. Our goal in this work is to understand and quantify the exact relationship between these two models. First, we observe that modest extensions of existing results show the models to be essentially equivalent for realizable learning for most natural loss functions, up to low-order terms in the error and sample complexity. The situation for agnostic learning appears less straightforward, with sample complexities potentially separated by a $\frac{1}{\epsilon}$ factor. This is therefore where our main contributions lie. Our results are two-fold:
1. For agnostic learning with bounded losses (including, for example, multiclass classification), we show that PAC learning reduces to transductive learning at the cost of low-order terms in the error and sample complexity, via an adaptation of the reduction of arXiv:2304.09167 to the agnostic setting.
2. For agnostic binary classification, we show the converse: transductive learning is essentially no more difficult than PAC learning. Together with our first result, this implies that the PAC and transductive models are essentially equivalent for agnostic binary classification.
This is our most technical result, and involves two steps: a symmetrization argument on the agnostic one-inclusion graph (OIG) of arXiv:2309.13692 to derive the worst-case agnostic transductive instance, and expressing the error of the agnostic OIG algorithm for this instance in terms of the empirical Rademacher complexity of the class. We leave as an intriguing open question whether our second result can be extended beyond binary classification to show the transductive and PAC models equivalent more broadly.
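To make the key quantity in the second result concrete, here is a minimal Python sketch, not taken from the paper, that Monte-Carlo-estimates the empirical Rademacher complexity of a finite hypothesis class on a fixed sample; the threshold class, sample, and number of draws are all illustrative assumptions.

```python
import numpy as np

def empirical_rademacher(predictions: np.ndarray, num_draws: int = 2000,
                         seed: int = 0) -> float:
    """Monte Carlo estimate of the empirical Rademacher complexity.

    predictions: (|H|, n) matrix whose rows are (h(x_1), ..., h(x_n)) in {-1, +1}
    for each hypothesis h in a finite class H, on a fixed sample x_1, ..., x_n.
    Estimates E_sigma[ sup_{h in H} (1/n) * sum_i sigma_i * h(x_i) ].
    """
    rng = np.random.default_rng(seed)
    _, n = predictions.shape
    total = 0.0
    for _ in range(num_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)   # random Rademacher signs
        total += np.max(predictions @ sigma) / n  # sup over the finite class
    return total / num_draws

# Toy class: 1-D threshold classifiers h_t(x) = sign(x - t) on 10 sample points.
xs = np.linspace(0.0, 1.0, 10)
thresholds = np.linspace(-0.05, 1.05, 12)
H = np.where(xs[None, :] >= thresholds[:, None], 1.0, -1.0)
print(f"estimated empirical Rademacher complexity: {empirical_rademacher(H):.3f}")
```

By Massart's lemma, the estimate for a finite class is at most $\sqrt{2\ln|H|/n}$, which is the scale at which this quantity enters agnostic error bounds.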
Related papers
- Error Exponent in Agnostic PAC Learning [4.772817128620037]
The Probably Approximately Correct (PAC) framework is widely used to analyze learning problems and algorithms.
In this paper, we consider PAC learning using the error exponent, a well-established analysis method in Information Theory.
We find, under some assumptions, an improved distribution dependent error exponent for a wide range of problems, establishing the exponential behavior of the PAC error probability in learning.
arXiv Detail & Related papers (2024-05-01T18:08:03Z)
- Transductive Learning Is Compact [10.168670899305232]
We show a compactness result holding broadly across supervised learning with a general class of loss functions.
For realizable learning with improper metric losses, we show that exact compactness of sample complexity can fail.
We conjecture that larger gaps are possible for the agnostic case.
arXiv Detail & Related papers (2024-02-15T23:10:45Z)
- On Learning Latent Models with Multi-Instance Weak Supervision [57.18649648182171]
We consider a weakly supervised learning scenario where the supervision signal is generated by applying a transition function $\sigma$ to labels associated with multiple input instances.
Our problem is met in different fields, including latent structural learning and neuro-symbolic integration.
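As a concrete (hypothetical) instance of such a transition function, classic multi-instance learning takes $\sigma$ to be a logical OR over the hidden labels of a bag; a minimal sketch:

```python
from typing import Sequence

def sigma_or(hidden_labels: Sequence[int]) -> int:
    """Example transition function: bag label = OR of the hidden instance labels,
    as in classic multi-instance learning. The learner sees only sigma's output."""
    return int(any(hidden_labels))

bag = [0, 0, 1]        # hidden per-instance labels (never observed directly)
print(sigma_or(bag))   # the weak supervision signal for the whole bag -> 1
```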
arXiv Detail & Related papers (2023-06-23T22:05:08Z)
- Learnability, Sample Complexity, and Hypothesis Class Complexity for Regression Models [10.66048003460524]
This work is inspired by the foundations of PAC learning and is motivated by existing issues in regression learning.
The proposed approach, denoted epsilon-Confidence Approximately Correct (epsilon-CoAC), utilizes the Kullback-Leibler divergence (relative entropy).
It enables the learner to compare hypothesis classes of different complexity orders and to choose among them the optimum, i.e., the one with the minimum epsilon.
arXiv Detail & Related papers (2023-03-28T15:59:12Z)
- Agnostic PAC Learning of k-juntas Using L2-Polynomial Regression [9.732863739456036]
We present a new PAC learning algorithm based on the Fourier expansion with lower computational complexity.
We prove our results by connecting the PAC learning with 0-1 loss to the minimum mean square estimation problem.
We derive an elegant upper bound on the 0-1 loss in terms of the MMSE error and show that the sign of the MMSE estimator is a PAC learner for any concept class containing it.
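As a rough illustration of that last point (a toy sketch under assumed data and basis, not the paper's algorithm): fit a least-squares polynomial approximation to the ±1 labels and classify by its sign.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: a 1-junta (the sign of coordinate 0) with 10% agnostic label noise.
n, d = 500, 5
X = rng.choice([-1.0, 1.0], size=(n, d))
clean = np.sign(X[:, 0])
y = np.where(rng.random(n) < 0.1, -clean, clean)

# Degree-1 least-squares fit over the Fourier basis {1, x_1, ..., x_d};
# this is the L2-regression (MMSE-style) step.
Phi = np.hstack([np.ones((n, 1)), X])
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# Classify by the sign of the regression estimate.
y_hat = np.sign(Phi @ w)
print(f"0-1 error of sign(L2 fit) vs noisy labels: {np.mean(y_hat != y):.3f}")
```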
arXiv Detail & Related papers (2023-03-08T19:54:07Z)
- Instance-Dependent Label-Noise Learning with Manifold-Regularized Transition Matrix Estimation [172.81824511381984]
The transition matrix T(x) is unidentifiable under instance-dependent noise (IDN).
We propose an assumption on the geometry of T(x): "the closer two instances are, the more similar their corresponding transition matrices should be."
Our method is superior to state-of-the-art approaches for label-noise learning under the challenging IDN.
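A minimal sketch of what such a geometric regularizer could look like (an assumption-laden illustration, not the paper's estimator): penalize the Frobenius distance between the transition matrices of similar instances.

```python
import numpy as np

def manifold_penalty(T: np.ndarray, X: np.ndarray, bandwidth: float = 1.0) -> float:
    """Illustrative manifold-style regularizer: instances that are close in
    feature space should have similar transition matrices.

    T: (n, c, c) array, T[i] is the estimated transition matrix for instance i.
    X: (n, d) array of instance features.
    """
    n = X.shape[0]
    penalty = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            w = np.exp(-np.sum((X[i] - X[j]) ** 2) / bandwidth)  # similarity weight
            penalty += w * np.sum((T[i] - T[j]) ** 2)            # squared Frobenius gap
    return penalty

# Usage: add `lam * manifold_penalty(T, X)` to the loss when fitting T(x).
```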
arXiv Detail & Related papers (2022-06-06T04:12:01Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
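A minimal sketch of one lightweight OGD variant for pairwise (AUC-style) learning, with a buffering strategy and hinge loss chosen for illustration rather than taken from the paper: pair each arriving example with one random earlier example of the opposite label and take a single gradient step.

```python
import numpy as np

def ogd_pairwise(stream, d: int, eta: float = 0.1, seed: int = 0) -> np.ndarray:
    """Online gradient descent on a pairwise hinge loss: each new example (x, y)
    is paired with one random previously seen example with the opposite label."""
    rng = np.random.default_rng(seed)
    w = np.zeros(d)
    seen: list[tuple[np.ndarray, int]] = []
    for x, y in stream:
        opposite = [(xp, yp) for xp, yp in seen if yp != y]
        if opposite:
            x_prev, _ = opposite[rng.integers(len(opposite))]
            margin = y * (w @ (x - x_prev))  # want positives scored above negatives
            if margin < 1.0:                 # hinge loss is active
                w += eta * y * (x - x_prev)  # gradient step on max(0, 1 - margin)
        seen.append((x, y))
    return w

# Usage with a toy stream of (feature vector, label in {-1, +1}) pairs:
rng = np.random.default_rng(2)
data = [(rng.normal(size=3) + y, y) for y in rng.choice([-1, 1], size=200)]
w = ogd_pairwise(data, d=3)
print("learned weights:", np.round(w, 2))
```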
arXiv Detail & Related papers (2021-11-23T18:10:48Z)
- The Optimality of Polynomial Regression for Agnostic Learning under Gaussian Marginals [47.81107898315438]
We develop a method for finding hard families of examples for a wide class of problems using LP duality.
We show that $L_1$-regression is essentially best possible, and therefore that the computational difficulty of learning a concept class is closely related to the degree required to approximate any function from the class in the $L_1$-norm.
arXiv Detail & Related papers (2021-02-08T18:06:32Z)
- The Complexity of Adversarially Robust Proper Learning of Halfspaces with Agnostic Noise [67.27523616312428]
We study the computational complexity of adversarially robust proper learning of halfspaces in the distribution-independent PAC model.
We give a computationally efficient learning algorithm and a nearly matching computational hardness result for this problem.
arXiv Detail & Related papers (2020-07-30T04:18:51Z)