Mediated Uncoupled Learning: Learning Functions without Direct
Input-output Correspondences
- URL: http://arxiv.org/abs/2107.08135v1
- Date: Fri, 16 Jul 2021 22:13:29 GMT
- Title: Mediated Uncoupled Learning: Learning Functions without Direct
Input-output Correspondences
- Authors: Ikko Yamane, Junya Honda, Florian Yger, Masashi Sugiyama
- Abstract summary: We consider the task of predicting $Y$ from $X$ when we have no paired data of them.
A naive approach is to predict $U$ from $X$ using $S_X$ and then $Y$ from $U$ using $S_Y$.
We propose a new method that avoids predicting $U$ but directly learns $Y = f(X)$ by training $f(X)$ with $S_X$ to predict $h(U)$.
- Score: 80.95776331769899
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ordinary supervised learning is useful when we have paired training data of
input $X$ and output $Y$. However, such paired data can be difficult to collect
in practice. In this paper, we consider the task of predicting $Y$ from $X$
when we have no paired data of them, but we have two separate, independent
datasets of $X$ and $Y$ each observed with some mediating variable $U$, that
is, we have two datasets $S_X = \{(X_i, U_i)\}$ and $S_Y = \{(U'_j, Y'_j)\}$. A
naive approach is to predict $U$ from $X$ using $S_X$ and then $Y$ from $U$
using $S_Y$, but we show that this is not statistically consistent. Moreover,
predicting $U$ can be more difficult than predicting $Y$ in practice, e.g.,
when $U$ has higher dimensionality. To circumvent the difficulty, we propose a
new method that avoids predicting $U$ but directly learns $Y = f(X)$ by
training $f(X)$ with $S_{X}$ to predict $h(U)$ which is trained with $S_{Y}$ to
approximate $Y$. We prove statistical consistency and error bounds of our
method and experimentally confirm its practical usefulness.
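The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in plain linear least squares for both $h$ and $f$, uses synthetic data, and all names (`fit_linear`, `mediated_two_step`, the mixing matrices `a` and `b`) are hypothetical. It assumes, for illustration only, that $Y$ depends on $X$ solely through $U$.

```python
import numpy as np


def fit_linear(inputs, targets):
    """Least-squares fit with an intercept; returns a prediction function."""
    design = np.column_stack([inputs, np.ones(len(inputs))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return lambda x: np.column_stack([x, np.ones(len(x))]) @ weights


def mediated_two_step(s_x, s_y):
    """Learn f: X -> Y without paired (X, Y) data.

    s_x = (X, U) and s_y = (U', Y') are two separate datasets.
    """
    x, u = s_x
    u_prime, y_prime = s_y
    h = fit_linear(u_prime, y_prime)  # h(U) approximates Y, trained on S_Y
    f = fit_linear(x, h(u))           # f(X) predicts h(U), trained on S_X
    return f


# Synthetic data (assumption): X -> U -> Y with linear maps and small noise.
rng = np.random.default_rng(0)
n = 2000
a = rng.normal(size=(3, 5))  # hypothetical X-to-U map
b = rng.normal(size=5)       # hypothetical U-to-Y map

x = rng.normal(size=(n, 3))
u = x @ a + 0.1 * rng.normal(size=(n, 5))              # S_X = {(X_i, U_i)}

u_prime = rng.normal(size=(n, 3)) @ a + 0.1 * rng.normal(size=(n, 5))
y_prime = u_prime @ b + 0.1 * rng.normal(size=n)       # S_Y = {(U'_j, Y'_j)}

f = mediated_two_step((x, u), (u_prime, y_prime))

# Compare against the noiseless conditional mean of Y given X on fresh inputs.
x_test = rng.normal(size=(500, 3))
y_true_mean = x_test @ a @ b
mse = float(np.mean((f(x_test) - y_true_mean) ** 2))
print(f"test MSE against E[Y|X]: {mse:.5f}")
```

Note that $f$ never outputs a prediction of $U$ itself; it only regresses onto the scalar $h(U)$, which is the point the abstract makes about avoiding a possibly high-dimensional intermediate prediction.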
Related papers
- Transformer In-Context Learning for Categorical Data [51.23121284812406]
We extend research on understanding Transformers through the lens of in-context learning with functional data by considering categorical outcomes, nonlinear underlying models, and nonlinear attention.
We present what is believed to be the first real-world demonstration of this few-shot-learning methodology, using the ImageNet dataset.
arXiv Detail & Related papers (2024-05-27T15:03:21Z)
- Phase Transitions in the Detection of Correlated Databases [12.010807505655238]
We study the problem of detecting the correlation between two Gaussian databases $\mathsf{X}\in\mathbb{R}^{n\times d}$ and $\mathsf{Y}\in\mathbb{R}^{n\times d}$, each composed of $n$ users with $d$ features.
This problem is relevant in the analysis of social media, computational biology, etc.
arXiv Detail & Related papers (2023-02-07T10:39:44Z)
- Blessing of Class Diversity in Pre-training [54.335530406959435]
We prove that when the classes of the pre-training task are sufficiently diverse, pre-training can significantly improve the sample efficiency of downstream tasks.
Our proof relies on a vector-form Rademacher complexity chain rule for composite function classes and a modified self-concordance condition.
arXiv Detail & Related papers (2022-09-07T20:10:12Z)
- Approximate Function Evaluation via Multi-Armed Bandits [51.146684847667125]
We study the problem of estimating the value of a known smooth function $f$ at an unknown point $\boldsymbol{\mu} \in \mathbb{R}^n$, where each component $\mu_i$ can be sampled via a noisy oracle.
We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-\delta$ returns an $\epsilon$-accurate estimate of $f(\boldsymbol{\mu})$.
arXiv Detail & Related papers (2022-03-18T18:50:52Z)
- TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z)
- Agnostic learning with unknown utilities [70.14742836006042]
In many real-world problems, the utility of a decision depends on the underlying context $x$ and decision $y$.
We study this as agnostic learning with unknown utilities.
We show that estimating the utilities of only the sampled points $S$ suffices to learn a decision function which generalizes well.
arXiv Detail & Related papers (2021-04-17T08:22:04Z)
- The Sparse Hausdorff Moment Problem, with Application to Topic Models [5.151973524974052]
We give an algorithm for identifying a $k$-mixture using samples of $m=2k$ iid binary random variables.
It suffices to know the moments to additive accuracy $w_{\min}\cdot\zeta^{O(k)}$.
arXiv Detail & Related papers (2020-07-16T04:23:57Z)
- Faster Uncertainty Quantification for Inverse Problems with Conditional Normalizing Flows [0.9176056742068814]
In inverse problems, we often have data consisting of paired samples $(x,y)\sim p_{X,Y}(x,y)$ where $y$ are partial observations of a physical system.
We propose a two-step scheme, which makes use of normalizing flows and joint data to train a conditional generator $q_\theta(x|y)$.
arXiv Detail & Related papers (2020-07-15T20:36:30Z)
- Semi-Supervised Learning: the Case When Unlabeled Data is Equally Useful [5.045960549713147]
Semi-supervised learning algorithms attempt to take advantage of relatively inexpensive unlabeled data to improve learning performance.
We show that under certain conditions on the distribution, unlabeled data is as useful as labeled data in terms of learning rate.
arXiv Detail & Related papers (2020-05-22T06:05:00Z)
- Learning and Testing Variable Partitions [13.575794982844222]
We show that a partition of cost $\mathcal{O}(k n^2)(\delta + \epsilon)$ can be learned in time $\tilde{\mathcal{O}}(n^2 \mathrm{poly}(1/\epsilon))$ for any $\epsilon > 0$.
We also show that even two-sided testers require $\Omega(n)$ queries when $k = 2$.
arXiv Detail & Related papers (2020-03-29T10:12:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.