A Distribution-Dependent Analysis of Meta-Learning
- URL: http://arxiv.org/abs/2011.00344v3
- Date: Mon, 14 Jun 2021 03:38:06 GMT
- Title: A Distribution-Dependent Analysis of Meta-Learning
- Authors: Mikhail Konobeev, Ilja Kuzborskij, Csaba Szepesvári
- Abstract summary: A key problem in the theory of meta-learning is to understand how the task distributions influence transfer risk.
In this paper, we give distribution-dependent lower bounds on the transfer risk of any algorithm.
We show that a novel, weighted version of the so-called biased regularized regression method is able to match these lower bounds up to a fixed constant factor.
- Score: 13.24264919706183
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A key problem in the theory of meta-learning is to understand how the task
distributions influence transfer risk, the expected error of a meta-learner on
a new task drawn from the unknown task distribution. In this paper, focusing on
fixed design linear regression with Gaussian noise and a Gaussian task (or
parameter) distribution, we give distribution-dependent lower bounds on the
transfer risk of any algorithm, while we also show that a novel, weighted
version of the so-called biased regularized regression method is able to match
these lower bounds up to a fixed constant factor. Notably, the weighting is
derived from the covariance of the Gaussian task distribution. Altogether, our
results provide a precise characterization of the difficulty of meta-learning
in this Gaussian setting. While this problem setting may appear simple, we show
that it is rich enough to unify the "parameter sharing" and "representation
learning" streams of meta-learning; in particular, representation learning is
obtained as the special case when the covariance matrix of the task
distribution is unknown. For this case we propose to adopt the EM method, which
is shown to enjoy efficient updates in our case. The paper is completed by an
empirical study of EM. In particular, our experimental results show that the EM
algorithm can attain the lower bound as the number of tasks grows, while the
algorithm is also successful in competing with its alternatives when used in a
representation learning context.
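The abstract does not spell out the matching estimator, but under the stated Gaussian model one natural reading of the weighted biased regularized regression method is ridge regression biased toward the task-distribution mean, with the penalty weighted by the inverse task covariance; under this model that estimator is the posterior mean of the new task's parameter. The sketch below is ours, not the paper's code, and assumes the mean theta0, the covariance Sigma, and the noise variance are known (when Sigma is unknown, the paper estimates it via EM).
```python
import numpy as np

def weighted_biased_ridge(X, y, theta0, Sigma, noise_var):
    """Hedged sketch of weighted biased regularized regression.

    Solves
        argmin_theta ||y - X @ theta||^2
                     + noise_var * (theta - theta0)' inv(Sigma) (theta - theta0),
    i.e. least squares biased toward the task-distribution mean theta0,
    with the regularizer weighted by the task covariance Sigma. Under the
    Gaussian model this is the posterior mean of the new task's parameter.
    """
    Sigma_inv = np.linalg.inv(Sigma)
    A = X.T @ X + noise_var * Sigma_inv            # regularized Gram matrix
    b = X.T @ y + noise_var * Sigma_inv @ theta0   # bias term pulls toward theta0
    return np.linalg.solve(A, b)

# Toy usage: one new task drawn from the (assumed known) task distribution.
rng = np.random.default_rng(0)
d, n = 5, 20
theta0 = rng.normal(size=d)                        # task-distribution mean
Sigma = np.diag(rng.uniform(0.1, 1.0, size=d))     # task covariance
theta = rng.multivariate_normal(theta0, Sigma)     # new task's parameter
X = rng.normal(size=(n, d))                        # fixed design
y = X @ theta + 0.1 * rng.normal(size=n)           # Gaussian observation noise
theta_hat = weighted_biased_ridge(X, y, theta0, Sigma, noise_var=0.01)
```
Taking Sigma proportional to the identity recovers ordinary biased ridge regression toward theta0; the Sigma-dependent weighting is what the abstract highlights as the ingredient that matches the lower bounds.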
Related papers
- Reducing Variance in Meta-Learning via Laplace Approximation for Regression Tasks [23.33263252557512]
We address the problem of variance reduction in gradient-based meta-learning.
We propose a novel approach that reduces the variance of the gradient estimate by weighting each support point individually.
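The summary gives only the high-level idea; the following is a minimal, hypothetical sketch of a per-support-point weighted gradient estimate. The weights are placeholders here: the paper derives them via a Laplace approximation, which is not reproduced.
```python
import numpy as np

def weighted_gradient(per_point_grads, weights):
    """Combine per-support-point gradients with individual weights.

    per_point_grads: (n, d) array, one gradient per support point.
    weights: (n,) nonnegative weights (placeholders for the paper's
    Laplace-derived weights); uniform weights recover the usual mean.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                        # normalize the weights to sum to one
    return w @ np.asarray(per_point_grads)
```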
arXiv Detail & Related papers (2024-10-02T12:30:05Z)
- Learning Latent Graph Structures and their Uncertainty [63.95971478893842]
Graph Neural Networks (GNNs) use relational information as an inductive bias to enhance the model's accuracy.
As task-relevant relations might be unknown, graph structure learning approaches have been proposed to learn them while solving the downstream prediction task.
arXiv Detail & Related papers (2024-05-30T10:49:22Z)
- Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning [55.75959755058356]
In deep reinforcement learning, estimating the value function is essential to evaluate the quality of states and actions.
A recent study suggested that the error distribution for training the value function is often skewed because of the properties of the Bellman operator.
We propose a method called Symmetric Q-learning, in which synthetic noise generated from a zero-mean distribution is added to the target values to produce a Gaussian error distribution.
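A minimal sketch of the stated mechanism, namely zero-mean synthetic noise added to the TD targets; the Gaussian stand-in for the noise distribution is our assumption, since the paper chooses the noise so that the resulting error distribution becomes Gaussian.
```python
import numpy as np

def symmetric_targets(rewards, next_q_max, gamma, noise_scale, rng):
    """TD targets with zero-mean synthetic noise added (hedged sketch)."""
    targets = rewards + gamma * next_q_max                     # standard TD targets
    noise = rng.normal(0.0, noise_scale, size=targets.shape)   # zero-mean noise
    return targets + noise
```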
arXiv Detail & Related papers (2024-03-12T14:49:19Z)
- Revisiting the Robustness of the Minimum Error Entropy Criterion: A Transfer Learning Case Study [16.07380451502911]
This paper revisits the robustness of the minimum error entropy criterion in dealing with non-Gaussian noise.
We investigate its feasibility and usefulness in real-life transfer learning regression tasks, where distributional shifts are common.
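For context, a common instantiation of the minimum error entropy criterion (from the information-theoretic learning literature, not taken from this paper) minimizes Rényi's quadratic entropy of the prediction errors e_i = y_i - f(x_i) under a kernel density estimate:
```latex
% kappa_sigma is a smoothing kernel (e.g., Gaussian) of bandwidth sigma.
\hat{H}_2(e) = -\log \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n}
               \kappa_\sigma(e_i - e_j),
\qquad \min_f \ \hat{H}_2\bigl(y - f(x)\bigr).
```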
arXiv Detail & Related papers (2023-07-17T15:38:11Z)
- Multi-Environment Meta-Learning in Stochastic Linear Bandits [49.387421094105136]
We consider the feasibility of meta-learning when task parameters are drawn from a mixture distribution instead of a single environment.
We propose a regularized version of the OFUL algorithm that achieves low regret on a new task without requiring knowledge of the environment from which the new task originates.
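The summary does not specify the regularizer; purely as an illustrative sketch, biased regularization in a linear-bandit least-squares estimator can penalize distance to a learned bias vector, echoing the biased regularized regression of the main paper:
```python
import numpy as np

def biased_ridge_estimate(X, r, bias, lam):
    """argmin_theta ||X @ theta - r||^2 + lam * ||theta - bias||^2.

    Hedged sketch only: the paper's exact algorithm and its confidence
    sets are not reproduced here.
    """
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    return np.linalg.solve(A, X.T @ r + lam * bias)
```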
arXiv Detail & Related papers (2022-05-12T19:31:28Z)
- KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem and often needed for real-world applications.
A common approach in the domain adaptation literature is to learn a representation of the input that has the same distributions over the source and the target domain.
We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples.
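A minimal sketch of estimating a KL term from minibatch samples when the representation network outputs an explicit density; the diagonal-Gaussian parameterization and the function names are our assumptions, not the paper's interface.
```python
import numpy as np

def gaussian_logpdf(z, mu, log_var):
    """Log density of a diagonal Gaussian, summed over dimensions."""
    return -0.5 * np.sum(
        log_var + (z - mu) ** 2 / np.exp(log_var) + np.log(2 * np.pi), axis=-1
    )

def minibatch_kl(z_src, mu_src, lv_src, mu_tgt, lv_tgt):
    """Monte Carlo estimate of KL(p_src || p_tgt) over representations.

    z_src: minibatch of samples drawn from the source representation
    distribution. With explicit densities on both sides, the KL term is
    the minibatch mean of log p_src(z) - log p_tgt(z).
    """
    return np.mean(
        gaussian_logpdf(z_src, mu_src, lv_src) - gaussian_logpdf(z_src, mu_tgt, lv_tgt)
    )
```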
arXiv Detail & Related papers (2021-06-14T22:24:23Z)
- Energy-Efficient and Federated Meta-Learning via Projected Stochastic Gradient Ascent [79.58680275615752]
We propose an energy-efficient federated meta-learning framework.
We assume each task is owned by a separate agent, so only a limited number of tasks is available to train the meta-model.
arXiv Detail & Related papers (2021-05-31T08:15:44Z)
- Minimum-Delay Adaptation in Non-Stationary Reinforcement Learning via Online High-Confidence Change-Point Detection [7.685002911021767]
We introduce an algorithm that efficiently learns policies in non-stationary environments.
It analyzes a possibly infinite stream of data and computes, in real-time, high-confidence change-point detection statistics.
We show that this algorithm minimizes the delay until unforeseen changes to a context are detected, thereby allowing for rapid responses.
arXiv Detail & Related papers (2021-05-20T01:57:52Z)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
- Meta Learning as Bayes Risk Minimization [18.76745359031975]
We use a probabilistic framework to formalize what it means for two tasks to be related.
In our formulation, the BRM optimal solution is given by the predictive distribution computed from the posterior distribution of the task-specific latent variable conditioned on the contextual dataset.
We show that our approximation of the posterior distributions converges to the maximum likelihood estimate with the same rate as the true posterior distribution.
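In symbols (notation ours), with task-specific latent variable z and contextual dataset D, the BRM-optimal solution described above is the posterior predictive distribution:
```latex
p(y \mid x, \mathcal{D}) \;=\; \int p(y \mid x, z)\, p(z \mid \mathcal{D})\, \mathrm{d}z .
```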
arXiv Detail & Related papers (2020-06-02T09:38:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.