Derandomizing Multi-Distribution Learning
- URL: http://arxiv.org/abs/2409.17567v1
- Date: Thu, 26 Sep 2024 06:28:56 GMT
- Title: Derandomizing Multi-Distribution Learning
- Authors: Kasper Green Larsen, Omar Montasser, Nikita Zhivotovskiy
- Abstract summary: Multi-distribution learning involves learning a single predictor that works well across multiple data distributions.
Recent research has shown near-optimal sample complexity achieved with oracle efficient algorithms.
This raises the question: can these algorithms be derandomized to produce a deterministic predictor for multiple distributions?
- Score: 28.514129340758938
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-distribution or collaborative learning involves learning a single
predictor that works well across multiple data distributions, using samples
from each during training. Recent research on multi-distribution learning,
focusing on binary loss and finite VC dimension classes, has shown near-optimal
sample complexity that is achieved with oracle efficient algorithms. That is,
these algorithms are computationally efficient given an efficient ERM for the
class. Unlike in classical PAC learning, where the optimal sample complexity is
achieved with deterministic predictors, current multi-distribution learning
algorithms output randomized predictors. This raises the question: can these
algorithms be derandomized to produce a deterministic predictor for multiple
distributions? Through a reduction to discrepancy minimization, we show that
derandomizing multi-distribution learning is computationally hard, even when
ERM is computationally efficient. On the positive side, we identify a
structural condition enabling an efficient black-box reduction, converting
existing randomized multi-distribution predictors into deterministic ones.
Related papers
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Longtailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z) - Collaborative Learning with Different Labeling Functions [7.228285747845779]
We study a variant of Collaborative PAC Learning, in which we aim to learn an accurate classifier for each of the $n$ data distributions.
We show that, when the data distributions satisfy a weaker realizability assumption, sample-efficient learning is still feasible.
arXiv Detail & Related papers (2024-02-16T04:32:22Z) - An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Representation Learning with Multi-Step Inverse Kinematics: An Efficient
and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z) - Quantifying Inherent Randomness in Machine Learning Algorithms [7.591218883378448]
This paper uses an empirical study to examine the effects of randomness in model training and randomness in the partitioning of a dataset into training and test subsets.
We quantify and compare the magnitude of the variation in predictive performance for the following ML algorithms: Random Forests (RFs), Gradient Boosting Machines (GBMs), and Feedforward Neural Networks (FFNNs)
arXiv Detail & Related papers (2022-06-24T15:49:52Z) - Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - Expectation propagation on the diluted Bayesian classifier [0.0]
We introduce a statistical mechanics inspired strategy that addresses the problem of sparse feature selection in the context of binary classification.
A computational scheme known as expectation propagation (EP) is used to train a continuous-weights perceptron learning a classification rule.
EP is a robust and competitive algorithm in terms of variable selection properties, estimation accuracy and computational complexity.
arXiv Detail & Related papers (2020-09-20T23:59:44Z) - Differentially Private ADMM for Convex Distributed Learning: Improved
Accuracy via Multi-Step Approximation [10.742065340992525]
Alternating Direction Method of Multipliers (ADMM) is a popular computation for distributed learning.
When the training data is sensitive, the exchanged iterates will cause serious privacy concern.
We propose a new differentially private distributed ADMM with improved accuracy for a wide range of convex learning problems.
arXiv Detail & Related papers (2020-05-16T07:17:31Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study a distributed variable for large-scale AUC for a neural network as with a deep neural network.
Our model requires a much less number of communication rounds and still a number of communication rounds in theory.
Our experiments on several datasets show the effectiveness of our theory and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.