Learning with Subset Stacking
- URL: http://arxiv.org/abs/2112.06251v3
- Date: Mon, 30 Oct 2023 12:56:34 GMT
- Title: Learning with Subset Stacking
- Authors: S. İlker Birbil, Sinan Yildirim, Kaya Gökalp, M. Hakan Akyüz
- Abstract summary: We propose a new regression algorithm that learns from a set of input-output pairs.
We call this algorithm "LEarning with Subset Stacking" or LESS, due to its resemblance to the method of stacking regressors.
- Score: 0.40964539027092906
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a new regression algorithm that learns from a set of input-output
pairs. Our algorithm is designed for populations where the relation between the
input variables and the output variable exhibits a heterogeneous behavior
across the predictor space. The algorithm starts with generating subsets that
are concentrated around random points in the input space. This is followed by
training a local predictor for each subset. Those predictors are then combined
in a novel way to yield an overall predictor. We call this algorithm "LEarning
with Subset Stacking" or LESS, due to its resemblance to the method of
stacking regressors. We compare the testing performance of LESS with
state-of-the-art methods on several datasets. Our comparison shows that LESS is
a competitive supervised learning method. Moreover, we observe that LESS is
also efficient in terms of computation time and it allows a straightforward
parallel implementation.
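The abstract maps onto a short pipeline: sample anchor points in the input space, form subsets of nearby samples, fit a local regressor on each subset, and combine the local predictions with a global, stacking-style learner. The sketch below is a minimal, hedged illustration of that idea in Python with scikit-learn; the choice of local and global learners, the neighborhood rule, and all function and parameter names are simplifying assumptions, not the authors' reference implementation.

```python
# A minimal sketch of the LESS idea (illustrative, not the reference implementation):
# 1) pick random anchor points, 2) form subsets of nearby samples,
# 3) fit a local regressor on each subset, 4) stack the local predictions
# with a global regressor. Distance-based weighting and other details of the
# paper are simplified away here.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def fit_less(X, y, n_subsets=20, subset_frac=0.1):
    n = len(X)
    k = max(2, int(subset_frac * n))
    anchors = X[rng.choice(n, size=n_subsets, replace=False)]
    local_models = []
    for a in anchors:
        idx = np.argsort(np.linalg.norm(X - a, axis=1))[:k]  # samples nearest the anchor
        local_models.append(DecisionTreeRegressor().fit(X[idx], y[idx]))
    # Stacking step: a global model learns how to combine the local predictions.
    Z = np.column_stack([m.predict(X) for m in local_models])
    global_model = LinearRegression().fit(Z, y)
    return local_models, global_model

def predict_less(models, X):
    local_models, global_model = models
    Z = np.column_stack([m.predict(X) for m in local_models])
    return global_model.predict(Z)

# Toy usage on a heterogeneous one-dimensional relation.
X = rng.uniform(-3, 3, size=(500, 1))
y = np.where(X[:, 0] < 0, np.sin(3 * X[:, 0]), 0.5 * X[:, 0] ** 2) + 0.1 * rng.normal(size=500)
models = fit_less(X, y)
print(predict_less(models, X[:5]))
```

Because each local predictor is trained on its own subset, the fitting loop parallelizes naturally, which matches the abstract's remark about a straightforward parallel implementation.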
Related papers
- Derandomizing Multi-Distribution Learning [28.514129340758938]
Multi-distribution learning involves learning a single predictor that works well across multiple data distributions.
Recent research has shown near-optimal sample complexity achieved with oracle efficient algorithms.
This raises the question: can these algorithms be derandomized to produce a deterministic predictor for multiple distributions?
arXiv Detail & Related papers (2024-09-26T06:28:56Z)
- Efficient Discrepancy Testing for Learning with Distribution Shift [17.472049019016524]
We provide the first set of provably efficient algorithms for testing localized discrepancy distance.
Results imply a broad set of new, efficient learning algorithms in the recently introduced model of Testable Learning with Distribution Shift.
arXiv Detail & Related papers (2024-06-13T17:51:10Z)
- A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation [121.0693322732454]
Contrastive Language-Image Pretraining (CLIP) has gained popularity for its remarkable zero-shot capacity.
Recent research has focused on developing efficient fine-tuning methods to enhance CLIP's performance in downstream tasks.
We revisit a classical algorithm, Gaussian Discriminant Analysis (GDA), and apply it to the downstream classification of CLIP.
arXiv Detail & Related papers (2024-02-06T15:45:27Z)
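As a rough illustration of the GDA idea in the entry above: with pre-extracted image features and labels, one fits per-class means and a shared covariance and classifies with the resulting linear discriminant. The sketch below uses scikit-learn's `LinearDiscriminantAnalysis` on synthetic feature vectors as a stand-in for cached CLIP embeddings; it is a hedged sketch of GDA itself, not the paper's full training-free adaptation recipe.

```python
# Gaussian Discriminant Analysis with a shared covariance is exactly linear
# discriminant analysis. Here it is applied to generic feature vectors that
# stand in for cached CLIP image embeddings (an illustrative assumption).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Pretend these are L2-normalized CLIP features for a 3-class problem.
d, n_per_class = 64, 100
class_means = rng.normal(size=(3, d))
feats = np.vstack([m + 0.5 * rng.normal(size=(n_per_class, d)) for m in class_means])
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
labels = np.repeat(np.arange(3), n_per_class)

clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto").fit(feats, labels)
print("train accuracy:", clf.score(feats, labels))
```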
- Distributive Pre-Training of Generative Modeling Using Matrix-Product States [0.0]
We consider an alternative training scheme utilizing basic tensor network operations, e.g., summation and compression.
The training algorithm is based on compressing the superposition state constructed from all the training data in product state representation.
We benchmark the algorithm on the MNIST dataset and show reasonable results for generating new images and classification tasks.
arXiv Detail & Related papers (2023-06-26T15:46:08Z)
- Efficient and Local Parallel Random Walks [21.29022677162416]
Random walks are a fundamental primitive used in many machine learning algorithms.
We present a new algorithm that overcomes the limitations of prior approaches by building random walks efficiently and locally.
We show that our technique is both memory and round efficient, and in particular yields an efficient parallel local clustering algorithm.
arXiv Detail & Related papers (2021-12-01T17:06:11Z)
- Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
arXiv Detail & Related papers (2021-11-23T18:10:48Z)
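To make the pairwise setting in the entry above concrete, the sketch below runs plain online gradient descent on a pairwise logistic (AUC-style) loss, pairing each incoming example with a buffered example of the other class. It is a generic illustration under stated assumptions, not the specific algorithms or guarantees proposed in the paper.

```python
# Minimal online gradient descent for pairwise learning (illustrative only).
# Loss on a pair (x_pos, x_neg): log(1 + exp(-(w·x_pos - w·x_neg))),
# a standard AUC-style pairwise objective.
import numpy as np

rng = np.random.default_rng(0)
d, T, lr = 10, 2000, 0.1
w_true = rng.normal(size=d)
w = np.zeros(d)
buffer_pos, buffer_neg = [], []

for t in range(T):
    x = rng.normal(size=d)
    label = 1 if x @ w_true > 0 else 0
    (buffer_pos if label == 1 else buffer_neg).append(x)
    if buffer_pos and buffer_neg:
        # Pair the newest example with a random earlier example of the other class.
        xp = buffer_pos[-1] if label == 1 else buffer_pos[rng.integers(len(buffer_pos))]
        xn = buffer_neg[-1] if label == 0 else buffer_neg[rng.integers(len(buffer_neg))]
        margin = w @ (xp - xn)
        grad = -(xp - xn) / (1.0 + np.exp(margin))  # gradient of log(1 + exp(-margin))
        w -= lr * grad

print("alignment with true direction:",
      float(w @ w_true / (np.linalg.norm(w) * np.linalg.norm(w_true) + 1e-12)))
```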
- Predict then Interpolate: A Simple Algorithm to Learn Stable Classifiers [59.06169363181417]
Predict then Interpolate (PI) is an algorithm for learning correlations that are stable across environments.
We prove that by interpolating the distributions of the correct predictions and the wrong predictions, we can uncover an oracle distribution where the unstable correlation vanishes.
arXiv Detail & Related papers (2021-05-26T15:37:48Z)
- Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
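For reference, the statistical leverage scores of a matrix A are the diagonal entries of the orthogonal projector onto A's column space, i.e. the squared row norms of any orthonormal basis for range(A). The snippet below computes them exactly via a thin SVD for a small dense matrix; the paper's contribution is to approximate them far more cheaply for large or sparse matrices of arbitrary rank using rank-revealing factorizations and randomized dimensionality reduction, which this snippet does not implement.

```python
# Exact leverage scores via a thin SVD (the baseline that randomized methods approximate).
# leverage_i = || U[i, :] ||^2, where the columns of U span the column space of A.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 30))  # rank-deficient 200 x 30 matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(s > s[0] * 1e-12))          # numerical rank
leverage = np.sum(U[:, :r] ** 2, axis=1)   # diagonal of the projector A A^+

print("rank:", r)
print("sum of leverage scores (equals the rank):", leverage.sum())
```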
- Alleviate Exposure Bias in Sequence Prediction with Recurrent Neural Networks [47.52214243454995]
A popular strategy for training recurrent neural networks (RNNs) is teacher forcing, which feeds the ground truth as input at each time step.
We propose a fully differentiable training algorithm for RNNs to better capture long-term dependencies.
arXiv Detail & Related papers (2021-03-22T06:15:22Z)
- Meta-learning with Stochastic Linear Bandits [120.43000970418939]
We consider a class of bandit algorithms that implement a regularized version of the well-known OFUL algorithm, where the regularization is a squared Euclidean distance to a bias vector.
We show both theoretically and experimentally, that when the number of tasks grows and the variance of the task-distribution is small, our strategies have a significant advantage over learning the tasks in isolation.
arXiv Detail & Related papers (2020-05-18T08:41:39Z)
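The regularization described in the entry above has a simple closed form for the least-squares estimate: penalizing the squared Euclidean distance to a bias vector h shifts ridge regression toward h. The snippet below shows just that estimator on synthetic data; the bandit machinery (confidence ellipsoids, optimistic action selection, and meta-learning of h across tasks) is omitted, and all symbols are illustrative assumptions.

```python
# Bias-regularized least squares: argmin ||X w - y||^2 + lam * ||w - h||^2,
# with closed form w = (X^T X + lam I)^{-1} (X^T y + lam h).
import numpy as np

rng = np.random.default_rng(0)
d, n, lam = 5, 20, 10.0
h = rng.normal(size=d)                    # bias vector (e.g., learned across tasks)
w_task = h + 0.1 * rng.normal(size=d)     # task parameter close to the bias
X = rng.normal(size=(n, d))
y = X @ w_task + 0.1 * rng.normal(size=n)

w_biased = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * h)
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)  # plain ridge (h = 0)

print("error with bias regularization:", np.linalg.norm(w_biased - w_task))
print("error with plain ridge:        ", np.linalg.norm(w_ridge - w_task))
```

When tasks cluster around h and the variance of the task distribution is small, shrinking toward h rather than toward zero is exactly what gives the advantage described in the entry.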
- Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving [106.63673243937492]
Feedforward computation, such as evaluating a neural network or sampling from an autoregressive model, is ubiquitous in machine learning.
We frame the task of feedforward computation as solving a system of nonlinear equations. We then propose to find the solution using a Jacobi or Gauss-Seidel fixed-point method, as well as hybrid methods of both.
Our method is guaranteed to give exactly the same values as the original feedforward computation with a reduced (or equal) number of parallelizable iterations, and hence reduced time given sufficient parallel computing power.
arXiv Detail & Related papers (2020-02-10T10:11:31Z)
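As a small illustration of the fixed-point view in the entry above: a sequential recurrence h_t = f(h_{t-1}) can be written as a system h = F(h) over all time steps and solved by Jacobi sweeps that update every step in parallel; for a chain of length T the iteration reproduces the sequential result after at most T sweeps, and often earlier. The toy numpy example below demonstrates this, with its own simplified recurrence rather than the paper's implementation.

```python
# Toy demo: solve the recurrence h_t = tanh(W h_{t-1} + b) by Jacobi fixed-point
# iteration over all time steps at once. Each sweep updates every step in
# parallel from the previous sweep's values; after at most T sweeps the result
# equals the sequential computation.
import numpy as np

rng = np.random.default_rng(0)
d, T = 4, 12
W = 0.5 * rng.normal(size=(d, d))
b = rng.normal(size=d)
h0 = rng.normal(size=d)

# Sequential reference.
seq = [h0]
for _ in range(T):
    seq.append(np.tanh(W @ seq[-1] + b))
seq = np.stack(seq[1:])                     # shape (T, d)

# Jacobi sweeps: H[t] <- tanh(W H[t-1] + b), all t updated from the old H.
H = np.zeros((T, d))
for sweep in range(T):
    prev = np.vstack([h0, H[:-1]])          # previous-step states from the last sweep
    H_new = np.tanh(prev @ W.T + b)
    if np.allclose(H_new, seq):
        print("matched the sequential result after", sweep + 1, "sweeps")
        break
    H = H_new

print("max abs difference:", np.abs(H_new - seq).max())
```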
This list is automatically generated from the titles and abstracts of the papers on this site.