Transfer-Learning Across Datasets with Different Input Dimensions: An
Algorithm and Analysis for the Linear Regression Case
- URL: http://arxiv.org/abs/2202.05069v4
- Date: Mon, 6 Nov 2023 10:55:54 GMT
- Title: Transfer-Learning Across Datasets with Different Input Dimensions: An
Algorithm and Analysis for the Linear Regression Case
- Authors: Luis Pedro Silvestrin, Harry van Zanten, Mark Hoogendoorn, Ger Koole
- Abstract summary: We propose a transfer learning algorithm that combines new and historical data with different input dimensions.
Our approach achieves state-of-the-art performance on 9 real-life datasets.
- Score: 7.674023644408741
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the development of new sensors and monitoring devices, more sources of data become available as inputs for machine learning models. On the one hand, these can help improve the accuracy of a model; on the other hand, combining these new inputs with historical data remains a challenge that has not yet been studied in sufficient detail. In this work, we propose a transfer learning algorithm that combines new and historical data with different input dimensions. The approach is easy to implement and efficient, with computational complexity equivalent to that of the ordinary least-squares method, and it requires no hyperparameter tuning, making it straightforward to apply when the new data are limited. Unlike other approaches, we provide a rigorous theoretical study of its robustness, showing that it cannot be outperformed by a baseline that uses only the new data. Our approach achieves state-of-the-art performance on 9 real-life datasets, outperforming the linear DSFT, another linear transfer learning algorithm, and performing comparably to the non-linear DSFT.
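The abstract does not spell out how the two datasets are combined, so the following is only a minimal NumPy sketch of one plausible scheme under the setting described: fit ordinary least squares on the historical data over the shared input dimensions, then fit the newly added dimensions to the residuals of the small new dataset. The variable names and synthetic data are illustrative assumptions, not the authors' algorithm.

```python
# Minimal sketch (assumptions, not the paper's exact method): combine a large
# historical dataset with d_old inputs and a small new dataset with d_old + d_new
# inputs using two ordinary least-squares solves and no hyperparameters.
import numpy as np

rng = np.random.default_rng(0)

# Historical data: many samples, only the "old" input dimensions are observed.
n_hist, d_old = 500, 3
X_hist = rng.normal(size=(n_hist, d_old))
y_hist = X_hist @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=n_hist)

# New data: few samples, old inputs plus newly available sensor inputs.
n_new, d_new = 30, 2
X_new_old = rng.normal(size=(n_new, d_old))
X_new_extra = rng.normal(size=(n_new, d_new))
y_new = (X_new_old @ np.array([1.0, -2.0, 0.5])
         + X_new_extra @ np.array([0.8, -0.3])
         + rng.normal(scale=0.1, size=n_new))

# Step 1: OLS on the historical data over the shared input dimensions.
w_old, *_ = np.linalg.lstsq(X_hist, y_hist, rcond=None)

# Step 2: on the new data, treat the historical model's output as an offset
# and fit only the extra dimensions to the residual.
residual = y_new - X_new_old @ w_old
w_extra, *_ = np.linalg.lstsq(X_new_extra, residual, rcond=None)

def predict(x_old, x_extra):
    """Prediction for a sample observed on both old and new input dimensions."""
    return x_old @ w_old + x_extra @ w_extra
```

A new-data-only baseline would instead estimate all five coefficients from the 30 new samples alone; the paper's robustness result concerns exactly that comparison for its own estimator, guaranteeing it cannot be outperformed by such a baseline.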
Related papers
- Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws [59.03420759554073]
We introduce Adaptive Data Optimization (ADO), an algorithm that optimizes data distributions in an online fashion, concurrently with model training.
ADO does not require external knowledge, proxy models, or modifications to the model update.
ADO uses per-domain scaling laws to estimate the learning potential of each domain during training and adjusts the data mixture accordingly.
arXiv Detail & Related papers (2024-10-15T17:47:44Z)
- Simple Ingredients for Offline Reinforcement Learning [86.1988266277766]
Offline reinforcement learning algorithms have proven effective on datasets highly connected to the target downstream task.
We show that existing methods struggle with diverse data: their performance considerably deteriorates as data collected for related but different tasks is simply added to the offline buffer.
We show that scale, more than algorithmic considerations, is the key factor influencing performance.
arXiv Detail & Related papers (2024-03-19T18:57:53Z)
- Contrastive Left-Right Wearable Sensors (IMUs) Consistency Matching for HAR [0.0]
We show how real data can be used for self-supervised learning without any transformations.
Our approach involves contrastive matching of two different sensors.
We test our approach on the Opportunity and MM-Fit datasets; a toy sketch of such a matching objective follows below.
arXiv Detail & Related papers (2023-11-21T15:31:16Z)
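For the left-right sensor matching entry above, the snippet below is a toy NumPy sketch of an InfoNCE-style objective in which time-aligned windows from two sensors form positive pairs and all other windows in the batch act as negatives. The encoder, batch construction, and temperature are assumptions; the paper's exact loss may differ.

```python
# Toy contrastive matching between two sensor streams (assumptions, not the
# paper's implementation): row i of z_left and row i of z_right come from the
# same time window and should agree; all other rows act as negatives.
import numpy as np

def info_nce(z_left, z_right, temperature=0.1):
    """z_left, z_right: (batch, dim) embeddings of time-aligned windows."""
    # L2-normalise so the dot product is a cosine similarity.
    z_left = z_left / np.linalg.norm(z_left, axis=1, keepdims=True)
    z_right = z_right / np.linalg.norm(z_right, axis=1, keepdims=True)
    logits = z_left @ z_right.T / temperature            # (batch, batch)
    # Log-softmax over each row; the positive pair for row i is column i.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Random stand-ins for the outputs of two window encoders.
rng = np.random.default_rng(0)
print(info_nce(rng.normal(size=(8, 16)), rng.normal(size=(8, 16))))
```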
- Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST), a recently proposed and highly effective technique for distributed training.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z)
- LAVA: Data Valuation without Pre-Specified Learning Algorithms [20.578106028270607]
We introduce a new framework that can value training data in a way that is oblivious to the downstream learning algorithm.
We develop a proxy for the validation performance associated with a training set based on a non-conventional class-wise Wasserstein distance between training and validation sets.
We show that the distance characterizes the upper bound of the validation performance for any given model under certain Lipschitz conditions; a toy class-wise distance computation is sketched below.
arXiv Detail & Related papers (2023-04-28T19:05:16Z)
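For the LAVA entry above, the toy snippet below illustrates the general idea of scoring a training set by a class-wise distributional distance to the validation set, without training any downstream model. It uses a crude one-dimensional projection and SciPy's 1-D Wasserstein distance; LAVA's actual measure is a non-conventional, label-aware optimal-transport distance, so treat this purely as an illustration.

```python
# Toy class-wise distance between a training and a validation set (a heavy
# simplification of LAVA's label-aware optimal-transport distance).
import numpy as np
from scipy.stats import wasserstein_distance

def classwise_distance(X_train, y_train, X_val, y_val):
    """Mean 1-D Wasserstein distance between train and val, class by class."""
    dists = []
    for c in np.intersect1d(np.unique(y_train), np.unique(y_val)):
        train_c = X_train[y_train == c][:, 0]  # crude 1-D projection: first feature
        val_c = X_val[y_val == c][:, 0]
        dists.append(wasserstein_distance(train_c, val_c))
    return float(np.mean(dists))

# A smaller distance suggests the training set better supports validation
# performance, with no downstream learning algorithm specified.
rng = np.random.default_rng(0)
X_tr, y_tr = rng.normal(size=(200, 5)), rng.integers(0, 3, size=200)
X_va, y_va = rng.normal(size=(50, 5)), rng.integers(0, 3, size=50)
print(classwise_distance(X_tr, y_tr, X_va, y_va))
```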
- Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance-reduction technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z)
- Towards Robust Dataset Learning [90.2590325441068]
We propose a principled, tri-level optimization to formulate the robust dataset learning problem.
Under an abstraction model that characterizes robust vs. non-robust features, the proposed method provably learns a robust dataset.
arXiv Detail & Related papers (2022-11-19T17:06:10Z)
- Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures are often exploited to improve the solution of learning tasks.
Data augmentation induces these symmetries during training by applying multiple transformations to the input data.
This work tackles these issues by automatically adapting the data augmentation while solving the learning task.
arXiv Detail & Related papers (2022-09-29T18:11:01Z)
- One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares [8.443742714362521]
We develop an algorithm for one-pass learning which seeks to perfectly fit every new datapoint while changing the parameters in a direction that causes the least change to the predictions on previous datapoints.
Our algorithm uses memory efficiently by exploiting the structure of the streaming data via incremental principal component analysis (IPCA).
Our experiments show the effectiveness of the proposed method compared to the baselines; a bare-bones recursive least-squares update is sketched below.
arXiv Detail & Related papers (2022-07-28T02:01:31Z)
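For the one-pass learning entry above, here is a bare-bones recursive least-squares (RLS) updater, one of the two ingredients named in the title. The paper bridges RLS with orthogonal gradient descent and compresses memory with IPCA; none of that is reproduced here, so this is only the classical single-pass building block.

```python
# Classical recursive least squares: one pass over the stream, constant memory,
# no stored dataset (the bridge to orthogonal gradient descent is not shown).
import numpy as np

class RecursiveLeastSquares:
    def __init__(self, dim, delta=1e3):
        self.w = np.zeros(dim)          # current weight estimate
        self.P = np.eye(dim) * delta    # running inverse covariance (large prior)

    def update(self, x, y):
        x = np.asarray(x, dtype=float)
        Px = self.P @ x
        k = Px / (1.0 + x @ Px)         # gain vector
        self.w += k * (y - x @ self.w)  # correct by the prediction error
        self.P -= np.outer(k, Px)       # rank-one downdate of P

    def predict(self, x):
        return x @ self.w

# Stream synthetic datapoints exactly once.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])
model = RecursiveLeastSquares(dim=3)
for _ in range(200):
    x = rng.normal(size=3)
    model.update(x, x @ true_w + rng.normal(scale=0.01))
print(model.w)  # close to true_w after a single pass
```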
- Learning new physics efficiently with nonparametric methods [11.970219534238444]
We present a machine learning approach for model-independent new physics searches.
The corresponding algorithm is powered by recent large-scale implementations of kernel methods.
We show that our approach has dramatic advantages compared to neural network implementations in terms of training times and computational resources.
arXiv Detail & Related papers (2022-04-05T16:17:59Z)
- Learning ODE Models with Qualitative Structure Using Gaussian Processes [0.6882042556551611]
In many contexts explicit data collection is expensive and learning algorithms must be data-efficient to be feasible.
We propose an approach to learning a vector field of differential equations using sparse Gaussian Processes.
We show that this combination improves extrapolation performance and long-term behaviour significantly, while also reducing the computational cost; a rough GP vector-field sketch follows below.
arXiv Detail & Related papers (2020-11-10T19:34:07Z)
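For the last entry above, the snippet below is a rough scikit-learn sketch of the basic idea of learning an ODE vector field with a Gaussian process: regress finite-difference velocity estimates on observed states, then roll the learned field forward. The paper's sparse GPs and qualitative-structure constraints are not reproduced; the toy system and kernel choice are assumptions.

```python
# Rough sketch: learn dx/dt = f(x) with a GP (no sparsity or qualitative
# constraints from the paper; toy 1-D system with true dynamics dx/dt = -x).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

t = np.linspace(0.0, 5.0, 60)
x = np.exp(-t)                               # observed trajectory
dxdt = np.gradient(x, t)                     # finite-difference velocities

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-4)
gp.fit(x.reshape(-1, 1), dxdt)               # vector field as a GP over states

# Roll the learned field forward with Euler steps from a new initial state.
state, dt = 1.2, 0.05
for _ in range(100):
    state = state + dt * gp.predict(np.array([[state]]))[0]
print(state)  # decays toward 0, matching the true dynamics
```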
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.