Transfer Learning in $\ell_1$ Regularized Regression: Hyperparameter
Selection Strategy based on Sharp Asymptotic Analysis
- URL: http://arxiv.org/abs/2409.17704v1
- Date: Thu, 26 Sep 2024 10:20:59 GMT
- Title: Transfer Learning in $\ell_1$ Regularized Regression: Hyperparameter
Selection Strategy based on Sharp Asymptotic Analysis
- Authors: Koki Okajima and Tomoyuki Obuchi
- Abstract summary: Transfer learning techniques aim to leverage information from multiple related datasets to enhance prediction quality on a target dataset.
Several Lasso-based algorithms have been proposed for this purpose, including Trans-Lasso and Pretraining Lasso.
We conduct a thorough, precise study of the algorithm in a high-dimensional setting via a sharp asymptotic analysis using the replica method.
Our approach reveals a surprisingly simple behavior of the algorithm: Ignoring one of the two types of information transferred to the fine-tuning stage has little effect on generalization performance.
- Score: 4.178980693837599
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transfer learning techniques aim to leverage information from multiple
related datasets to enhance prediction quality on a target dataset. Such
methods have been adopted in the context of high-dimensional sparse regression,
and several Lasso-based algorithms have been proposed; Trans-Lasso and Pretraining
Lasso are two such examples. These algorithms require the statistician to select
hyperparameters that control the extent and type of information transfer from
related datasets. However, selection strategies for these hyperparameters, as
well as the impact of these choices on the algorithm's performance, have been
largely unexplored. To address this, we conduct a thorough, precise study of
the algorithm in a high-dimensional setting via an asymptotic analysis using
the replica method. Our approach reveals a surprisingly simple behavior of the
algorithm: Ignoring one of the two types of information transferred to the
fine-tuning stage has little effect on generalization performance, implying
that efforts for hyperparameter selection can be significantly reduced. Our
theoretical findings are also empirically supported by real-world applications
on the IMDb dataset.
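To make the two-stage structure concrete, below is a minimal sketch of a pretrain-then-fine-tune Lasso in the spirit of Pretraining Lasso, assuming scikit-learn and synthetic data; the regularization values, the pooling of source and target data, and the residual-based fine-tuning step are illustrative choices, not the authors' exact formulation.

```python
# Minimal sketch of a two-stage "pretrain then fine-tune" Lasso.
# Hypothetical data and hyperparameter values; not a reference implementation.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Synthetic source (related) and target datasets sharing a sparse signal.
n_src, n_tgt, p = 500, 100, 200
beta_true = np.zeros(p)
beta_true[:10] = 1.0
X_src = rng.standard_normal((n_src, p))
y_src = X_src @ (beta_true + 0.1 * rng.standard_normal(p)) + 0.5 * rng.standard_normal(n_src)
X_tgt = rng.standard_normal((n_tgt, p))
y_tgt = X_tgt @ beta_true + 0.5 * rng.standard_normal(n_tgt)

# Stage 1 (pretraining): Lasso on the pooled source + target data.
pre = Lasso(alpha=0.1).fit(np.vstack([X_src, X_tgt]), np.concatenate([y_src, y_tgt]))

# Stage 2 (fine-tuning): fit a sparse correction on the target residuals,
# so the final estimate is beta_pre + delta.
fine = Lasso(alpha=0.05).fit(X_tgt, y_tgt - X_tgt @ pre.coef_)
beta_hat = pre.coef_ + fine.coef_
```

The two `alpha` values here play the role of the transfer-controlling hyperparameters whose selection the paper analyzes.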
Related papers
- Transfer Learning for Nonparametric Regression: Non-asymptotic Minimax
Analysis and Adaptive Procedure [5.303044915173525]
We develop a novel estimator called the confidence thresholding estimator, which is shown to achieve the minimax optimal risk up to a logarithmic factor.
We then propose a data-driven algorithm that adaptively achieves the minimax risk up to a logarithmic factor across a wide range of parameter spaces.
arXiv Detail & Related papers (2024-01-22T16:24:04Z)
- Querying Easily Flip-flopped Samples for Deep Active Learning [63.62397322172216]
Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data.
One effective selection strategy is to base it on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is.
This paper proposes the least disagree metric (LDM), defined as the smallest probability of disagreement of the predicted label.
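As a rough illustration of disagreement-based uncertainty, the sketch below perturbs a linear classifier's weights and counts how often the predicted label flips; the linear model, perturbation scale, and draw count are assumptions for illustration, not the paper's LDM estimator.

```python
import numpy as np

def disagreement_score(w, x, n_draws=1000, scale=0.1, seed=0):
    # Empirical probability that a small random perturbation of the
    # weights flips the predicted label of x.
    rng = np.random.default_rng(seed)
    base = np.sign(w @ x)                       # label under the trained model
    flips = 0
    for _ in range(n_draws):
        w_pert = w + scale * rng.standard_normal(w.shape)
        flips += np.sign(w_pert @ x) != base    # did the label flip?
    return flips / n_draws
```

Samples whose labels flip most readily under small perturbations sit closest to the decision boundary and are natural candidates to query.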
arXiv Detail & Related papers (2024-01-18T08:12:23Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
The second strategy leverages a novel re-ranking technique with a lower upper bound on time complexity, reducing the memory complexity from $O(n^2)$ to $O(kn)$ with $k \ll n$.
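The storage idea behind the $O(kn)$ claim can be pictured with a simple pattern: keep only each sample's k nearest neighbors (indices and distances) instead of the full pairwise matrix. A hedged sketch, with k and the Euclidean metric as illustrative choices:

```python
import numpy as np

def topk_neighbors(feats, k=10):
    # Keep each row's k nearest neighbors: O(kn) storage instead of O(n^2).
    n = feats.shape[0]
    idx = np.empty((n, k), dtype=np.int64)
    dist = np.empty((n, k), dtype=np.float32)
    for i in range(n):                          # one row of distances at a time
        d = np.linalg.norm(feats - feats[i], axis=1)
        d[i] = np.inf                           # exclude the self-match
        nn = np.argpartition(d, k)[:k]          # unsorted k smallest
        order = np.argsort(d[nn])
        idx[i], dist[i] = nn[order], d[nn][order]
    return idx, dist
```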
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- Nonlinear Feature Aggregation: Two Algorithms driven by Theory [45.3190496371625]
Real-world machine learning applications are characterized by a huge number of features, leading to computational and memory issues.
We propose a dimensionality reduction algorithm (NonLinCFA) which aggregates non-linear transformations of features with a generic aggregation function.
We test both algorithms on synthetic and real-world regression and classification tasks, showing competitive performance.
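The aggregation pattern can be sketched generically: apply a non-linear transformation to each feature in a group, then collapse the group with an aggregation function. The grouping, the `tanh` transform, and the mean below are placeholder choices, not the paper's NonLinCFA criterion for deciding which features to merge.

```python
import numpy as np

def aggregate_groups(X, groups, transform=np.tanh, agg=np.mean):
    # Each group of columns becomes a single feature: agg(transform(X_g)).
    return np.column_stack([agg(transform(X[:, g]), axis=1) for g in groups])

X = np.random.default_rng(0).standard_normal((100, 6))
Z = aggregate_groups(X, groups=[[0, 1, 2], [3, 4], [5]])   # 6 -> 3 features
```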
arXiv Detail & Related papers (2023-06-19T19:57:33Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
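A hedged sketch of what a multi-step inverse-kinematics objective can look like, assuming PyTorch: encode the observations at times $t$ and $t+k$ and predict the action taken at time $t$, with one prediction head per horizon $k$. The architecture, dimensions, and discrete-action setup are illustrative, not the MusIK specification.

```python
import torch
import torch.nn as nn

class MultiStepInverseModel(nn.Module):
    # Predict the action a_t from encodings of o_t and o_{t+k}.
    def __init__(self, obs_dim, emb_dim, n_actions, max_k):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, emb_dim), nn.ReLU())
        self.heads = nn.ModuleList(
            [nn.Linear(2 * emb_dim, n_actions) for _ in range(max_k)]
        )

    def loss(self, obs_t, obs_tk, actions_t, k):
        z = torch.cat([self.encoder(obs_t), self.encoder(obs_tk)], dim=-1)
        logits = self.heads[k - 1](z)           # one head per horizon k
        return nn.functional.cross_entropy(logits, actions_t)
```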
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class.
Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
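AutoSMOTE's decisions sit on top of the classic SMOTE interpolation step, which the sketch below reproduces: a synthetic minority sample is drawn on the segment between a minority sample and one of its nearest minority neighbors. The neighbor count and uniform interpolation weight are the usual defaults; the reinforcement-learning controller that AutoSMOTE adds is omitted.

```python
import numpy as np

def smote_sample(X_min, k=5, seed=0):
    # One synthetic minority point: interpolate between a random minority
    # sample and one of its k nearest minority neighbors.
    rng = np.random.default_rng(seed)
    i = rng.integers(len(X_min))
    d = np.linalg.norm(X_min - X_min[i], axis=1)
    d[i] = np.inf                               # exclude the sample itself
    j = rng.choice(np.argpartition(d, k)[:k])   # a random near neighbor
    lam = rng.random()                          # interpolation weight in [0, 1)
    return X_min[i] + lam * (X_min[j] - X_min[i])
```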
arXiv Detail & Related papers (2022-08-26T04:28:01Z)
- One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares [8.443742714362521]
We develop an algorithm for one-pass learning which seeks to perfectly fit every new datapoint while changing the parameters in a direction that causes the least change to the predictions on previous datapoints.
Our algorithm uses memory efficiently by exploiting the structure of the streaming data via an incremental principal component analysis (IPCA).
Our experiments show the effectiveness of the proposed method compared to the baselines.
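For a linear model, the "fit the new point while disturbing old predictions least" idea has a clean geometric form: update the weights only along the component of the new input that is orthogonal to all previously seen inputs. A minimal dense sketch follows; the paper keeps the stored basis low-rank via IPCA, which is omitted here.

```python
import numpy as np

def one_pass_update(w, P, x, y, tol=1e-12):
    # P has orthonormal columns spanning previously seen input directions.
    x_perp = x - P @ (P.T @ x)                  # part of x orthogonal to the past
    resid = y - w @ x                           # error on the new point
    nrm = np.linalg.norm(x_perp)
    if nrm > tol:
        w = w + resid * x_perp / nrm**2         # fit the new point exactly;
                                                # past predictions are unchanged
        P = np.column_stack([P, x_perp / nrm])  # grow the basis (IPCA would
                                                # keep this low-rank instead)
    return w, P

d = 5
w, P = np.zeros(d), np.zeros((d, 0))            # start with an empty basis
w, P = one_pass_update(w, P, np.ones(d), 2.0)   # now w @ np.ones(d) == 2.0
```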
arXiv Detail & Related papers (2022-07-28T02:01:31Z)
- Automatic tuning of hyper-parameters of reinforcement learning algorithms using Bayesian optimization with behavioral cloning [0.0]
In reinforcement learning (RL), the information content of data gathered by the learning agent is dependent on the setting of many hyper-parameters.
In this work, a novel approach for autonomous hyper-parameter setting using Bayesian optimization is proposed.
Experiments reveal promising results compared to other manual tweaking and optimization-based approaches.
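A hedged sketch of the outer loop, assuming scikit-optimize: Bayesian optimization proposes hyper-parameter settings and an expensive evaluation scores them. `train_and_evaluate` is a hypothetical stand-in that uses a synthetic score in place of a real RL training run (which would return, e.g., negative mean episodic return, since BO minimizes).

```python
import math
from skopt import gp_minimize
from skopt.space import Real

def train_and_evaluate(params):
    lr, gamma = params
    # Hypothetical stand-in for an RL training run: a synthetic score
    # minimized at lr = 1e-3, gamma = 0.99. Replace with real training
    # plus evaluation.
    return (math.log10(lr) + 3.0) ** 2 + 100.0 * (gamma - 0.99) ** 2

result = gp_minimize(
    train_and_evaluate,
    dimensions=[Real(1e-5, 1e-2, prior="log-uniform"),   # learning rate
                Real(0.90, 0.999)],                      # discount factor
    n_calls=30,
    random_state=0,
)
print(result.x)   # best (lr, gamma) found
```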
arXiv Detail & Related papers (2021-12-15T13:10:44Z)
- Ensemble Wrapper Subsampling for Deep Modulation Classification [70.91089216571035]
Subsampling of received wireless signals is important for relaxing hardware requirements as well as the computational cost of signal processing algorithms.
We propose a subsampling technique to facilitate the use of deep learning for automatic modulation classification in wireless communication systems.
arXiv Detail & Related papers (2020-05-10T06:11:13Z)
- Implicit differentiation of Lasso-type models for hyperparameter optimization [82.73138686390514]
We introduce an efficient implicit differentiation algorithm, without matrix inversion, tailored for Lasso-type problems.
Our approach scales to high-dimensional data by leveraging the sparsity of the solutions.
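The idea can be sketched concretely for the Lasso's regularization strength: with the active set held fixed, the stationarity conditions give a closed-form Jacobian of the coefficients with respect to $\lambda$, so the hypergradient of a validation loss only requires a linear solve whose size is the support size. The sketch below uses a direct solve for clarity (the paper's algorithm avoids matrix inversion) and assumes scikit-learn's Lasso scaling.

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_hypergradient(X, y, X_val, y_val, lam):
    # Fit the Lasso at the current hyperparameter; sklearn's objective is
    # (1 / 2n) * ||y - X b||^2 + lam * ||b||_1.
    n = X.shape[0]
    beta = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_
    S = np.flatnonzero(beta)                    # active set
    s = np.sign(beta[S])
    # Implicit differentiation on the (fixed) support:
    # d beta_S / d lam = -n * (X_S^T X_S)^{-1} s
    dbeta_S = -n * np.linalg.solve(X[:, S].T @ X[:, S], s)
    # Chain rule through the validation loss 0.5 * ||y_val - X_val b||^2.
    grad_S = X_val[:, S].T @ (X_val @ beta - y_val)
    return grad_S @ dbeta_S                     # dL_val / d lam
```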
arXiv Detail & Related papers (2020-02-20T18:43:42Z)