Combined Pruning for Nested Cross-Validation to Accelerate Automated
Hyperparameter Optimization for Embedded Feature Selection in
High-Dimensional Data with Very Small Sample Sizes
- URL: http://arxiv.org/abs/2202.00598v1
- Date: Tue, 1 Feb 2022 17:42:37 GMT
- Title: Combined Pruning for Nested Cross-Validation to Accelerate Automated
Hyperparameter Optimization for Embedded Feature Selection in
High-Dimensional Data with Very Small Sample Sizes
- Authors: Sigrun May, Sven Hartmann and Frank Klawonn
- Abstract summary: Tree-based embedded feature selection to exclude irrelevant features in high-dimensional data with very small sample sizes requires optimized hyperparameters for the model building process.
Standard pruning algorithms must prune late or risk aborting calculations due to high variance in the performance evaluation metric.
We adapt the usage of a state-of-the-art successive halving pruner and combine it with two new pruning strategies based on domain or prior knowledge.
Our proposed combined three-layer pruner keeps promising trials while reducing the number of models to be built by up to 81.3% compared to using a state-of-the-art asynchronous successive halving pruner alone.
- Score: 3.51500332842165
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Applying tree-based embedded feature selection to exclude irrelevant features
in high-dimensional data with very small sample sizes requires optimized
hyperparameters for the model building process. In addition, nested
cross-validation must be applied for this type of data to avoid biased model
performance. The resulting long computation time can be accelerated with
pruning. However, standard pruning algorithms must prune late or risk aborting
calculations of promising hyperparameter sets due to high variance in the
performance evaluation metric. To address this, we adapt the usage of a
state-of-the-art successive halving pruner and combine it with two new pruning
strategies based on domain or prior knowledge. One additional pruning strategy
immediately stops the computation of trials with semantically meaningless
results for the selected hyperparameter combinations. The other is an
extrapolating threshold pruning strategy suitable for nested cross-validation
with high variance. Our proposed combined three-layer pruner keeps promising
trials while reducing the number of models to be built by up to 81.3% compared
to using a state-of-the-art asynchronous successive halving pruner alone. Our
three-layer pruner implementation (available at
https://github.com/sigrun-may/cv-pruner) speeds up data analysis or enables
deeper hyperparameter search within the same computation time. It consequently
saves time, money and energy, reducing the CO2 footprint.
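To make the three layers concrete, here is a minimal sketch of how such a pruner could be wired into an Optuna-style inner cross-validation loop. This is an illustration under assumptions, not the cv-pruner repository's actual API: the helper names, the fold-scoring stub, and the fold count are hypothetical, while optuna.pruners.SuccessiveHalvingPruner is Optuna's real implementation of asynchronous successive halving.

```python
import numpy as np
import optuna

N_INNER_FOLDS = 10  # hypothetical inner-CV fold count


def is_semantically_meaningless(n_selected_features: int) -> bool:
    # Layer 1 (hypothetical check): a hyperparameter set that selects no
    # features gives a semantically meaningless result; stop immediately.
    return n_selected_features == 0


def optimistic_extrapolation(fold_scores, n_total_folds):
    # Layer 2 (hypothetical check): optimistically complete the inner CV
    # with the best fold score seen so far and return the resulting mean.
    remaining = n_total_folds - len(fold_scores)
    return float(np.mean(fold_scores + [max(fold_scores)] * remaining))


def objective(trial: optuna.Trial) -> float:
    reg_alpha = trial.suggest_float("reg_alpha", 1e-3, 10.0, log=True)
    fold_scores = []
    for fold in range(N_INNER_FOLDS):
        # Stand-in for training a tree-based model with embedded feature
        # selection on one inner fold; replace with real model code.
        rng = np.random.default_rng(trial.number * 100 + fold)
        score = float(rng.uniform(0.5, 1.0)) / (1.0 + 0.01 * reg_alpha)
        n_selected_features = int(rng.integers(0, 20))

        if is_semantically_meaningless(n_selected_features):
            raise optuna.TrialPruned()  # layer 1: stop at once

        fold_scores.append(score)
        try:
            best_mean = trial.study.best_value  # best completed trial so far
        except ValueError:  # no trial has completed yet
            best_mean = -np.inf
        if optimistic_extrapolation(fold_scores, N_INNER_FOLDS) < best_mean:
            raise optuna.TrialPruned()  # layer 2: extrapolating threshold

        trial.report(float(np.mean(fold_scores)), step=fold)
        if trial.should_prune():
            raise optuna.TrialPruned()  # layer 3: async successive halving

    return float(np.mean(fold_scores))


study = optuna.create_study(
    direction="maximize",
    pruner=optuna.pruners.SuccessiveHalvingPruner(),
)
study.optimize(objective, n_trials=50)
```

The toy scoring stub only exercises the control flow; in the paper's setting, the extrapolating threshold layer is what tolerates the high fold-to-fold variance of nested cross-validation on very small samples.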
Related papers
- ETS: Efficient Tree Search for Inference-Time Scaling [61.553681244572914]
One promising approach for test-time compute scaling is search against a process reward model.
The diversity of trajectories in the tree search process affects the accuracy of the search, since increasing diversity promotes more exploration.
We propose Efficient Tree Search (ETS), which promotes KV sharing by pruning redundant trajectories while maintaining necessary diverse trajectories.
arXiv Detail & Related papers (2025-02-19T09:30:38Z) - Value-Based Deep RL Scales Predictably [100.21834069400023]
We show that value-based off-policy RL methods are predictable despite community lore regarding their pathological behavior.
We validate our approach using three algorithms: SAC, BRO, and PQL on DeepMind Control, OpenAI Gym, and IsaacGym.
arXiv Detail & Related papers (2025-02-06T18:59:47Z) - ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by the Kronecker product to Aggregate Low Rank Experts.
Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z) - TT-MPD: Test Time Model Pruning and Distillation [3.675015670568961]
Pruning can be an effective method of compressing large pre-trained models for inference speed acceleration.
Previous pruning approaches rely on access to the original training dataset for both pruning and subsequent fine-tuning.
We propose an efficient pruning method that considers the approximated fine-tuned accuracy and potential inference latency savings.
arXiv Detail & Related papers (2024-12-10T02:05:13Z) - Stability-Adjusted Cross-Validation for Sparse Linear Regression [5.156484100374059]
Cross-validation techniques like k-fold cross-validation substantially increase the computational cost of sparse regression.
We propose selecting hyperparameters that minimize a weighted sum of a cross-validation metric and a model's output stability.
Our confidence adjustment procedure reduces test set error by 2%, on average, on 13 real-world datasets.
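As a worked illustration of such a weighted-sum selection rule, here is a short sketch under assumed names: the weight `lam` and the Jaccard-based instability term are illustrative stand-ins, not the paper's exact definitions.

```python
import numpy as np


def stability_adjusted_score(cv_errors, support_sets, lam=0.5):
    """Hypothetical weighted sum of a CV metric and an instability term:
    mean CV error plus the average pairwise Jaccard distance between the
    feature supports selected on different folds."""
    k = len(support_sets)
    dists = [
        1.0
        - len(support_sets[i] & support_sets[j])
        / max(len(support_sets[i] | support_sets[j]), 1)
        for i in range(k)
        for j in range(i + 1, k)
    ]
    return float(np.mean(cv_errors)) + lam * float(np.mean(dists))


# Example: pick the hyperparameter value with the lowest adjusted score.
candidates = {
    0.1: ([0.20, 0.25, 0.22], [{1, 2, 3}, {1, 2}, {1, 2, 4}]),
    1.0: ([0.21, 0.22, 0.21], [{1, 2}, {1, 2}, {1, 2}]),
}
best = min(candidates, key=lambda a: stability_adjusted_score(*candidates[a]))
```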
arXiv Detail & Related papers (2023-06-26T17:02:45Z) - Tune As You Scale: Hyperparameter Optimization For Compute Efficient
Training [0.0]
We propose a practical method for robustly tuning large models.
CARBS performs local search around the performance-cost frontier.
Among our results, we effectively solve the entire ProcGen benchmark just by tuning a simple baseline.
arXiv Detail & Related papers (2023-06-13T18:22:24Z) - Parameter-efficient Tuning of Large-scale Multimodal Foundation Model [68.24510810095802]
We propose a graceful prompt framework for cross-modal transfer (Aurora) to overcome these challenges.
Considering the redundancy in existing architectures, we first utilize the mode approximation to generate 0.1M trainable parameters to implement the multimodal prompt tuning.
A thorough evaluation on six cross-modal benchmarks shows that it not only outperforms the state-of-the-art but even outperforms the full fine-tuning approach.
arXiv Detail & Related papers (2023-05-15T06:40:56Z) - Fast Hyperparameter Tuning for Ising Machines [0.8057006406834467]
"FastConvergence" is a convergence acceleration method for Tree-structured Parzen Estimator (TPE)
For experiments, well-known Travel Salesman Problem (TSP) and Quadratic Assignment Problem (QAP) instances are used as input.
Results show, FastConvergence can reach similar results to TPE alone within less than half the number of trials.
arXiv Detail & Related papers (2022-11-29T01:53:31Z) - STORM+: Fully Adaptive SGD with Momentum for Nonconvex Optimization [74.1615979057429]
We investigate stochastic nonconvex optimization problems where the objective is an expectation over smooth loss functions.
Our work builds on the STORM algorithm, in conjunction with a novel approach to adaptively set the learning rate and momentum parameters.
arXiv Detail & Related papers (2021-11-01T15:43:36Z) - DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic
Convolution [136.7261709896713]
We propose a data-driven approach that generates the appropriate convolution kernels to apply in response to the nature of the instances.
The proposed method achieves promising results on both ScanNetV2 and S3DIS.
It also improves inference speed by more than 25% over the current state-of-the-art.
arXiv Detail & Related papers (2020-11-26T14:56:57Z)