Related papers: Trajectory-Based Multi-Objective Hyperparameter Optimization for Model Retraining

Trajectory-Based Multi-Objective Hyperparameter Optimization for Model Retraining

URL: http://arxiv.org/abs/2405.15303v1
Date: Fri, 24 May 2024 07:43:45 GMT
Title: Trajectory-Based Multi-Objective Hyperparameter Optimization for Model Retraining
Authors: Wenyu Wang, Zheyi Fan, Szu Hui Ng,
Abstract summary: We present a novel trajectory-based multi-objective Bayesian optimization algorithm. Our algorithm outperforms the state-of-the-art multi-objectives in both locating better trade-offs and tuning efficiency.
Score: 8.598456741786801
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Training machine learning models inherently involves a resource-intensive and noisy iterative learning procedure that allows epoch-wise monitoring of the model performance. However, in multi-objective hyperparameter optimization scenarios, the insights gained from the iterative learning procedure typically remain underutilized. We notice that tracking the model performance across multiple epochs under a hyperparameter setting creates a trajectory in the objective space and that trade-offs along the trajectories are often overlooked despite their potential to offer valuable insights to decision-making for model retraining. Therefore, in this study, we propose to enhance the multi-objective hyperparameter optimization problem by having training epochs as an additional decision variable to incorporate trajectory information. Correspondingly, we present a novel trajectory-based multi-objective Bayesian optimization algorithm characterized by two features: 1) an acquisition function that captures the improvement made by the predictive trajectory of any hyperparameter setting and 2) a multi-objective early stopping mechanism that determines when to terminate the trajectory to maximize epoch efficiency. Numerical experiments on diverse synthetic simulations and hyperparameter tuning benchmarks indicate that our algorithm outperforms the state-of-the-art multi-objective optimizers in both locating better trade-offs and tuning efficiency.

Related papers

Decoupled Multi-Predictor Optimization for Inference-Efficient Model Tuning [59.27124079347153]
Early exiting in conjunction with multi-stage predictors offers a straightforward way to achieve an inference-efficient model.<n>How can early stages provide low-level fundamental features to deep stages while simultaneously supplying high-level discriminative features to early-stage predictors?<n>We propose a Decoupled Multi-Predictor Optimization (DMPO) method to effectively decouple the low-level representative ability and high-level discriminative ability in early stages.
arXiv Detail & Related papers (2025-11-05T07:16:49Z)
Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging [75.93960998357812]
Deep model merging represents an emerging research direction that combines multiple fine-tuned models to harness their capabilities across different tasks and domains.<n>Current model merging techniques focus on merging all available models simultaneously, with weight matrices-based methods being the predominant approaches.<n>We propose a training-free projection-based continual merging method that processes models sequentially.
arXiv Detail & Related papers (2025-01-16T13:17:24Z)
Model Fusion through Bayesian Optimization in Language Model Fine-Tuning [16.86812534268461]
Fine-tuning pre-trained models for downstream tasks is a widely adopted technique known for its adaptability and reliability across various domains. We introduce a novel model fusion technique that optimize both the desired metric and loss through multi-objective Bayesian optimization. Experiments across various downstream tasks show considerable performance improvements using our Bayesian optimization-guided method.
arXiv Detail & Related papers (2024-11-11T04:36:58Z)
Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives. We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
arXiv Detail & Related papers (2024-10-29T14:41:44Z)
Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation [69.60321475454843]
We propose DPCPL, the first pre-training and prompt-tuning paradigm tailored for Multi-Behavior Sequential Recommendation. In the pre-training stage, we propose a novel Efficient Behavior Miner (EBM) to filter out the noise at multiple time scales. Subsequently, we propose to tune the pre-trained model in a highly efficient manner with the proposed Customized Prompt Learning (CPL) module.
arXiv Detail & Related papers (2024-08-21T06:48:38Z)
End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Forecast (PtO) paradigm in machine learning aims to maximize downstream decision quality. This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives. It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
DiffTORI: Differentiable Trajectory Optimization for Deep Reinforcement and Imitation Learning [19.84386060857712]
This paper introduces DiffTORI, which utilizes Differentiable Trajectory optimization as the policy representation to generate actions for deep Reinforcement and Imitation learning. Across 15 model-based RL tasks and 35 imitation learning tasks with high-dimensional image and point cloud inputs, DiffTORI outperforms prior state-of-the-art methods in both domains.
arXiv Detail & Related papers (2024-02-08T05:26:40Z)
When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique. Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
A Survey on Multi-Objective based Parameter Optimization for Deep Learning [1.3223682837381137]
We focus on exploring the effectiveness of multi-objective optimization strategies for parameter optimization in conjunction with deep neural networks. The two methods are combined to provide valuable insights into the generation of predictions and analysis in multiple applications.
arXiv Detail & Related papers (2023-05-17T07:48:54Z)
Optimizing Hyperparameters with Conformal Quantile Regression [7.316604052864345]
We propose to leverage conformalized quantile regression which makes minimal assumptions about the observation noise. This translates to quicker HPO convergence on empirical benchmarks.
arXiv Detail & Related papers (2023-05-05T15:33:39Z)
Gradient-Based Trajectory Optimization With Learned Dynamics [80.41791191022139]
We use machine learning techniques to learn a differentiable dynamics model of the system from data. We show that a neural network can model highly nonlinear behaviors accurately for large time horizons. In our hardware experiments, we demonstrate that our learned model can represent complex dynamics for both the Spot and Radio-controlled (RC) car.
arXiv Detail & Related papers (2022-04-09T22:07:34Z)
Adaptive Sampling of Pareto Frontiers with Binary Constraints Using Regression and Classification [0.0]
We present a novel adaptive optimization algorithm for black-box multi-objective optimization problems with binary constraints. Our method is based on probabilistic regression and classification models, which act as a surrogate for the optimization goals. We also present a novel ellipsoid truncation method to speed up the expected hypervolume calculation.
arXiv Detail & Related papers (2020-08-27T09:15:02Z)
Bayesian Optimization for Selecting Efficient Machine Learning Models [53.202224677485525]
We present a unified Bayesian Optimization framework for jointly optimizing models for both prediction effectiveness and training efficiency. Experiments on model selection for recommendation tasks indicate models selected this way significantly improves model training efficiency.
arXiv Detail & Related papers (2020-08-02T02:56:30Z)
Automatically Learning Compact Quality-aware Surrogates for Optimization Problems [55.94450542785096]
Solving optimization problems with unknown parameters requires learning a predictive model to predict the values of the unknown parameters and then solving the problem using these values. Recent work has shown that including the optimization problem as a layer in a complex training model pipeline results in predictions of iteration of unobserved decision making. We show that we can improve solution quality by learning a low-dimensional surrogate model of a large optimization problem.
arXiv Detail & Related papers (2020-06-18T19:11:54Z)
Weighting Is Worth the Wait: Bayesian Optimization with Importance Sampling [34.67740033646052]
We improve upon Bayesian optimization state-of-the-art runtime and final validation error across a variety of datasets and complex neural architectures. By learning a parameterization of IS that trades-off evaluation complexity and quality, we improve upon Bayesian optimization state-of-the-art runtime and final validation error across a variety of datasets and complex neural architectures.
arXiv Detail & Related papers (2020-02-23T15:52:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.