HyperTime: Hyperparameter Optimization for Combating Temporal
Distribution Shifts
- URL: http://arxiv.org/abs/2305.18421v1
- Date: Sun, 28 May 2023 19:41:23 GMT
- Title: HyperTime: Hyperparameter Optimization for Combating Temporal
Distribution Shifts
- Authors: Shaokun Zhang, Yiran Wu, Zhonghua Zheng, Qingyun Wu, Chi Wang
- Abstract summary: We impose a lexicographic priority order on average validation loss and worst-case validation loss over chronological validation sets.
We show the strong empirical performance of the proposed method on multiple machine learning tasks with temporal distribution shifts.
- Score: 26.205660967039087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we propose a hyperparameter optimization method named
\emph{HyperTime} to find hyperparameters robust to potential temporal
distribution shifts in the unseen test data. Our work is motivated by an
important observation that it is, in many cases, possible to achieve temporally
robust predictive performance via hyperparameter optimization. Based on this
observation, we leverage the `worst-case-oriented' philosophy from the robust
optimization literature to help find such robust hyperparameter configurations.
HyperTime imposes a lexicographic priority order on average validation loss and
worst-case validation loss over chronological validation sets. We perform a
theoretical analysis on the upper bound of the expected test loss, which
reveals the unique advantages of our approach. We also demonstrate the strong
empirical performance of the proposed method on multiple machine learning tasks
with temporal distribution shifts.
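Below is a minimal, self-contained Python sketch of the lexicographic selection described in the abstract, under assumptions of our own: a forward-chaining split into chronological folds, a tolerance-based rounding of the primary objective, and a toy polynomial fit_and_score objective. The helper names (chronological_folds, lexicographic_key, select_config) are hypothetical illustrations, not the authors' released implementation.
```python
# Illustrative sketch (not the authors' code): rank hyperparameter configurations
# lexicographically by (average validation loss, worst-case validation loss)
# over chronologically ordered validation folds.
import numpy as np

def chronological_folds(n_samples, n_folds=4):
    """Split indices [0, n_samples) into consecutive, time-ordered folds."""
    return np.array_split(np.arange(n_samples), n_folds)

def fold_losses(config, X, y, folds, fit_and_score):
    """For each fold after the first, train on all earlier data and record the
    validation loss on that fold (only past data is visible at training time)."""
    losses = []
    for k in range(1, len(folds)):
        train_idx = np.concatenate(folds[:k])
        val_idx = folds[k]
        losses.append(fit_and_score(config, X[train_idx], y[train_idx],
                                    X[val_idx], y[val_idx]))
    return np.asarray(losses)

def lexicographic_key(losses, tol=1e-3):
    """Primary objective: average loss, rounded onto a tolerance grid so that
    near-ties are decided by the secondary objective, the worst-case fold loss."""
    return (round(float(losses.mean()) / tol), float(losses.max()))

def select_config(configs, X, y, fit_and_score, n_folds=4, tol=1e-3):
    """Return the configuration with the lexicographically smallest key."""
    folds = chronological_folds(len(X), n_folds)
    scored = [(lexicographic_key(fold_losses(c, X, y, folds, fit_and_score), tol), c)
              for c in configs]
    return min(scored, key=lambda item: item[0])[1]

if __name__ == "__main__":
    # Toy usage: a noisy sinusoidal signal and a polynomial-degree search space.
    rng = np.random.default_rng(0)
    X = np.linspace(0.0, 1.0, 400).reshape(-1, 1)
    y = np.sin(6.0 * X[:, 0]) + 0.1 * rng.standard_normal(400)

    def fit_and_score(config, Xtr, ytr, Xva, yva):
        coeffs = np.polyfit(Xtr[:, 0], ytr, deg=config["degree"])
        pred = np.polyval(coeffs, Xva[:, 0])
        return float(np.mean((pred - yva) ** 2))

    grid = [{"degree": d} for d in (1, 3, 5, 9)]
    print("selected config:", select_config(grid, X, y, fit_and_score))
```
Rounding the average loss onto a tolerance grid lets near-tied configurations be separated by their worst-case fold loss, which mirrors the lexicographic priority the abstract describes.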
Related papers
- Fine-Tuning Adaptive Stochastic Optimizers: Determining the Optimal Hyperparameter $\epsilon$ via Gradient Magnitude Histogram Analysis [0.7366405857677226]
We introduce a new framework based on the empirical probability density function of the loss's magnitude, termed the "gradient magnitude histogram".
We propose a novel algorithm using gradient magnitude histograms to automatically estimate a refined and accurate search space for the optimal safeguard.
arXiv Detail & Related papers (2023-11-20T04:34:19Z) - Improving Fast Minimum-Norm Attacks with Hyperparameter Optimization [12.526318578195724]
We show that hyperparameter optimization can improve fast minimum-norm attacks by automating the selection of the loss function, the optimizer, and the step-size scheduler.
We release our open-source code at https://www.pralab.com/HO-FMN.
arXiv Detail & Related papers (2023-10-12T10:03:25Z) - PriorBand: Practical Hyperparameter Optimization in the Age of Deep
Learning [49.92394599459274]
We propose PriorBand, an HPO algorithm tailored to Deep Learning (DL) pipelines.
We show its robustness across a range of DL benchmarks and show its gains under informative expert input and against poor expert beliefs.
arXiv Detail & Related papers (2023-06-21T16:26:14Z) - Hierarchical Proxy Modeling for Improved HPO in Time Series Forecasting [9.906423777470737]
We propose a novel technique, H-Pro, to drive HPO via test proxies by exploiting data hierarchies associated with time series datasets.
H-Pro can be applied on any off-the-shelf machine learning model to perform HPO.
Our approach outperforms existing state-of-the-art methods in Tourism, Wiki, and Traffic datasets.
arXiv Detail & Related papers (2022-11-28T06:37:15Z) - Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z) - AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient
Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3$\times$-30$\times$.
arXiv Detail & Related papers (2022-03-15T19:25:01Z) - Hyper-parameter optimization based on soft actor critic and hierarchical
mixture regularization [5.063728016437489]
We model the hyper-parameter optimization process as a Markov decision process, and tackle it with reinforcement learning.
We propose a novel hyper-parameter optimization method based on soft actor critic and hierarchical mixture regularization.
arXiv Detail & Related papers (2021-12-08T02:34:43Z) - Amortized Auto-Tuning: Cost-Efficient Transfer Optimization for
Hyperparameter Recommendation [83.85021205445662]
We propose an instantiation, amortized auto-tuning (AT2), to speed up tuning of machine learning models.
We conduct a thorough analysis of the multi-task multi-fidelity Bayesian optimization framework, which leads to the best instantiation, amortized auto-tuning (AT2).
arXiv Detail & Related papers (2021-06-17T00:01:18Z) - Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm [97.66038345864095]
We propose a new hyperparameter optimization method with zeroth-order hyper-gradients (HOZOG).
Specifically, we first formulate hyperparameter optimization as an A-based constrained optimization problem, where A is a black-box optimization algorithm.
Then, we use the average zeroth-order hyper-gradients to update hyperparameters.
arXiv Detail & Related papers (2021-02-17T21:03:05Z) - An Asymptotically Optimal Multi-Armed Bandit Algorithm and
Hyperparameter Optimization [48.5614138038673]
We propose an efficient and robust bandit-based algorithm called Sub-Sampling (SS) in the scenario of hyperparameter search evaluation.
We also develop a novel hyperparameter optimization algorithm called BOSS.
Empirical studies validate our theoretical arguments of SS and demonstrate the superior performance of BOSS on a number of applications.
arXiv Detail & Related papers (2020-07-11T03:15:21Z) - Automatic Hyper-Parameter Optimization Based on Mapping Discovery from
Data to Hyper-Parameters [3.37314595161109]
We propose an efficient automatic parameter optimization approach, which is based on the mapping from data to the corresponding hyper-parameters.
We show that the proposed approach significantly outperforms the state-of-the-art approaches.
arXiv Detail & Related papers (2020-03-03T19:26:23Z)