Related papers: Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning

Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning

URL: http://arxiv.org/abs/2001.03898v3
Date: Fri, 14 Feb 2020 11:46:09 GMT
Title: Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning
Authors: Yao Zhang, Daniel Jarrett, Mihaela van der Schaar
Abstract summary: We propose a novel Bayesian optimization (BO) algorithm to tackle the challenge of model selection in this setting. In order to solve the resulting multiple black-box function optimization problem jointly and efficiently, we exploit potential correlations among black-box functions. We are the first to formulate the problem of stepwise model selection (SMS) for sequence prediction, and to design and demonstrate an efficient joint-learning algorithm for this purpose.
Score: 100.83444258562263
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: An essential problem in automated machine learning (AutoML) is that of model selection. A unique challenge in the sequential setting is the fact that the optimal model itself may vary over time, depending on the distribution of features and labels available up to each point in time. In this paper, we propose a novel Bayesian optimization (BO) algorithm to tackle the challenge of model selection in this setting. This is accomplished by treating the performance at each time step as its own black-box function. In order to solve the resulting multiple black-box function optimization problem jointly and efficiently, we exploit potential correlations among black-box functions using deep kernel learning (DKL). To the best of our knowledge, we are the first to formulate the problem of stepwise model selection (SMS) for sequence prediction, and to design and demonstrate an efficient joint-learning algorithm for this purpose. Using multiple real-world datasets, we verify that our proposed method outperforms both standard BO and multi-objective BO algorithms on a variety of sequence prediction tasks.

Related papers

Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives. We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
arXiv Detail & Related papers (2024-10-29T14:41:44Z)
A Survey of Meta-features Used for Automated Selection of Algorithms for Black-box Single-objective Continuous Optimization [4.173197621837912]
We conduct an overview of the key contributions to algorithm selection in the field of single-objective continuous black-box optimization. We study machine learning models for automated algorithm selection, configuration, and performance prediction.
arXiv Detail & Related papers (2024-06-08T11:11:14Z)
Reinforced In-Context Black-Box Optimization [64.25546325063272]
RIBBO is a method to reinforce-learn a BBO algorithm from offline data in an end-to-end fashion. RIBBO employs expressive sequence models to learn the optimization histories produced by multiple behavior algorithms and tasks. Central to our method is to augment the optimization histories with textitregret-to-go tokens, which are designed to represent the performance of an algorithm based on cumulative regret over the future part of the histories.
arXiv Detail & Related papers (2024-02-27T11:32:14Z)
Predictive Modeling through Hyper-Bayesian Optimization [60.586813904500595]
We propose a novel way of integrating model selection and BO for the single goal of reaching the function optima faster. The algorithm moves back and forth between BO in the model space and BO in the function space, where the goodness of the recommended model is captured. In addition to improved sample efficiency, the framework outputs information about the black-box function.
arXiv Detail & Related papers (2023-08-01T04:46:58Z)
DynamoRep: Trajectory-Based Population Dynamics for Classification of Black-box Optimization Problems [0.755972004983746]
We propose a feature extraction method that describes the trajectories of optimization algorithms using simple statistics. We demonstrate that the proposed DynamoRep features capture enough information to identify the problem class on which the optimization algorithm is running.
arXiv Detail & Related papers (2023-06-08T06:57:07Z)
Bayesian Optimization for Macro Placement [48.55456716632735]
We develop a novel approach to macro placement using Bayesian optimization (BO) over sequence pairs. BO is a machine learning technique that uses a probabilistic surrogate model and an acquisition function. We demonstrate our algorithm on the fixed-outline macro placement problem with the half-perimeter wire length objective.
arXiv Detail & Related papers (2022-07-18T06:17:06Z)
RoMA: Robust Model Adaptation for Offline Model-based Optimization [115.02677045518692]
We consider the problem of searching an input maximizing a black-box objective function given a static dataset of input-output queries. A popular approach to solving this problem is maintaining a proxy model that approximates the true objective function. Here, the main challenge is how to avoid adversarially optimized inputs during the search.
arXiv Detail & Related papers (2021-10-27T05:37:12Z)
Meta Learning Black-Box Population-Based Optimizers [0.0]
We propose the use of meta-learning to infer population-based blackbox generalizations. We show that the meta-loss function encourages a learned algorithm to alter its search behavior so that it can easily fit into a new context.
arXiv Detail & Related papers (2021-03-05T08:13:25Z)
Automatic Tuning of Stochastic Gradient Descent with Bayesian Optimisation [8.340191147575307]
We introduce an original probabilistic model for traces of optimisers, based on latent Gaussian processes and an auto-/regressive formulation. It flexibly adjusts to abrupt changes of behaviours induced by new learning rate values. It is well-suited to tackle a set of problems: first, for the on-line adaptation of the learning rate for a cold-started run; then, for tuning the schedule for a set of similar tasks, as well as warm-starting it for a new task.
arXiv Detail & Related papers (2020-06-25T13:18:18Z)
Landscape-Aware Fixed-Budget Performance Regression and Algorithm Selection for Modular CMA-ES Variants [1.0965065178451106]
We show that it is possible to achieve high-quality performance predictions with off-the-shelf supervised learning approaches. We test this approach on a portfolio of very similar algorithms, which we choose from the family of modular CMA-ES algorithms.
arXiv Detail & Related papers (2020-06-17T13:34:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.