Neural Model-based Optimization with Right-Censored Observations
- URL: http://arxiv.org/abs/2009.13828v1
- Date: Tue, 29 Sep 2020 07:32:30 GMT
- Title: Neural Model-based Optimization with Right-Censored Observations
- Authors: Katharina Eggensperger, Kai Haase, Philipp Müller, Marius Lindauer and Frank Hutter
- Abstract summary: Neural networks (NNs) have been demonstrated to work well at the core of model-based optimization procedures.
We show that our trained regression models achieve a better predictive quality than several baselines.
- Score: 42.530925002607376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many fields of study, we only observe lower bounds on the true response
value of some experiments. When fitting a regression model to predict the
distribution of the outcomes, we cannot simply drop these right-censored
observations, but need to properly model them. In this work, we focus on the
concept of censored data in the light of model-based optimization where
prematurely terminating evaluations (and thus generating right-censored data)
is a key factor for efficiency, e.g., when searching for an algorithm
configuration that minimizes runtime of the algorithm at hand. Neural networks
(NNs) have been demonstrated to work well at the core of model-based
optimization procedures and here we extend them to handle these censored
observations. We propose (i) a loss function based on the Tobit model to
incorporate censored samples into training and (ii) an ensemble of networks to
model the posterior distribution. To nevertheless keep the optimization
overhead low, we propose to use Thompson sampling such that we only need to
train a single NN in each iteration. Our experiments show that our trained
regression models achieve a better predictive quality than several baselines
and that our approach achieves new state-of-the-art performance for model-based
optimization on two optimization problems: minimizing the solution time of a
SAT solver and the time-to-accuracy of neural networks.
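The Tobit loss mentioned in the abstract can be sketched as a censored-Gaussian negative log-likelihood. The following is a minimal illustration, not the authors' implementation: the function name `tobit_nll` and the NumPy/SciPy formulation are my own. Uncensored points contribute the usual Gaussian log-density, while right-censored points contribute the log-probability of the response exceeding the observed lower bound.

```python
import numpy as np
from scipy.stats import norm

def tobit_nll(mu, sigma, y, censored):
    """Tobit (censored Gaussian) negative log-likelihood.

    mu, sigma : predicted mean and standard deviation per observation
    y         : observed response, or the lower bound for censored points
    censored  : boolean mask, True where y is only a lower bound
    """
    # Uncensored: ordinary Gaussian log-density of the observed value.
    ll_obs = norm.logpdf(y, loc=mu, scale=sigma)
    # Right-censored: log P(Y > y), the mass above the observed bound.
    ll_cens = norm.logsf(y, loc=mu, scale=sigma)
    return -np.sum(np.where(censored, ll_cens, ll_obs))
```

Training a network whose head predicts (mu, sigma) against such a loss lets prematurely terminated runs still inform the fit: a larger predicted mean makes a censored lower bound more probable and thus lowers the loss.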
Related papers
- Diffusion Models as Network Optimizers: Explorations and Analysis [71.69869025878856]
Generative diffusion models (GDMs) have emerged as a promising new approach to network optimization.
In this study, we first explore the intrinsic characteristics of generative models.
We provide a concise theoretical and intuitive demonstration of the advantages of generative models over discriminative network optimization.
arXiv Detail & Related papers (2024-11-01T09:05:47Z)
- The Unreasonable Effectiveness of Solving Inverse Problems with Neural Networks [24.766470360665647]
We show that neural networks trained to learn solutions to inverse problems can find better solutions than classical methods even on their training set.
Our findings suggest an alternative use for neural networks: rather than generalizing to new data for fast inference, they can also be used to find better solutions on known data.
arXiv Detail & Related papers (2024-08-15T12:38:10Z)
- Explicit Foundation Model Optimization with Self-Attentive Feed-Forward Neural Units [4.807347156077897]
Iterative approximation methods using backpropagation enable the optimization of neural networks, but they remain computationally expensive when used at scale.
This paper presents an efficient alternative for optimizing neural networks that reduces the costs of scaling neural networks and provides high-efficiency optimizations for low-resource applications.
arXiv Detail & Related papers (2023-11-13T17:55:07Z)
- Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL).
We first prove that a gradient of synthetic samples with respect to an SSL objective in naive bilevel optimization is biased due to randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z)
- Adaptive Sparse Gaussian Process [0.0]
We propose the first adaptive sparse Gaussian Process (GP) able to address all these issues.
We first reformulate a variational sparse GP algorithm to make it adaptive through a forgetting factor.
We then propose updating a single inducing point of the sparse GP model together with the remaining model parameters every time a new sample arrives.
arXiv Detail & Related papers (2023-02-20T21:34:36Z)
- Censored Quantile Regression Neural Networks [24.118509578363593]
This paper considers doing quantile regression on censored data using neural networks (NNs).
We show how an algorithm popular in linear models can be applied to NNs.
Our major contribution is a novel algorithm that simultaneously optimises a grid of quantiles output by a single NN.
arXiv Detail & Related papers (2022-05-26T17:10:28Z)
- RoMA: Robust Model Adaptation for Offline Model-based Optimization [115.02677045518692]
We consider the problem of searching for an input that maximizes a black-box objective function given a static dataset of input-output queries.
A popular approach to solving this problem is maintaining a proxy model that approximates the true objective function.
Here, the main challenge is how to avoid adversarially optimized inputs during the search.
arXiv Detail & Related papers (2021-10-27T05:37:12Z)
- MEMO: Test Time Robustness via Adaptation and Augmentation [131.28104376280197]
We study the problem of test time robustification, i.e., using the test input to improve model robustness.
Recent prior works have proposed methods for test time adaptation; however, each introduces additional assumptions.
We propose a simple approach that can be used in any test setting where the model is probabilistic and adaptable.
arXiv Detail & Related papers (2021-10-18T17:55:11Z)
- Bayes DistNet -- A Robust Neural Network for Algorithm Runtime Distribution Predictions [1.8275108630751844]
Randomized algorithms are used in many state-of-the-art solvers for constraint satisfaction problems (CSP) and Boolean satisfiability (SAT) problems.
Previous state-of-the-art methods directly try to predict a fixed parametric distribution that the input instance follows.
This new model achieves robust predictive performance in the low observation setting, as well as handling censored observations.
arXiv Detail & Related papers (2020-12-14T01:15:39Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.