Neural Model-based Optimization with Right-Censored Observations
- URL: http://arxiv.org/abs/2009.13828v1
- Date: Tue, 29 Sep 2020 07:32:30 GMT
- Title: Neural Model-based Optimization with Right-Censored Observations
- Authors: Katharina Eggensperger, Kai Haase, Philipp Müller, Marius Lindauer and Frank Hutter
- Abstract summary: Neural networks (NNs) have been demonstrated to work well at the core of model-based optimization procedures.
We show that our trained regression models achieve a better predictive quality than several baselines.
- Score: 42.530925002607376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many fields of study, we only observe lower bounds on the true response
value of some experiments. When fitting a regression model to predict the
distribution of the outcomes, we cannot simply drop these right-censored
observations, but need to properly model them. In this work, we focus on the
concept of censored data in the light of model-based optimization where
prematurely terminating evaluations (and thus generating right-censored data)
is a key factor for efficiency, e.g., when searching for an algorithm
configuration that minimizes runtime of the algorithm at hand. Neural networks
(NNs) have been demonstrated to work well at the core of model-based
optimization procedures and here we extend them to handle these censored
observations. We propose (i) a loss function based on the Tobit model to
incorporate censored samples into training and (ii) an ensemble of networks to
model the posterior distribution. To nevertheless keep the optimization
overhead low, we propose to use Thompson sampling such that we only need to
train a single NN in each iteration. Our experiments show that our trained
regression models achieve a better predictive quality than several baselines
and that our approach achieves new state-of-the-art performance for model-based
optimization on two optimization problems: minimizing the solution time of a
SAT solver and the time-to-accuracy of neural networks.
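The Tobit loss mentioned in the abstract can be sketched as a censored-Gaussian negative log-likelihood. The following is a minimal illustration, not the authors' implementation: the function name `tobit_nll` and the NumPy/SciPy formulation are my own. Uncensored points contribute the usual Gaussian log-density, while right-censored points contribute the log-probability of the response exceeding the observed lower bound.

```python
import numpy as np
from scipy.stats import norm

def tobit_nll(mu, sigma, y, censored):
    """Tobit (censored Gaussian) negative log-likelihood.

    mu, sigma : predicted mean and standard deviation per observation
    y         : observed response, or the lower bound for censored points
    censored  : boolean mask, True where y is only a lower bound
    """
    # Uncensored: ordinary Gaussian log-density of the observed value.
    ll_obs = norm.logpdf(y, loc=mu, scale=sigma)
    # Right-censored: log P(Y > y), the mass above the observed bound.
    ll_cens = norm.logsf(y, loc=mu, scale=sigma)
    return -np.sum(np.where(censored, ll_cens, ll_obs))
```

Training a network whose head predicts (mu, sigma) against such a loss lets prematurely terminated runs still inform the fit: a larger predicted mean makes a censored lower bound more probable and thus lowers the loss.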
Related papers
- Diffusion Models as Network Optimizers: Explorations and Analysis [71.69869025878856]
Generative diffusion models (GDMs) have emerged as a promising new approach to network optimization.
In this study, we first explore the intrinsic characteristics of generative models.
We provide a concise theoretical and intuitive demonstration of the advantages of generative models over discriminative network optimization.
arXiv Detail & Related papers (2024-11-01T09:05:47Z)
- The Unreasonable Effectiveness of Solving Inverse Problems with Neural Networks [24.766470360665647]
We show that neural networks trained to learn solutions to inverse problems can find better solutions than classical methods even on their training set.
Our findings suggest an alternative use for neural networks: rather than generalizing to new data for fast inference, they can also be used to find better solutions on known data.
arXiv Detail & Related papers (2024-08-15T12:38:10Z)
- Explicit Foundation Model Optimization with Self-Attentive Feed-Forward Neural Units [4.807347156077897]
Iterative approximation methods using backpropagation enable the optimization of neural networks, but they remain computationally expensive when used at scale.
This paper presents an efficient alternative for optimizing neural networks that reduces the costs of scaling neural networks and provides high-efficiency optimizations for low-resource applications.
arXiv Detail & Related papers (2023-11-13T17:55:07Z)
- Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL).
We first prove that a gradient of synthetic samples with respect to an SSL objective in naive bilevel optimization is biased due to randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z)
- Adaptive Sparse Gaussian Process [0.0]
We propose the first adaptive sparse Gaussian Process (GP) able to address all these issues.
We first reformulate a variational sparse GP algorithm to make it adaptive through a forgetting factor.
We then propose updating a single inducing point of the sparse GP model together with the remaining model parameters every time a new sample arrives.
arXiv Detail & Related papers (2023-02-20T21:34:36Z)
- Censored Quantile Regression Neural Networks [24.118509578363593]
This paper considers doing quantile regression on censored data using neural networks (NNs).
We show how an algorithm popular in linear models can be applied to NNs.
Our major contribution is a novel algorithm that simultaneously optimises a grid of quantiles output by a single NN.
arXiv Detail & Related papers (2022-05-26T17:10:28Z)
- RoMA: Robust Model Adaptation for Offline Model-based Optimization [115.02677045518692]
We consider the problem of searching for an input that maximizes a black-box objective function given a static dataset of input-output queries.
A popular approach to solving this problem is maintaining a proxy model that approximates the true objective function.
Here, the main challenge is how to avoid adversarially optimized inputs during the search.
arXiv Detail & Related papers (2021-10-27T05:37:12Z)
- MEMO: Test Time Robustness via Adaptation and Augmentation [131.28104376280197]
We study the problem of test time robustification, i.e., using the test input to improve model robustness.
Recent prior works have proposed methods for test time adaptation; however, each introduces additional assumptions.
We propose a simple approach that can be used in any test setting where the model is probabilistic and adaptable.
arXiv Detail & Related papers (2021-10-18T17:55:11Z)
- Bayes DistNet -- A Robust Neural Network for Algorithm Runtime Distribution Predictions [1.8275108630751844]
Randomized algorithms are used in many state-of-the-art solvers for constraint satisfaction problems (CSP) and Boolean satisfiability (SAT) problems.
Previous state-of-the-art methods directly try to predict a fixed parametric distribution that the input instance follows.
This new model achieves robust predictive performance in the low observation setting, as well as handling censored observations.
arXiv Detail & Related papers (2020-12-14T01:15:39Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.