End-to-End Learning of Deep Kernel Acquisition Functions for Bayesian
Optimization
- URL: http://arxiv.org/abs/2111.00639v1
- Date: Mon, 1 Nov 2021 00:42:31 GMT
- Title: End-to-End Learning of Deep Kernel Acquisition Functions for Bayesian
Optimization
- Authors: Tomoharu Iwata
- Abstract summary: We propose a meta-learning method for Bayesian optimization with neural network-based kernels.
Our model is trained by a reinforcement learning framework from multiple tasks.
In experiments using three text document datasets, we demonstrate that the proposed method achieves better BO performance than the existing methods.
- Score: 39.56814839510978
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For Bayesian optimization (BO) on high-dimensional data with complex
structure, neural network-based kernels for Gaussian processes (GPs) have been
used to learn flexible surrogate functions by the high representation power of
deep learning. However, existing methods train neural networks by maximizing
the marginal likelihood, which do not directly improve the BO performance. In
this paper, we propose a meta-learning method for BO with neural network-based
kernels that minimizes the expected gap between the true optimum value and the
best value found by BO. We model a policy, which takes the current evaluated
data points as input and outputs the next data point to be evaluated, by a
neural network, where neural network-based kernels, GPs, and mutual
information-based acquisition functions are used as its layers. With our model,
the neural network-based kernel is trained to be appropriate for the
acquisition function by backpropagating the gap through the acquisition
function and GP. Our model is trained by a reinforcement learning framework
from multiple tasks. Since the neural network is shared across different tasks,
we can gather knowledge on BO from multiple training tasks, and use the
knowledge for unseen test tasks. In experiments using three text document
datasets, we demonstrate that the proposed method achieves better BO
performance than the existing methods.
Related papers
- Deep Learning and genetic algorithms for cosmological Bayesian inference speed-up [0.0]
We present a novel approach to accelerate the Bayesian inference process, focusing specifically on the nested sampling algorithms.
Our proposed method utilizes the power of deep learning, employing feedforward neural networks to approximate the likelihood function dynamically during the Bayesian inference process.
The implementation integrates with nested sampling algorithms and has been thoroughly evaluated using both simple cosmological dark energy models and diverse observational datasets.
arXiv Detail & Related papers (2024-05-06T09:14:58Z) - Deep Neural Network Classifier for Multi-dimensional Functional Data [4.340040784481499]
We propose a new approach, called as functional deep neural network (FDNN), for classifying multi-dimensional functional data.
Specifically, a deep neural network is trained based on the principle components of the training data which shall be used to predict the class label of a future data function.
arXiv Detail & Related papers (2022-05-17T19:22:48Z) - Incorporating Prior Knowledge into Neural Networks through an Implicit
Composite Kernel [1.6383321867266318]
Implicit Composite Kernel (ICK) is a kernel that combines a kernel implicitly defined by a neural network with a second kernel function chosen to model known properties.
We demonstrate ICK's superior performance and flexibility on both synthetic and real-world data sets.
arXiv Detail & Related papers (2022-05-15T21:32:44Z) - Neural Capacitance: A New Perspective of Neural Network Selection via
Edge Dynamics [85.31710759801705]
Current practice requires expensive computational costs in model training for performance prediction.
We propose a novel framework for neural network selection by analyzing the governing dynamics over synaptic connections (edges) during training.
Our framework is built on the fact that back-propagation during neural network training is equivalent to the dynamical evolution of synaptic connections.
arXiv Detail & Related papers (2022-01-11T20:53:15Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical representation of tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Learning Structures for Deep Neural Networks [99.8331363309895]
We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience.
We show that sparse coding can effectively maximize the entropy of the output signals.
Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
arXiv Detail & Related papers (2021-05-27T12:27:24Z) - Universality and Optimality of Structured Deep Kernel Networks [0.0]
Kernel based methods yield approximation models that are flexible, efficient and powerful.
Recent success of machine learning methods has been driven by deep neural networks (NNs)
In this paper, we show that the use of special types of kernels yield models reminiscent of neural networks.
arXiv Detail & Related papers (2021-05-15T14:10:35Z) - Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction of the Neural Tangent Kernel (NTK) of fully-connected ReLU network.
We show that dimension of the resulting features is much smaller than other baseline feature map constructions to achieve comparable error bounds both in theory and practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z) - Solving Mixed Integer Programs Using Neural Networks [57.683491412480635]
This paper applies learning to the two key sub-tasks of a MIP solver, generating a high-quality joint variable assignment, and bounding the gap in objective value between that assignment and an optimal one.
Our approach constructs two corresponding neural network-based components, Neural Diving and Neural Branching, to use in a base MIP solver such as SCIP.
We evaluate our approach on six diverse real-world datasets, including two Google production datasets and MIPLIB, by training separate neural networks on each.
arXiv Detail & Related papers (2020-12-23T09:33:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.