Decreasing the Computing Time of Bayesian Optimization using
Generalizable Memory Pruning
- URL: http://arxiv.org/abs/2309.04510v1
- Date: Fri, 8 Sep 2023 14:05:56 GMT
- Title: Decreasing the Computing Time of Bayesian Optimization using
Generalizable Memory Pruning
- Authors: Alexander E. Siemenn, Tonio Buonassisi
- Abstract summary: We show a wrapper of memory pruning and bounded optimization capable of being used with any surrogate model and acquisition function.
Running BO on high-dimensional or massive data sets becomes intractable due to the surrogate model's polynomial time complexity.
All model implementations are run on the MIT Supercloud state-of-the-art computing hardware.
- Score: 56.334116591082896
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian optimization (BO) suffers from long computing times when processing
high-dimensional or large data sets. These long computing times are a result
of the Gaussian process surrogate model having a polynomial time complexity
with the number of experiments. Running BO on high-dimensional or massive data
sets becomes intractable due to this time complexity scaling, in turn,
hindering experimentation. Alternative surrogate models have been developed to
reduce the computing utilization of the BO procedure; however, these methods
require mathematical alteration of the inherent surrogate function, pigeonholing
their use to that specific function. In this paper, we demonstrate a generalizable BO
wrapper of memory pruning and bounded optimization, capable of being used with
any surrogate model and acquisition function. Using this memory pruning
approach, we show a decrease in wall-clock computing times per experiment of BO
from a polynomially increasing pattern to a sawtooth pattern that has a
non-increasing trend without sacrificing convergence performance. Furthermore,
we illustrate the generalizability of the approach across two unique data sets,
two unique surrogate models, and four unique acquisition functions. All model
implementations are run on the MIT Supercloud state-of-the-art computing
hardware.
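A minimal sketch of what such a wrapper can look like is given below. This is not the authors' implementation: it assumes a scikit-learn GaussianProcessRegressor surrogate (whose exact fit scales roughly as O(n^3) in the number of stored experiments, which is the polynomial scaling the abstract refers to), an expected-improvement acquisition function, and a deliberately simple pruning rule that keeps only the most recent max_memory observations before each refit. The names prune_memory, memory_pruned_bo, and the max_memory parameter are illustrative, not taken from the paper.

```python
# Illustrative sketch of a memory-pruned BO loop (not the paper's code).
# Assumptions: scikit-learn GP surrogate, expected-improvement acquisition,
# and a simple "keep the last max_memory points" pruning rule.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor


def expected_improvement(mu, sigma, best_y, xi=0.01):
    """Expected improvement for minimization."""
    sigma = np.maximum(sigma, 1e-9)
    imp = best_y - mu - xi
    z = imp / sigma
    return imp * norm.cdf(z) + sigma * norm.pdf(z)


def prune_memory(X, y, max_memory):
    """Keep only the most recent max_memory experiments so the GP fit cost
    stays bounded instead of growing polynomially with every new point."""
    if len(y) <= max_memory:
        return X, y
    return X[-max_memory:], y[-max_memory:]


def memory_pruned_bo(objective, bounds, n_iter=50, max_memory=64, n_candidates=2048, seed=0):
    """Bounded-memory BO loop: prune, refit surrogate, maximize acquisition, evaluate."""
    rng = np.random.default_rng(seed)
    dim = bounds.shape[0]
    X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(5, dim))  # initial random experiments
    y = np.array([objective(x) for x in X])
    for _ in range(n_iter):
        X_fit, y_fit = prune_memory(X, y, max_memory)
        gp = GaussianProcessRegressor(normalize_y=True).fit(X_fit, y_fit)
        cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_candidates, dim))
        mu, sigma = gp.predict(cand, return_std=True)
        x_next = cand[np.argmax(expected_improvement(mu, sigma, y.min()))]
        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next))
    return X[np.argmin(y)], y.min()


if __name__ == "__main__":
    # Toy usage: minimize a 2-D quadratic over [-5, 5]^2.
    bounds = np.array([[-5.0, 5.0], [-5.0, 5.0]])
    x_best, y_best = memory_pruned_bo(lambda x: float(np.sum(x ** 2)), bounds)
    print(x_best, y_best)
```

Because each refit sees at most max_memory points, the fit cost stops growing once the memory bound is reached; in the paper's wrapper, pruning the stored experiments is what yields the sawtooth, non-increasing wall-clock pattern described in the abstract. How points are chosen for removal (recency, information content, distance to the incumbent, etc.) is the design choice the wrapper leaves open, and the recency rule above is only one possibility.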
Related papers
- Optimizing VarLiNGAM for Scalable and Efficient Time Series Causal Discovery [5.430532390358285]
Causal discovery is designed to identify causal relationships in data.
Time series causal discovery is particularly challenging due to the need to account for temporal dependencies and potential time lag effects.
This study significantly improves the feasibility of processing large datasets.
arXiv Detail & Related papers (2024-09-09T10:52:58Z)
- SCORE: A 1D Reparameterization Technique to Break Bayesian Optimization's Curse of Dimensionality [0.0]
A 1D reparametrization trick is proposed to break this curse and sustain linear time complexity for BO in high-dimensional landscapes.
This fast and scalable approach named SCORE can successfully find the global minimum of needle-in-a-haystack optimization functions.
arXiv Detail & Related papers (2024-06-18T14:28:29Z)
- Fast, Scalable, Warm-Start Semidefinite Programming with Spectral Bundling and Sketching [53.91395791840179]
We present Unified Spectral Bundling with Sketching (USBS), a provably correct, fast and scalable algorithm for solving massive SDPs.
USBS provides a 500x speed-up over the state-of-the-art scalable SDP solver on an instance with over 2 billion decision variables.
arXiv Detail & Related papers (2023-12-19T02:27:22Z)
- An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute the model's covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z)
- CORE: Common Random Reconstruction for Distributed Optimization with Provable Low Communication Complexity [110.50364486645852]
Communication complexity has become a major bottleneck for speeding up training and scaling up the number of machines.
We propose Common Random Reconstruction (CORE), which can be used to compress the information transmitted between machines.
arXiv Detail & Related papers (2023-09-23T08:45:27Z)
- Eva: A General Vectorized Approximation Framework for Second-order Optimization [16.647611352181574]
We present a memory- and time-efficient second-order algorithm named Eva with two novel techniques.
We derive an efficient update formula that avoids explicitly computing matrix inverses by using the Sherman-Morrison formula.
Experiments show that Eva reduces the end-to-end training time by up to 2.05x and 2.42x compared to first-order SGD and second-order algorithms, respectively.
arXiv Detail & Related papers (2023-08-04T03:51:38Z)
- Predictive Modeling through Hyper-Bayesian Optimization [60.586813904500595]
We propose a novel way of integrating model selection and BO for the single goal of reaching the function optima faster.
The algorithm moves back and forth between BO in the model space and BO in the function space, where the goodness of the recommended model is captured.
In addition to improved sample efficiency, the framework outputs information about the black-box function.
arXiv Detail & Related papers (2023-08-01T04:46:58Z)
- An efficient aggregation method for the symbolic representation of temporal data [0.0]
We present a new variant of the adaptive Brownian bridge-based aggregation (ABBA) method, called fABBA.
This variant utilizes a new aggregation approach tailored to the piecewise representation of time series.
In contrast to the original method, the new approach does not require the number of time series symbols to be specified in advance.
arXiv Detail & Related papers (2022-01-14T22:51:24Z)
- Bayesian Optimization over Permutation Spaces [30.650753803587794]
We propose and evaluate two algorithms for BO over Permutation Spaces (BOPS).
We theoretically analyze the performance of BOPS-T to show that its regret grows sub-linearly.
Our experiments on multiple synthetic and real-world benchmarks show that both BOPS-T and BOPS-H perform better than the state-of-the-art BO algorithm for combinatorial spaces.
arXiv Detail & Related papers (2021-12-02T08:20:50Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method by combining reconstruction error with $l_{2,p}$-norm regularization (the norm is sketched after this entry).
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
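The $l_{2,p}$-norm regularizer referenced in the entry above has a standard definition, reproduced below for reference; the notation (a matrix $W$ with rows $w_i$) is generic and not copied from the paper.

```latex
% Standard l_{2,p}-norm of W in R^{d x k} with rows w_i; for 0 < p <= 1 it
% pushes entire rows of W toward zero, which is what makes it useful as a
% row-sparsity (feature-selection) regularizer alongside reconstruction error.
\|W\|_{2,p} = \left( \sum_{i=1}^{d} \|w_i\|_2^{\,p} \right)^{1/p}
            = \left( \sum_{i=1}^{d} \Big( \sum_{j=1}^{k} w_{ij}^{2} \Big)^{p/2} \right)^{1/p}
```

With $p = 1$ this reduces to the familiar $l_{2,1}$ group-sparsity norm; smaller values of $p$ sparsify the rows of $W$ more aggressively.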