FlexiBO: A Decoupled Cost-Aware Multi-Objective Optimization Approach
for Deep Neural Networks
- URL: http://arxiv.org/abs/2001.06588v3
- Date: Sun, 21 Aug 2022 16:32:29 GMT
- Title: FlexiBO: A Decoupled Cost-Aware Multi-Objective Optimization Approach
for Deep Neural Networks
- Authors: Md Shahriar Iqbal, Jianhai Su, Lars Kotthoff, Pooyan Jamshidi
- Abstract summary: We develop a novel multi-objective optimization algorithm, called Flexible Multi-Objective Bayesian Optimization (FlexiBO), to address the differing costs of evaluating different objectives.
We evaluate FlexiBO on seven state-of-the-art DNNs for image recognition, natural language processing (NLP), and speech-to-text translation.
- Score: 4.596221278839825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The design of machine learning systems often requires trading off different
objectives, for example, prediction error and energy consumption for deep
neural networks (DNNs). Typically, no single design performs well in all
objectives; therefore, finding Pareto-optimal designs is of interest. The
search for Pareto-optimal designs involves evaluating designs in an iterative
process, and the measurements are used to evaluate an acquisition function that
guides the search process. However, measuring different objectives incurs
different costs. For example, the cost of measuring the prediction error of
DNNs is orders of magnitude higher than that of measuring the energy
consumption of a pre-trained DNN, as it requires re-training the DNN. Current
state-of-the-art methods do not consider this difference in objective
evaluation cost, potentially incurring expensive evaluations of objective
functions in the optimization process. In this paper, we develop a novel
decoupled and cost-aware multi-objective optimization algorithm, which we call
Flexible Multi-Objective Bayesian Optimization (FlexiBO), to address this issue.
FlexiBO weights the improvement of the hypervolume of the Pareto region by the
measurement cost of each objective to balance the expense of collecting new
information with the knowledge gained through objective evaluations, preventing
us from performing expensive measurements for little to no gain. We evaluate
FlexiBO on seven state-of-the-art DNNs for image recognition, natural language
processing (NLP), and speech-to-text translation. Our results indicate that,
given the same total experimental budget, FlexiBO discovers designs with
4.8% to 12.4% lower hypervolume error than the best method in
state-of-the-art multi-objective optimization.
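The mechanism described in the abstract lends itself to a compact illustration. Below is a minimal sketch of a FlexiBO-style decoupled, cost-aware selection rule, assuming two minimized objectives (e.g., prediction error and energy), per-design predictive means and uncertainties from a surrogate model, and a fixed per-objective measurement cost. The function names (`pareto_region_volume`, `next_measurement`) and the interval-based approximation of the Pareto region are illustrative simplifications, not the paper's exact formulation.

```python
import numpy as np

def pareto_front(points):
    """Non-dominated subset of `points`; both objectives are minimized."""
    pts = np.asarray(points, dtype=float)
    keep = [
        p for i, p in enumerate(pts)
        if not any(np.all(q <= p) and np.any(q < p)
                   for j, q in enumerate(pts) if j != i)
    ]
    return np.array(keep)

def hypervolume_2d(front, ref):
    """Area dominated by a 2-D front relative to reference point `ref`."""
    front = front[np.argsort(front[:, 0])]        # sort by first objective
    hv, prev_f1 = 0.0, ref[1]
    for f0, f1 in front:
        hv += max(ref[0] - f0, 0.0) * max(prev_f1 - f1, 0.0)
        prev_f1 = min(prev_f1, f1)
    return hv

def pareto_region_volume(mu, sigma, observed, ref):
    """Volume between the optimistic (mu - sigma) and pessimistic (mu + sigma)
    Pareto fronts: the surrogate's uncertainty about where the true front lies."""
    optimistic = pareto_front(np.vstack([observed, mu - sigma]))
    pessimistic = pareto_front(np.vstack([observed, mu + sigma]))
    return hypervolume_2d(optimistic, ref) - hypervolume_2d(pessimistic, ref)

def next_measurement(mu, sigma, observed, costs, ref):
    """Decoupled, cost-aware selection: choose the (design, objective) pair
    whose single-objective measurement shrinks the Pareto region the most
    per unit measurement cost.

    mu, sigma : (n, 2) predictive means / uncertainties for candidate designs
    observed  : (m, 2) fully measured objective vectors
    costs     : (2,) cost of measuring each objective, e.g. error >> energy
    """
    before = pareto_region_volume(mu, sigma, observed, ref)
    best, best_score = None, -np.inf
    for i in range(len(mu)):
        for j in range(2):
            if sigma[i, j] == 0.0:
                continue                  # this objective is already known
            trial = sigma.copy()
            trial[i, j] = 0.0             # pretend objective j were measured
            gain = before - pareto_region_volume(mu, trial, observed, ref)
            score = gain / costs[j]       # cost-weighted improvement
            if score > best_score:
                best, best_score = (i, j), score
    return best                           # (design index, objective index)
```

Because each candidate gain is divided by the corresponding objective's measurement cost, an expensive objective (such as prediction error, which requires re-training) is measured only when the expected information about the Pareto front outweighs its cost.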
Related papers
- Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion [53.33473557562837]
Solving multi-objective optimization problems for large deep neural networks is challenging due to the complexity of the loss landscape and the high computational cost.
We propose a practical and scalable approach to solve this problem via mixture of experts (MoE) based model fusion.
By ensembling the weights of specialized single-task models, the MoE module can effectively capture the trade-offs between multiple objectives.
arXiv Detail & Related papers (2024-06-14T07:16:18Z) - Cost-Sensitive Multi-Fidelity Bayesian Optimization with Transfer of Learning Curve Extrapolation [55.75188191403343]
We introduce a utility function, predefined by each user, that describes the trade-off between the cost and performance of BO.
We validate our algorithm on various learning-curve (LC) datasets and find that it outperforms all previous multi-fidelity BO and transfer-BO baselines we consider.
arXiv Detail & Related papers (2024-05-28T07:38:39Z) - End-to-End Learning for Fair Multiobjective Optimization Under
Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty (a minimal OWA example appears after this list).
arXiv Detail & Related papers (2024-02-12T16:33:35Z) - Design Amortization for Bayesian Optimal Experimental Design [70.13948372218849]
We build on successful variational approaches, which optimize a parameterized variational model with respect to bounds on the expected information gain (EIG).
We present a novel neural architecture that allows experimenters to optimize a single variational model that can estimate the EIG for potentially infinitely many designs.
arXiv Detail & Related papers (2022-10-07T02:12:34Z) - Bayesian Optimization Over Iterative Learners with Structured Responses:
A Budget-aware Planning Approach [31.918476422203412]
This paper proposes a novel approach referred to as Budget-Aware Planning for Iterative learners (BAPI) to solve HPO problems under a constrained cost budget.
Experiments on diverse HPO benchmarks for iterative learners show that BAPI performs better than state-of-the-art baselines in most cases.
arXiv Detail & Related papers (2022-06-25T18:44:06Z) - Cost-Effective Federated Learning in Mobile Edge Networks [37.16466118235272]
Federated learning (FL) is a distributed learning paradigm that enables a large number of mobile devices to collaboratively learn a model without sharing their raw data.
We analyze how to design adaptive FL in mobile edge networks that optimally chooses essential control variables to minimize the total cost.
We develop a low-cost sampling-based algorithm to learn the convergence related unknown parameters.
arXiv Detail & Related papers (2021-09-12T03:02:24Z) - Conservative Objective Models for Effective Offline Model-Based
Optimization [78.19085445065845]
Computational design problems arise in a number of settings, from synthetic biology to computer architectures.
We propose a method that learns a model of the objective function that lower bounds the actual value of the ground-truth objective on out-of-distribution inputs.
Conservative objective models (COMs) are simple to implement and outperform a number of existing methods on a wide range of model-based optimization (MBO) problems (a sketch of the conservatism penalty appears after this list).
arXiv Detail & Related papers (2021-07-14T17:55:28Z) - Multi-Fidelity Multi-Objective Bayesian Optimization: An Output Space
Entropy Search Approach [44.25245545568633]
We study the novel problem of blackbox optimization of multiple objectives via multi-fidelity function evaluations.
Our experiments on several synthetic and real-world benchmark problems show that MF-OSEMO, with both approximations, significantly improves over the state-of-the-art single-fidelity algorithms.
arXiv Detail & Related papers (2020-11-02T06:59:04Z) - Information-Theoretic Multi-Objective Bayesian Optimization with
Continuous Approximations [44.25245545568633]
We propose Information-Theoretic Multi-Objective Bayesian Optimization with Continuous Approximations (iMOCA) to solve this problem.
Our experiments on diverse synthetic and real-world benchmarks show that iMOCA significantly improves over existing single-fidelity methods.
arXiv Detail & Related papers (2020-09-12T01:46:03Z) - Multi-Fidelity Bayesian Optimization via Deep Neural Networks [19.699020509495437]
In many applications, the objective function can be evaluated at multiple fidelities to enable a trade-off between cost and accuracy.
We propose Deep Neural Network Multi-Fidelity Bayesian Optimization (DNN-MFBO) that can flexibly capture all kinds of complicated relationships between the fidelities.
We show the advantages of our method in both synthetic benchmark datasets and real-world applications in engineering design.
arXiv Detail & Related papers (2020-07-06T23:28:40Z) - Self-Directed Online Machine Learning for Topology Optimization [58.920693413667216]
Self-directed Online Learning Optimization integrates a Deep Neural Network (DNN) with Finite Element Method (FEM) calculations.
Our algorithm was tested by four types of problems including compliance minimization, fluid-structure optimization, heat transfer enhancement and truss optimization.
It reduced the computational time by 2 to 5 orders of magnitude compared with directly using conventional methods, and outperformed all state-of-the-art algorithms tested in our experiments.
arXiv Detail & Related papers (2020-02-04T20:00:28Z)
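Two of the entries above describe mechanisms concrete enough for a short illustration. First, the fair-PtO entry builds on Ordered Weighted Averaging (OWA): a weighted sum over the sorted objective values, so weights attach to ranks rather than to fixed objectives, and decreasing weights emphasize the worst outcomes. A minimal sketch of the standard OWA definition follows; the weights and example values are illustrative, not from the paper.

```python
import numpy as np

def owa(values, weights):
    """Ordered Weighted Average: dot(weights, values sorted descending).
    Weights should be nonnegative and sum to 1."""
    ordered = np.sort(np.asarray(values, dtype=float))[::-1]
    return float(np.dot(weights, ordered))

# Decreasing weights put the most mass on the largest (worst) cost,
# which is what makes OWA aggregation fairness-oriented.
costs = [3.0, 9.0, 5.0]
weights = np.array([0.5, 0.3, 0.2])
print(owa(costs, weights))  # 0.5*9 + 0.3*5 + 0.2*3 = 6.6
```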
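Second, the Conservative Objective Models (COMs) entry: the surrogate is trained to lower-bound the true objective off-distribution by penalizing its own predictions at inputs found by gradient ascent on the surrogate itself. Below is a minimal PyTorch sketch of that conservatism penalty, assuming a differentiable regression model; the ascent schedule and the penalty weight `alpha` are illustrative assumptions, not the paper's settings.

```python
import torch

def adversarial_inputs(model, x, steps=10, lr=0.05):
    """Ascend the learned model from dataset points toward inputs where it is
    likely over-optimistic (out-of-distribution optima). Illustrative schedule."""
    x_adv = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        (grad,) = torch.autograd.grad(model(x_adv).sum(), x_adv)
        x_adv = (x_adv + lr * grad).detach().requires_grad_(True)
    return x_adv.detach()

def com_loss(model, x, y, alpha=0.5):
    """Regression loss plus a conservatism term that pushes predictions down
    on adversarially found inputs relative to the dataset inputs."""
    mse = torch.nn.functional.mse_loss(model(x).squeeze(-1), y)
    x_adv = adversarial_inputs(model, x)
    return mse + alpha * (model(x_adv).mean() - model(x).mean())
```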
This list is automatically generated from the titles and abstracts of the papers on this site.