Related papers: Data-driven Algorithm Design

Data-driven Algorithm Design

URL: http://arxiv.org/abs/2011.07177v1
Date: Sat, 14 Nov 2020 00:51:57 GMT
Title: Data-driven Algorithm Design
Authors: Maria-Florina Balcan
Abstract summary: Data driven algorithm design is an important aspect of modern data science and algorithm design. A small tweak to the parameters can cause a cascade of changes in the algorithm's behavior. We provide strong computational and statistical performance guarantees for batch and online scenarios.
Score: 21.39493074700162
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Data driven algorithm design is an important aspect of modern data science and algorithm design. Rather than using off the shelf algorithms that only have worst case performance guarantees, practitioners often optimize over large families of parametrized algorithms and tune the parameters of these algorithms using a training set of problem instances from their domain to determine a configuration with high expected performance over future instances. However, most of this work comes with no performance guarantees. The challenge is that for many combinatorial problems of significant importance including partitioning, subset selection, and alignment problems, a small tweak to the parameters can cause a cascade of changes in the algorithm's behavior, so the algorithm's performance is a discontinuous function of its parameters. In this chapter, we survey recent work that helps put data-driven combinatorial algorithm design on firm foundations. We provide strong computational and statistical performance guarantees, both for the batch and online scenarios where a collection of typical problem instances from the given application are presented either all at once or in an online fashion, respectively.

Related papers

Feature-based Evolutionary Diversity Optimization of Discriminating Instances for Chance-constrained Optimization Problems [9.617143859697322]
We evolve benchmarking instances for chance-constrained optimization problems that contain components characterized by their expected values and variances. Our method successfully generates diverse instances based on different features while effectively distinguishing the performance between a pair of algorithms.
arXiv Detail & Related papers (2025-01-24T06:55:54Z)
Benchmarking Randomized Optimization Algorithms on Binary, Permutation, and Combinatorial Problem Landscapes [8.337399973715396]
We evaluate the performance of four randomized optimization algorithms: Randomized Hill Climbing (RHC), Simulated Annealing (SA), Genetic Algorithms (GA), and MIMIC. We compare these algorithms using a set of benchmark fitness functions that highlight the specific challenges and requirements of each problem category. Our study analyzes each algorithm's effectiveness based on key performance metrics, including solution quality, convergence speed, computational cost, and robustness.
arXiv Detail & Related papers (2025-01-21T23:13:01Z)
Equitable and Fair Performance Evaluation of Whale Optimization Algorithm [4.0814527055582746]
It is essential that all algorithms are exhaustively, somewhat, and intelligently evaluated. evaluating the effectiveness of optimization algorithms equitably and fairly is not an easy process for various reasons.
arXiv Detail & Related papers (2023-09-04T06:32:02Z)
Minimalistic Predictions to Schedule Jobs with Online Precedence Constraints [117.8317521974783]
We consider non-clairvoyant scheduling with online precedence constraints. An algorithm is oblivious to any job dependencies and learns about a job only if all of its predecessors have been completed.
arXiv Detail & Related papers (2023-01-30T13:17:15Z)
Accelerating ERM for data-driven algorithm design using output-sensitive techniques [26.32088674030797]
We study techniques to develop efficient learning algorithms for data-driven algorithm design. Our approach involves two novel ingredients -- an output-sensitive algorithm for enumerating polytopes induced by a set of hyperplanes. We illustrate our techniques by giving algorithms for pricing problems, linkage-based clustering and dynamic-programming based sequence alignment.
arXiv Detail & Related papers (2022-04-07T17:27:18Z)
Learning to Hash Robustly, with Guarantees [79.68057056103014]
In this paper, we design an NNS algorithm for the Hamming space that has worst-case guarantees essentially matching that of theoretical algorithms. We evaluate the algorithm's ability to optimize for a given dataset both theoretically and practically. Our algorithm has a 1.8x and 2.1x better recall on the worst-performing queries to the MNIST and ImageNet datasets.
arXiv Detail & Related papers (2021-08-11T20:21:30Z)
A Two-stage Framework and Reinforcement Learning-based Optimization Algorithms for Complex Scheduling Problems [54.61091936472494]
We develop a two-stage framework, in which reinforcement learning (RL) and traditional operations research (OR) algorithms are combined together. The scheduling problem is solved in two stages, including a finite Markov decision process (MDP) and a mixed-integer programming process, respectively. Results show that the proposed algorithms could stably and efficiently obtain satisfactory scheduling schemes for agile Earth observation satellite scheduling problems.
arXiv Detail & Related papers (2021-03-10T03:16:12Z)
Leveraging Benchmarking Data for Informed One-Shot Dynamic Algorithm Selection [0.9281671380673306]
A key challenge in the application of evolutionary algorithms is the selection of an algorithm instance that best suits the problem at hand. We analyze in this work how such prior performance data can be used to infer informed dynamic algorithm selection schemes for the solution of pseudo-Boolean optimization problems.
arXiv Detail & Related papers (2021-02-12T12:27:02Z)
Particle Swarm Optimization: Fundamental Study and its Application to Optimization and to Jetty Scheduling Problems [0.0]
The advantages of evolutionary algorithms with respect to traditional methods have been greatly discussed in the literature. While particle swarms share such advantages, they outperform evolutionary algorithms in that they require lower computational cost and easier implementation. This paper does not intend to study their tuning, general-purpose settings are taken from previous studies, and virtually the same algorithm is used to optimize a variety of notably different problems.
arXiv Detail & Related papers (2021-01-25T02:06:30Z)
Benchmarking Simulation-Based Inference [5.3898004059026325]
Recent advances in probabilistic modelling have led to a large number of simulation-based inference algorithms which do not require numerical evaluation of likelihoods. We provide a benchmark with inference tasks and suitable performance metrics, with an initial selection of algorithms. We found that the choice of performance metric is critical, that even state-of-the-art algorithms have substantial room for improvement, and that sequential estimation improves sample efficiency.
arXiv Detail & Related papers (2021-01-12T18:31:22Z)
Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models. The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning. We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
Refined bounds for algorithm configuration: The knife-edge of dual class approximability [94.83809668933021]
We investigate how large should a training set be to ensure that a parameter's average metrics performance over the training set is close to its expected, future performance. We show that if this approximation holds under the L-infinity norm, we can provide strong sample complexity bounds. We empirically evaluate our bounds in the context of integer programming, one of the most powerful tools in computer science.
arXiv Detail & Related papers (2020-06-21T15:32:21Z)
Extreme Algorithm Selection With Dyadic Feature Representation [78.13985819417974]
We propose the setting of extreme algorithm selection (XAS) where we consider fixed sets of thousands of candidate algorithms. We assess the applicability of state-of-the-art AS techniques to the XAS setting and propose approaches leveraging a dyadic feature representation.
arXiv Detail & Related papers (2020-01-29T09:40:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.