Benchmarking in Optimization: Best Practice and Open Issues
- URL: http://arxiv.org/abs/2007.03488v2
- Date: Wed, 16 Dec 2020 22:36:27 GMT
- Title: Benchmarking in Optimization: Best Practice and Open Issues
- Authors: Thomas Bartz-Beielstein, Carola Doerr, Daan van den Berg, Jakob
Bossek, Sowmya Chandrasekaran, Tome Eftimov, Andreas Fischbach, Pascal
Kerschke, William La Cava, Manuel Lopez-Ibanez, Katherine M. Malan, Jason H.
Moore, Boris Naujoks, Patryk Orzechowski, Vanessa Volz, Markus Wagner, Thomas
Weise
- Abstract summary: This survey compiles ideas and recommendations from more than a dozen researchers with different backgrounds and from different institutes around the world.
The article discusses eight essential topics in benchmarking: clearly stated goals, well-specified problems, suitable algorithms, adequate performance measures, thoughtful analysis, effective and efficient designs, comprehensible presentations, and guaranteed reproducibility.
- Score: 9.710173903804373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This survey compiles ideas and recommendations from more than a dozen
researchers with different backgrounds and from different institutes around the
world. Promoting best practice in benchmarking is its main goal. The article
discusses eight essential topics in benchmarking: clearly stated goals,
well-specified problems, suitable algorithms, adequate performance measures,
thoughtful analysis, effective and efficient designs, comprehensible
presentations, and guaranteed reproducibility. The final goal is to provide
well-accepted guidelines (rules) that might be useful for authors and
reviewers. As benchmarking in optimization is an active and evolving field of
research this manuscript is meant to co-evolve over time by means of periodic
updates.
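Two of the topics above, adequate performance measures and thoughtful analysis, lend themselves to a concrete illustration. The sketch below is not code from the paper: it computes the expected running time (ERT), a standard fixed-target measure in black-box optimization benchmarking, and compares two algorithms with a non-parametric rank test on hypothetical per-run data.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def expected_running_time(evals, success):
    """Expected running time (ERT): total function evaluations spent across
    all runs divided by the number of runs that reached the target."""
    evals = np.asarray(evals, dtype=float)
    success = np.asarray(success, dtype=bool)
    n_success = success.sum()
    return evals.sum() / n_success if n_success > 0 else np.inf

# Hypothetical per-run results for two algorithms on one benchmark problem.
evals_a = [1200, 950, 3000, 1100, 5000]
succ_a  = [True, True, False, True, False]
evals_b = [2000, 2100, 1900, 2500, 2300]
succ_b  = [True, True, True, True, True]

print("ERT A:", expected_running_time(evals_a, succ_a))
print("ERT B:", expected_running_time(evals_b, succ_b))

# Rank-based comparison of the best objective values reached per run
# (lower is better); a non-parametric test avoids normality assumptions.
best_a = [0.012, 0.031, 0.002, 0.045, 0.020]
best_b = [0.015, 0.018, 0.016, 0.022, 0.019]
stat, p = mannwhitneyu(best_a, best_b, alternative="two-sided")
print("Mann-Whitney U p-value:", p)
```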
Related papers
- Optimal or Greedy Decision Trees? Revisiting their Objectives, Tuning, and Performance [9.274054218991528]
Recently there has been a surge of interest in optimal decision tree (ODT) methods that globally optimize accuracy directly.
We identify two relatively unexplored aspects of ODTs: the objective function used in training trees and tuning techniques.
The value of optimal methods is not well understood yet, as the literature provides conflicting results.
arXiv Detail & Related papers (2024-09-19T13:55:29Z) - DOCE: Finding the Sweet Spot for Execution-Based Code Generation [69.5305729627198]
We propose a comprehensive framework that includes candidate generation, $n$-best reranking, minimum Bayes risk (MBR) decoding, and self-debugging as the core components.
Our findings highlight the importance of execution-based methods and the gap between execution-based and execution-free methods.
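Among the components listed above, minimum Bayes risk (MBR) decoding can be sketched generically: select the candidate with the highest expected utility against the other candidates. The snippet below is an illustrative stand-in, not the paper's implementation; the exact-match utility is a placeholder for the execution-based agreement an execution-based setup would use.

```python
from typing import Callable, Sequence

def mbr_select(candidates: Sequence[str],
               utility: Callable[[str, str], float]) -> str:
    """Return the candidate with the highest mean utility against the other
    candidates -- MBR selection under a uniform candidate distribution."""
    def expected_utility(i: int) -> float:
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], c) for c in others) / max(len(others), 1)
    return candidates[max(range(len(candidates)), key=expected_utility)]

# Stand-in utility: exact string agreement. In an execution-based setting this
# would instead compare the outputs produced by running each candidate program.
agreement = lambda a, b: 1.0 if a == b else 0.0

candidates = ["return x + 1", "return x + 1", "return x - 1"]
print(mbr_select(candidates, agreement))   # the consensus candidate is selected
```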
arXiv Detail & Related papers (2024-08-25T07:10:36Z) - Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models [95.96734086126469]
Large language models (LLMs) can serve as assistants that help users accomplish their jobs, and also support the development of advanced applications.
For the wide application of LLMs, inference efficiency is an essential concern and has been widely studied in existing work.
We perform a detailed coarse-to-fine analysis of the inference performance of various code libraries.
arXiv Detail & Related papers (2024-04-17T15:57:50Z) - Localized Zeroth-Order Prompt Optimization [54.964765668688806]
We propose a novel algorithm, namely localized zeroth-order prompt optimization (ZOPO)
ZOPO incorporates a Gaussian process derived from the Neural Tangent Kernel into standard zeroth-order optimization for an efficient search of well-performing local optima in prompt optimization.
Remarkably, ZOPO outperforms existing baselines in terms of both the optimization performance and the query efficiency.
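ZOPO builds on zeroth-order (derivative-free) optimization. The sketch below shows only the generic two-point zeroth-order update it starts from, on a toy objective; it omits the paper's NTK-derived Gaussian process component, and all names and settings are illustrative.

```python
import numpy as np

def zeroth_order_step(f, x, lr=0.05, mu=1e-2, rng=np.random.default_rng(0)):
    """One zeroth-order update: estimate the directional derivative from two
    function evaluations along a random direction, then step against it."""
    u = rng.standard_normal(x.shape)
    grad_est = (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return x - lr * grad_est

# Toy smooth objective standing in for a black-box prompt score (lower is better).
f = lambda x: np.sum((x - 1.0) ** 2)

x = np.zeros(4)
for _ in range(300):
    x = zeroth_order_step(f, x)
print(np.round(x, 2))   # should end up close to the optimum at all-ones
```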
arXiv Detail & Related papers (2024-03-05T14:18:15Z) - A Practical Survey on Zero-shot Prompt Design for In-context Learning [0.0]
Large language models (LLMs) have brought about significant improvements in Natural Language Processing (NLP) tasks.
This paper presents a comprehensive review of in-context learning techniques, focusing on different types of prompts.
We explore various approaches to prompt design, such as manual design, optimization algorithms, and evaluation methods.
arXiv Detail & Related papers (2023-09-22T23:00:34Z) - Robust Prompt Optimization for Large Language Models Against
Distribution Shifts [80.6757997074956]
Large language models (LLMs) have demonstrated significant ability in various Natural Language Processing tasks.
We propose a new problem of robust prompt optimization for LLMs against distribution shifts.
This problem requires that a prompt optimized over a labeled source group also generalize to an unlabeled target group.
arXiv Detail & Related papers (2023-05-23T11:30:43Z) - Benchopt: Reproducible, efficient and collaborative optimization
benchmarks [67.29240500171532]
Benchopt is a framework to automate, reproduce and publish optimization benchmarks in machine learning.
Benchopt simplifies benchmarking for the community by providing an off-the-shelf tool for running, sharing and extending experiments.
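As a rough illustration of the off-the-shelf workflow, here is what a benchopt solver file might look like. The class layout follows benchopt's documented Solver template, but method names and return conventions vary between benchopt versions, so treat this as a sketch to check against the benchopt documentation rather than a definitive example.

```python
# solvers/gradient_descent.py -- hypothetical solver file for a benchopt
# benchmark. The layout follows benchopt's Solver template; method names and
# return conventions are assumptions that may differ across benchopt versions.
import numpy as np
from benchopt import BaseSolver

class Solver(BaseSolver):
    name = "gradient-descent"          # label used in the benchmark results

    def set_objective(self, X, y, lmbd):
        # Receives whatever the benchmark's Objective exposes to solvers
        # (here: data and a ridge penalty, both hypothetical).
        self.X, self.y, self.lmbd = X, y, lmbd

    def run(self, n_iter):
        # benchopt re-runs this with growing budgets to trace convergence.
        n, p = self.X.shape
        beta = np.zeros(p)
        step = 1.0 / (np.linalg.norm(self.X, ord=2) ** 2 / n + self.lmbd)
        for _ in range(n_iter):
            grad = self.X.T @ (self.X @ beta - self.y) / n + self.lmbd * beta
            beta -= step * grad
        self.beta = beta

    def get_result(self):
        return dict(beta=self.beta)    # passed back to the Objective for scoring
```

A benchmark laid out this way would then be executed with the command-line entry point, e.g. `benchopt run ./my_benchmark` (the path is hypothetical).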
arXiv Detail & Related papers (2022-06-27T16:19:24Z) - A Field Guide to Federated Optimization [161.3779046812383]
Federated learning and analytics are distributed approaches for collaboratively learning models (or statistics) from decentralized data.
This paper provides recommendations and guidelines on formulating, designing, evaluating and analyzing federated optimization algorithms.
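The guidelines concern federated optimization algorithms in general; as a concrete reference point, the sketch below implements plain federated averaging (FedAvg), the canonical baseline in this setting, on synthetic least-squares data. It is an illustration, not code from the field guide.

```python
import numpy as np

def local_sgd(w, X, y, lr=0.1, epochs=5):
    """A client's local update: a few full-batch gradient steps on its own
    data for a least-squares model (a stand-in for any local loss)."""
    w = w.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(w_global, clients):
    """One round of federated averaging: each client starts from the global
    model, trains locally, and the server averages the local models weighted
    by each client's number of examples."""
    updates, weights = [], []
    for X, y in clients:
        updates.append(local_sgd(w_global, X, y))
        weights.append(len(y))
    return np.average(np.stack(updates), axis=0, weights=np.asarray(weights, float))

# Hypothetical decentralized data: three clients with differently sized shards.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n in (30, 50, 20):
    X = rng.standard_normal((n, 2))
    clients.append((X, X @ true_w + 0.1 * rng.standard_normal(n)))

w = np.zeros(2)
for _ in range(20):
    w = fedavg_round(w, clients)
print(np.round(w, 2))   # should land near [ 2. -1.]
```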
arXiv Detail & Related papers (2021-07-14T18:09:08Z) - Descending through a Crowded Valley - Benchmarking Deep Learning
Optimizers [29.624308090226375]
In this work, we aim to replace these anecdotes, if not with a conclusive ranking, then at least with evidence-backed anecdotes.
To do so, we perform an extensive, standardized benchmark of fifteen particularly popular deep learning optimizers.
Our open-sourced results are available as challenging and well-tuned baselines for more meaningful evaluations of novel optimization methods.
arXiv Detail & Related papers (2020-07-03T08:19:36Z) - Benchmarking for Metaheuristic Black-Box Optimization: Perspectives and
Open Challenges [0.0]
Research on new optimization algorithms is often funded on the grounds that such algorithms might improve our ability to deal with real-world and industrially relevant challenges.
A large number of test problems and benchmark suites have been developed and used for comparative assessments of algorithms.
arXiv Detail & Related papers (2020-07-01T15:09:40Z) - A Prescription of Methodological Guidelines for Comparing Bio-inspired Optimization Algorithms [15.803264424018488]
We propose methodological guidelines for preparing a sound proposal of a new bio-inspired algorithm.
Results reported by the authors should be shown to constitute a significant advance over previous outcomes.
We expect these guidelines to be useful not only for authors, but also for reviewers and editors along their assessment of new contributions to the field.
arXiv Detail & Related papers (2020-04-19T04:46:45Z)