BINGO! Simple Optimizers Win Big if Problems Collapse to a Few Buckets
- URL: http://arxiv.org/abs/2506.04509v1
- Date: Wed, 04 Jun 2025 23:13:58 GMT
- Title: BINGO! Simple Optimizers Win Big if Problems Collapse to a Few Buckets
- Authors: Kishan Kumar Ganguly, Tim Menzies,
- Abstract summary: Software engineering (SE) can be slow and complex.<n>This paper introduces the BINGO effect: a novel phenomenon where SE data collapses into a tiny fraction of possible solution "buckets"<n>We show the BINGO effect's prevalence across 39 optimization in SE problems.<n>Exploiting this, we optimize 10,000 times faster than state-of-the-art methods, with comparable effectiveness.
- Score: 12.069139819456861
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional multi-objective optimization in software engineering (SE) can be slow and complex. This paper introduces the BINGO effect: a novel phenomenon where SE data surprisingly collapses into a tiny fraction of possible solution "buckets" (e.g., only 100 used from 4,096 expected). We show the BINGO effect's prevalence across 39 optimization in SE problems. Exploiting this, we optimize 10,000 times faster than state-of-the-art methods, with comparable effectiveness. Our new algorithms (LITE and LINE), demonstrate that simple stochastic selection can match complex optimizers like DEHB. This work explains why simple methods succeed in SE-real data occupies a small corner of possibilities-and guides when to apply them, challenging the need for CPU-heavy optimization. Our data and code are public at GitHub (see anon-artifacts/bingo).
Related papers
- How Low Can You Go? The Data-Light SE Challenge [4.282746516699565]
Much of software engineering (SE) research assumes that progress depends on massive datasets and CPU-intensives.<n>Counter-evidence presented in this paper suggests otherwise, including software configuration and performance tuning, cloud and systems optimization, project and process-level decision modeling, behavioral analytics, financial risk modeling, project health prediction, reinforcement learning tasks, sales forecasting, and software testing.<n>Our results highlight that some SE tasks may be better served by lightweight approaches that demand fewer labels and far less computation.
arXiv Detail & Related papers (2025-12-15T16:49:50Z) - Large Language Models as Optimizers [106.52386531624532]
We propose Optimization by PROmpting (OPRO), a simple and effective approach to leverage large language models (LLMs) as prompts.
In each optimization step, the LLM generates new solutions from the prompt that contains previously generated solutions with their values.
We demonstrate that the best prompts optimized by OPRO outperform human-designed prompts by up to 8% on GSM8K, and by up to 50% on Big-Bench Hard tasks.
arXiv Detail & Related papers (2023-09-07T00:07:15Z) - Large-Batch, Iteration-Efficient Neural Bayesian Design Optimization [37.339567743948955]
We present a novel Bayesian optimization framework specifically tailored to address the limitations of BO.
Our key contribution is a highly scalable, sample-based acquisition function that performs a non-dominated sorting of objectives.
We show that our acquisition function in combination with different Bayesian neural network surrogates is effective in data-intensive environments with a minimal number of iterations.
arXiv Detail & Related papers (2023-06-01T19:10:57Z) - VeLO: Training Versatile Learned Optimizers by Scaling Up [67.90237498659397]
We leverage the same scaling approach behind the success of deep learning to learn versatiles.
We train an ingest for deep learning which is itself a small neural network that ingests and outputs parameter updates.
We open source our learned, meta-training code, the associated train test data, and an extensive benchmark suite with baselines at velo-code.io.
arXiv Detail & Related papers (2022-11-17T18:39:07Z) - Fast Bayesian Optimization of Needle-in-a-Haystack Problems using
Zooming Memory-Based Initialization [73.96101108943986]
A Needle-in-a-Haystack problem arises when there is an extreme imbalance of optimum conditions relative to the size of the dataset.
We present a Zooming Memory-Based Initialization algorithm that builds on conventional Bayesian optimization principles.
arXiv Detail & Related papers (2022-08-26T23:57:41Z) - Bayesian Algorithm Execution: Estimating Computable Properties of
Black-box Functions Using Mutual Information [78.78486761923855]
In many real world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations.
We present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output.
On these problems, InfoBAX uses up to 500 times fewer queries to f than required by the original algorithm.
arXiv Detail & Related papers (2021-04-19T17:22:11Z) - Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z) - Mixing ADAM and SGD: a Combined Optimization Method [0.9569316316728905]
We propose a new type of optimization method called MAS (Mixing ADAM and SGD)
Rather than trying to improve SGD or ADAM we exploit both at the same time by taking the best of both.
We have conducted several experiments on images and text document classification, using various CNNs, and we demonstrated by experiments that the proposed MAS produces better performance than the single SGD or ADAMs.
arXiv Detail & Related papers (2020-11-16T15:48:38Z) - Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far could hardly be used in large scale problems.
Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections.
Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
arXiv Detail & Related papers (2020-06-18T08:16:25Z) - Gumbel-softmax-based Optimization: A Simple General Framework for
Optimization Problems on Graphs [5.486093983007419]
We propose a simple, fast, and general algorithm framework based on advanced automatic differentiation technique empowered by deep learning frameworks.
High-quality solutions can be obtained with much less time consuming compared to traditional approaches.
arXiv Detail & Related papers (2020-04-14T14:11:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.