How to Evaluate Solutions in Pareto-based Search-Based Software
Engineering? A Critical Review and Methodological Guidance
- URL: http://arxiv.org/abs/2002.09040v4
- Date: Sat, 28 Nov 2020 13:23:40 GMT
- Authors: Miqing Li and Tao Chen and Xin Yao
- Abstract summary: This paper reviews studies on quality evaluation for multi-objective optimization in Search-Based SE.
We conduct an in-depth analysis of quality evaluation indicators/methods and general situations in SBSE.
We codify a methodological guidance for selecting and using evaluation methods in different SBSE scenarios.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With modern requirements, there is an increasing tendency to
consider multiple objectives/criteria simultaneously in many Software
Engineering (SE)
scenarios. Such a multi-objective optimization scenario comes with an important
issue -- how to evaluate the outcome of optimization algorithms, which
typically is a set of incomparable solutions (i.e., being Pareto non-dominated
to each other). This issue can be challenging for the SE community,
particularly for practitioners of Search-Based SE (SBSE). On one hand,
multi-objective optimization could still be relatively new to SE/SBSE
researchers, who may not be able to identify the right evaluation methods for
their problems. On the other hand, simply following the evaluation methods for
general multi-objective optimization problems may not be appropriate for
specific SE problems, especially when the problem nature or decision maker's
preferences are explicitly/implicitly available. This has been well echoed in
the literature by various inappropriate/inadequate selection and
inaccurate/misleading use of evaluation methods. In this paper, we first carry
out a systematic and critical review of quality evaluation for multi-objective
optimization in SBSE. We survey 717 papers published between 2009 and 2019 from
36 venues in seven repositories, and select 95 prominent studies, through which
we identify five important but overlooked issues in the area. We then conduct
an in-depth analysis of quality evaluation indicators/methods and general
situations in SBSE, which, together with the identified issues, enables us to
codify a methodological guidance for selecting and using evaluation methods in
different SBSE scenarios.
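The abstract's central difficulty is that an optimization run returns a set of mutually incomparable, Pareto non-dominated solutions. A minimal sketch can make that notion concrete (illustrative only, not the authors' code; the objective vectors are hypothetical, and all objectives are assumed to be minimized):

```python
# Pareto dominance and non-dominated filtering for minimization problems.
# Illustrative sketch; objective tuples below are made-up values.

def dominates(a, b):
    """True if solution a Pareto-dominates b: a is no worse on every
    objective and strictly better on at least one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated_front(solutions):
    """Keep only the solutions not dominated by any other solution."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]

# Example with two objectives, e.g. (cost, execution time)
solutions = [(1, 9), (3, 3), (2, 4), (4, 4), (5, 1)]
front = nondominated_front(solutions)
print(front)  # -> [(1, 9), (3, 3), (2, 4), (5, 1)]  ((4, 4) is dominated by (3, 3))
```

Every solution on the resulting front trades one objective against another, which is exactly why a further quality-evaluation step (indicators such as hypervolume, or preference-based methods) is needed to compare algorithm outcomes.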
Related papers
- Illuminating the Diversity-Fitness Trade-Off in Black-Box Optimization [9.838618121102053]
In real-world applications, users often favor structurally diverse design choices over one high-quality solution.
This paper presents a fresh perspective on this challenge by considering the problem of identifying a fixed number of solutions with a pairwise distance above a specified threshold.
arXiv Detail & Related papers (2024-08-29T09:55:55Z) - On Speeding Up Language Model Evaluation [48.51924035873411]
Development of prompt-based methods with Large Language Models (LLMs) requires making numerous decisions.
We propose a novel method to address this challenge.
We show that it can identify the top-performing method using only 5-15% of the typically needed resources.
arXiv Detail & Related papers (2024-07-08T17:48:42Z) - MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making.
We present MR-Ben, a process-based benchmark that demands meta-reasoning skill.
Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z) - Robustness Assessment of Mathematical Reasoning in the Presence of Missing and Contradictory Conditions [48.251724997889184]
We develop a benchmark called Problems with Missing and Contradictory conditions (PMC).
We introduce two novel metrics to evaluate the performance of few-shot prompting methods in these scenarios.
We propose a novel few-shot prompting method called SMT-LIB Prompting (SLP), which utilizes the SMT-LIB language to model the problems instead of solving them directly.
arXiv Detail & Related papers (2024-06-07T16:24:12Z) - An Efficient Approach for Solving Expensive Constrained Multiobjective Optimization Problems [0.0]
An efficient probabilistic selection based constrained multi-objective EA is proposed, referred to as PSCMOEA.
It comprises novel elements such as (a) an adaptive search bound identification scheme based on the feasibility and convergence status of evaluated solutions.
Numerical experiments are conducted on an extensive range of challenging constrained problems using low evaluation budgets to simulate ECMOPs.
arXiv Detail & Related papers (2024-05-22T02:32:58Z) - Multiobjective Optimization Analysis for Finding Infrastructure-as-Code
Deployment Configurations [0.3774866290142281]
This paper is focused on a multiobjective problem related to Infrastructure-as-Code deployment configurations.
We resort in this paper to nine different evolutionary-based multiobjective algorithms.
Results obtained by each method after 10 independent runs have been compared using Friedman's non-parametric tests.
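Friedman's non-parametric test, used above to rank the competing algorithms over independent runs, is straightforward to sketch. The following is a minimal, illustrative pure-Python version of the test statistic (not the paper's code), assuming k algorithms each measured over n runs, with average ranks for ties:

```python
# Friedman test statistic for comparing k algorithms over n runs.
# Illustrative sketch; the score lists below are made-up values.

def average_ranks(values):
    """Rank values ascending (1-based), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1  # extend over a run of tied values
        avg = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks

def friedman_statistic(samples):
    """samples: list of k per-algorithm result lists, each of length n runs."""
    k, n = len(samples), len(samples[0])
    rank_sums = [0.0] * k
    for run in range(n):
        run_values = [samples[a][run] for a in range(k)]
        for a, r in enumerate(average_ranks(run_values)):
            rank_sums[a] += r
    return 12.0 * sum(r * r for r in rank_sums) / (n * k * (k + 1)) - 3 * n * (k + 1)

# Three algorithms, three runs each (hypothetical indicator values)
scores = [[0.80, 0.82, 0.79], [0.75, 0.74, 0.76], [0.80, 0.79, 0.81]]
print(friedman_statistic(scores))  # -> 4.5
```

In practice the statistic is compared against a chi-square distribution with k-1 degrees of freedom; `scipy.stats.friedmanchisquare` provides the same test with a p-value.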
arXiv Detail & Related papers (2024-01-18T13:55:32Z) - In Search of Insights, Not Magic Bullets: Towards Demystification of the
Model Selection Dilemma in Heterogeneous Treatment Effect Estimation [92.51773744318119]
This paper empirically investigates the strengths and weaknesses of different model selection criteria.
We highlight that there is a complex interplay between selection strategies, candidate estimators and the data used for comparing them.
arXiv Detail & Related papers (2023-02-06T16:55:37Z) - On solving decision and risk management problems subject to uncertainty [91.3755431537592]
Uncertainty is a pervasive challenge in decision and risk management.
This paper develops a systematic understanding of such strategies, determines their range of application, and develops a framework to better employ them.
arXiv Detail & Related papers (2023-01-18T19:16:23Z) - Uncertainty-Aware Search Framework for Multi-Objective Bayesian
Optimization [40.40632890861706]
We consider the problem of multi-objective (MO) blackbox optimization using expensive function evaluations.
We propose a novel uncertainty-aware search framework referred to as USeMO to efficiently select the sequence of inputs for evaluation.
arXiv Detail & Related papers (2022-04-12T16:50:48Z) - Learning Proximal Operators to Discover Multiple Optima [66.98045013486794]
We present an end-to-end method to learn the proximal operator across a family of non-convex problems.
We show that for weakly convex objectives and under mild conditions, the method converges globally.
arXiv Detail & Related papers (2022-01-28T05:53:28Z) - Robust Active Preference Elicitation [10.961537256186498]
We study the problem of eliciting the preferences of a decision-maker through a moderate number of pairwise comparison queries.
We are motivated by applications in high stakes domains, such as when choosing a policy for allocating scarce resources.
arXiv Detail & Related papers (2020-03-04T05:24:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.