Exact Reformulation and Optimization for Direct Metric Optimization in Binary Imbalanced Classification
- URL: http://arxiv.org/abs/2507.15240v1
- Date: Mon, 21 Jul 2025 04:52:51 GMT
- Title: Exact Reformulation and Optimization for Direct Metric Optimization in Binary Imbalanced Classification
- Authors: Le Peng, Yash Travadi, Chuan He, Ying Cui, Ju Sun
- Abstract summary: We study two key classification metrics, precision and recall, under three practical binary IC settings. Unlike existing methods that rely on smooth approximations to deal with the indicator function involved, we introduce, for the first time, exact constrained reformulations for these direct metric optimization (DMO) problems.
- Score: 10.140989590842224
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For classification with imbalanced class frequencies, i.e., imbalanced classification (IC), standard accuracy is known to be misleading as a performance measure. While most existing methods for IC resort to optimizing balanced accuracy (i.e., the average of class-wise recalls), they fall short in scenarios where the significance of classes varies or certain metrics should reach prescribed levels. In this paper, we study two key classification metrics, precision and recall, under three practical binary IC settings: fix precision optimize recall (FPOR), fix recall optimize precision (FROP), and optimize $F_\beta$-score (OFBS). Unlike existing methods that rely on smooth approximations to deal with the indicator function involved, we introduce, for the first time, exact constrained reformulations for these direct metric optimization (DMO) problems, which can be effectively solved by exact penalty methods. Experiment results on multiple benchmark datasets demonstrate the practical superiority of our approach over the state-of-the-art methods for the three DMO problems. We also expect our exact reformulation and optimization (ERO) framework to be applicable to a wide range of DMO problems for binary IC and beyond. Our code is available at https://github.com/sun-umn/DMO.
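To make the quantities in the abstract concrete, here is a minimal sketch of precision, recall, and the $F_\beta$-score, plus a brute-force illustration of the FPOR setting on fixed scores. It is not the paper's ERO method: it only sweeps a decision threshold over scores that are already given, whereas the paper optimizes the model itself through exact constrained reformulations. The function names, the toy labels, and the toy scores are all hypothetical.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels in {0, 1}."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_beta(precision, recall, beta=1.0):
    """F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)."""
    denom = beta ** 2 * precision + recall
    return (1 + beta ** 2) * precision * recall / denom if denom else 0.0


def fpor_by_threshold(y_true, scores, alpha):
    """Illustrative FPOR: among thresholds whose precision is at least
    alpha, return (threshold, precision, recall) with the highest recall.
    Returns None if no threshold meets the precision constraint."""
    best = None
    for t in sorted(set(scores)):
        y_pred = [1 if s >= t else 0 for s in scores]
        p, r = precision_recall(y_true, y_pred)
        if p >= alpha and (best is None or r > best[2]):
            best = (t, p, r)
    return best


# Toy imbalanced data (hypothetical): 3 positives, 5 negatives.
y_true = [1, 0, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1]

result = fpor_by_threshold(y_true, scores, alpha=0.7)
print(result)                        # (threshold, precision, recall)
print(f_beta(result[1], result[2]))  # F1 at the selected threshold
```

FROP would swap the roles of the two metrics (constrain recall, maximize precision), and OFBS would maximize `f_beta` directly; both can be sketched with the same sweep under the same caveats.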
Related papers
- DSTC: Direct Preference Learning with Only Self-Generated Tests and Code to Improve Code LMs [56.4979142807426]
We introduce Direct Preference Learning with Only Self-Generated Tests and Code (DSTC). DSTC uses only self-generated code snippets and tests to construct reliable preference pairs.
arXiv Detail & Related papers (2024-11-20T02:03:16Z)
- POPoS: Improving Efficient and Robust Facial Landmark Detection with Parallel Optimal Position Search [34.50794776762681]
This paper introduces Parallel Optimal Position Search (POPoS), a high-precision encoding-decoding framework. POPoS employs three key contributions: Pseudo-range multilateration is utilized to correct heatmap errors, improving landmark localization accuracy. A single-step parallel computation algorithm is introduced, boosting computational efficiency and reducing processing time.
arXiv Detail & Related papers (2024-10-12T16:28:40Z)
- Benchmarking Large Language Model Uncertainty for Prompt Optimization [4.151658495779136]
This paper introduces a benchmark dataset to evaluate uncertainty metrics. We show that current metrics align more with Answer Uncertainty, which reflects output confidence and diversity, rather than Correctness Uncertainty.
arXiv Detail & Related papers (2024-09-16T07:13:30Z)
- Geometric-Averaged Preference Optimization for Soft Preference Labels [78.2746007085333]
Many algorithms for aligning LLMs with human preferences assume that human preferences are binary and deterministic. In this work, we introduce the distributional soft preference labels and improve Direct Preference Optimization (DPO) with a weighted geometric average of the LLM output likelihood in the loss function.
arXiv Detail & Related papers (2024-09-10T17:54:28Z)
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z)
- Lower-Left Partial AUC: An Effective and Efficient Optimization Metric for Recommendation [52.45394284415614]
We propose a new optimization metric, Lower-Left Partial AUC (LLPAUC), which is computationally efficient like AUC but strongly correlates with Top-K ranking metrics.
LLPAUC considers only the partial area under the ROC curve in the Lower-Left corner to push the optimization focus on Top-K.
arXiv Detail & Related papers (2024-02-29T13:58:33Z)
- Adaptive Neural Ranking Framework: Toward Maximized Business Goal for Cascade Ranking Systems [33.46891569350896]
Cascade ranking is widely used for large-scale top-k selection problems in online advertising and recommendation systems.
Previous works on learning-to-rank usually focus on letting the model learn the complete order or top-k order.
We name this method the Adaptive Neural Ranking Framework (ARF).
arXiv Detail & Related papers (2023-10-16T14:43:02Z)
- Adaptive manifold for imbalanced transductive few-shot learning [16.627512688664513]
We propose a novel algorithm to address imbalanced transductive few-shot learning, named Adaptive Manifold.
Our method exploits the underlying manifold of the labeled support examples and unlabeled queries by using manifold similarity to predict the class probability distribution per query.
arXiv Detail & Related papers (2023-04-27T15:42:49Z)
- Accurate and Reliable Methods for 5G UAV Jamming Identification With Calibrated Uncertainty [3.4208659698673127]
Only increasing accuracy without considering uncertainty may negatively impact Deep Neural Network (DNN) decision-making.
This paper proposes five combined preprocessing and post-processing methods for time-series binary classification problems.
arXiv Detail & Related papers (2022-11-05T15:04:45Z)
- Optimizing Partial Area Under the Top-k Curve: Theory and Practice [151.5072746015253]
We develop a novel metric named partial Area Under the top-k Curve (AUTKC).
AUTKC has a better discrimination ability, and its Bayes optimal score function could give a correct top-K ranking with respect to the conditional probability.
We present an empirical surrogate risk minimization framework to optimize the proposed metric.
arXiv Detail & Related papers (2022-09-03T11:09:13Z)
- Learning with Multiclass AUC: Theory and Algorithms [141.63211412386283]
Area under the ROC curve (AUC) is a well-known ranking metric for problems such as imbalanced learning and recommender systems.
In this paper, we start an early trial to consider the problem of learning multiclass scoring functions via optimizing multiclass AUC metrics.
arXiv Detail & Related papers (2021-07-28T05:18:10Z)
- Mix-n-Match: Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning [21.08664370117846]
We show how Mix-n-Match calibration strategies can help achieve remarkably better data-efficiency and expressive power.
We also reveal potential issues in standard evaluation practices.
Our approaches outperform state-of-the-art solutions on both the calibration as well as the evaluation tasks.
arXiv Detail & Related papers (2020-03-16T17:00:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.