Reformulating van Rijsbergen's $F_{\beta}$ metric for weighted binary cross-entropy
- URL: http://arxiv.org/abs/2210.16458v3
- Date: Mon, 13 Nov 2023 01:14:08 GMT
- Title: Reformulating van Rijsbergen's $F_{\beta}$ metric for weighted binary cross-entropy
- Authors: Satesh Ramdhani
- Abstract summary: This paper investigates incorporating a performance metric alongside differentiable loss functions to inform training outcomes.
The focus is on van Rijsbergen's $F_{\beta}$ metric -- a popular choice for gauging classification performance.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The separation of performance metrics from gradient based loss functions may
not always give optimal results and may miss vital aggregate information. This
paper investigates incorporating a performance metric alongside differentiable
loss functions to inform training outcomes. The goal is to guide model
performance and interpretation by assuming statistical distributions on this
performance metric for dynamic weighting. The focus is on van Rijsbergen's
$F_{\beta}$ metric -- a popular choice for gauging classification performance.
Through distributional assumptions on the $F_{\beta}$, an intermediary link can
be established to the standard binary cross-entropy via dynamic penalty
weights. First, the $F_{\beta}$ metric is reformulated to facilitate assuming
statistical distributions, with accompanying proofs for the cumulative distribution
function. These probabilities are used within a knee curve algorithm to find an
optimal $\beta$ or $\beta_{opt}$. This $\beta_{opt}$ is used as a weight or
penalty in the proposed weighted binary cross-entropy. Experimentation on
publicly available data, along with benchmark analysis, mostly yields better and
more interpretable results than the baseline for both imbalanced and
balanced classes. For example, for the IMDB text data with known labeling
errors, a 14% boost in $F_1$ score is shown. The results also reveal
commonalities between the penalty model families derived in this paper and the
suitability of recall-centric or precision-centric parameters used in the
optimization. The flexibility of this methodology can enhance interpretation.
Related papers
- Exploiting Pre-trained Models for Drug Target Affinity Prediction with Nearest Neighbors [58.661454334877256]
Drug-Target binding Affinity (DTA) prediction is essential for drug discovery.
Despite the application of deep learning methods to DTA prediction, the achieved accuracy remains suboptimal.
We propose $k$NN-DTA, a non-representation, embedding-based retrieval method built on a pre-trained DTA prediction model.
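A minimal sketch of the retrieval idea, assuming Euclidean distances in the pre-trained model's embedding space and a fixed interpolation weight `lam`; both are illustrative, not the paper's exact aggregation scheme.

```python
import numpy as np

def knn_augmented_affinity(query_emb, train_embs, train_affinities,
                           model_pred, k=8, lam=0.5):
    """Blend the pre-trained model's affinity prediction with a k-NN
    estimate retrieved in its embedding space."""
    dists = np.linalg.norm(train_embs - query_emb, axis=1)
    idx = np.argsort(dists)[:k]
    w = np.exp(-dists[idx])          # closer neighbours weigh more
    w /= w.sum()
    knn_pred = float(w @ train_affinities[idx])
    return lam * model_pred + (1 - lam) * knn_pred

# Toy usage with random embeddings standing in for a real DTA encoder.
rng = np.random.default_rng(0)
train_embs = rng.normal(size=(100, 16))
train_affs = rng.uniform(4.0, 9.0, size=100)
query = rng.normal(size=16)
print(knn_augmented_affinity(query, train_embs, train_affs, model_pred=6.2))
```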
arXiv Detail & Related papers (2024-07-21T15:49:05Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
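The rejection rule built on such a ratio can be as simple as thresholding it; a toy sketch, where `tau` and the `-1` rejection marker are assumptions rather than the paper's construction:

```python
import numpy as np

def predict_with_rejection(probs, density_ratio, tau=0.5):
    """Predict normally where the estimated density ratio between the
    idealized and observed distributions is high; abstain (-1) elsewhere."""
    labels = (probs >= 0.5).astype(int)
    return np.where(density_ratio >= tau, labels, -1)

probs = np.array([0.9, 0.55, 0.1])
ratio = np.array([1.2, 0.3, 0.8])   # low ratio: sample looks atypical
print(predict_with_rejection(probs, ratio))   # -> [ 1 -1  0]
```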
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Online non-parametric likelihood-ratio estimation by Pearson-divergence functional minimization [55.98760097296213]
We introduce a new framework for online non-parametric LRE (OLRE) for the setting where pairs of i.i.d. observations $(x_t \sim p, x'_t \sim q)$ are observed over time.
We provide theoretical guarantees for the performance of the OLRE method along with empirical validation in synthetic experiments.
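A toy online estimator in this spirit: a linear model $r(x) = w^\top \phi(x)$ over fixed RBF features, updated by SGD on the Pearson-divergence objective $\mathbb{E}_q[r^2]/2 - \mathbb{E}_p[r]$, whose minimizer is $p/q$. The feature map, step size, and regularizer below are assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
centers = rng.normal(size=20)            # fixed RBF centers (an assumption)
gamma, lr, lam = 1.0, 0.05, 1e-3

def phi(x):
    """RBF feature map for the linear ratio model r(x) = w . phi(x)."""
    return np.exp(-gamma * (x - centers) ** 2)

w = np.zeros_like(centers)

for t in range(5000):
    x_p = rng.normal(loc=0.5)            # x_t ~ p
    x_q = rng.normal(loc=0.0)            # x'_t ~ q
    # SGD on the Pearson objective E_q[r^2]/2 - E_p[r], minimized by r = p/q
    grad = (w @ phi(x_q)) * phi(x_q) - phi(x_p) + lam * w
    w -= lr * grad

print(w @ phi(0.5))   # should exceed 1: p puts more mass near 0.5 than q
```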
arXiv Detail & Related papers (2023-11-03T13:20:11Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
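A hedged, toy version of one adaptive-reweighting step: score each sample by its residual under the current candidate graph (a linear SEM here) and up-weight the poorly fit ones. The actual ReScore weights come from a bilevel optimization; every name below is illustrative.

```python
import numpy as np

def rescore_step(X, W_adj, temp=1.0):
    """One toy reweighting step: residuals under the linear SEM X ~ X @ W_adj
    set softmax-style sample weights; return them with the reweighted score."""
    resid = np.linalg.norm(X - X @ W_adj, axis=1)
    w = np.exp(resid / temp)
    w /= w.sum()
    return w, float(np.sum(w * resid**2))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
W_adj = np.zeros((3, 3))        # empty graph as the current candidate
weights, score = rescore_step(X, W_adj)
print(score)
```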
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Learning to Re-weight Examples with Optimal Transport for Imbalanced Classification [74.62203971625173]
Imbalanced data pose challenges for deep learning based classification models.
One of the most widely-used approaches for tackling imbalanced data is re-weighting.
We propose a novel re-weighting method based on optimal transport (OT) from a distributional point of view.
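One way to make the distributional idea concrete is semi-relaxed entropic OT: enforce only the target (balanced) marginal and read sample weights off the transport plan's row sums. This is an assumption-laden toy, not the paper's formulation.

```python
import numpy as np

def semi_relaxed_ot_weights(feats, protos, b, reg=0.1):
    """Semi-relaxed entropic OT: only the target marginal b is enforced,
    so the plan's row sums act as learned sample weights."""
    C = np.linalg.norm(feats[:, None] - protos[None, :], axis=-1)
    K = np.exp(-C / reg)
    v = b / K.sum(axis=0)          # scale columns to hit the target marginal
    P = K * v[None, :]
    return P.sum(axis=1)           # weight per training sample

feats = np.array([[0.0], [0.1], [0.2], [2.0]])  # 3 majority, 1 minority
protos = np.array([[0.1], [2.0]])               # one prototype per class
w = semi_relaxed_ot_weights(feats, protos, b=np.array([0.5, 0.5]))
print(w / w.sum())                # the minority sample receives ~half the mass
```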
arXiv Detail & Related papers (2022-08-05T01:23:54Z)
- Observable adjustments in single-index models for regularized M-estimators [3.5353632767823506]
In the regime where the sample size $n$ and dimension $p$ both increase, the behavior of the empirical distribution of $\hat{\beta}$ and the predicted values $X\hat{\beta}$ has been previously characterized.
This paper develops a different theory to describe the empirical distribution of $\hat{\beta}$ and $X\hat{\beta}$.
arXiv Detail & Related papers (2022-04-14T14:32:02Z)
- Linear Speedup in Personalized Collaborative Learning [69.45124829480106]
Personalization in federated learning can improve the accuracy of a model for a user by trading off the model's bias.
We formalize the personalized collaborative learning problem as optimization of a user's objective.
We explore conditions under which we can optimally trade off this bias for a reduction in variance.
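A toy scalar model of that trade-off: mix the user's own estimate with the collaborative one by the alpha that minimizes $\alpha^2\,\mathrm{Var}(\text{local}) + (1-\alpha)^2\,\mathrm{Bias}(\text{global})^2$. The variance proxy and bias term are assumptions, not the paper's formalization.

```python
def personalized_mix(local_w, global_w, n_local, global_bias_sq, noise_var=1.0):
    """Mix alpha*local + (1-alpha)*global, with alpha minimizing the toy MSE
    alpha^2 * Var(local) + (1-alpha)^2 * Bias(global)^2."""
    var_local = noise_var / max(n_local, 1)
    alpha = global_bias_sq / (var_local + global_bias_sq)
    return alpha * local_w + (1 - alpha) * global_w

# With scarce local data the optimum leans on the collaborative model;
# as n_local grows, alpha -> 1 and the user's own estimate dominates.
print(personalized_mix(1.0, 0.6, n_local=5, global_bias_sq=0.01))
print(personalized_mix(1.0, 0.6, n_local=5000, global_bias_sq=0.01))
```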
arXiv Detail & Related papers (2021-11-10T22:12:52Z)
- A surrogate loss function for optimization of $F_\beta$ score in binary classification with imbalanced data [0.0]
The gradient paths of the proposed surrogate $F_{\beta}$ loss function approximate the gradient paths of the large sample limit of the $F_{\beta}$ score.
It is demonstrated that the proposed surrogate $F_{\beta}$ loss function is effective for optimizing $F_{\beta}$ scores under class imbalances.
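The paper constructs a particular surrogate whose gradients track the large-sample $F_{\beta}$; a common, simpler stand-in replaces hard TP/FP/FN counts with sums of predicted probabilities, sketched below under that assumption.

```python
import numpy as np

def soft_f_beta_loss(y_true, y_prob, beta=1.0, eps=1e-7):
    """Differentiable stand-in: soft counts from predicted probabilities,
    turned into a loss via 1 - F_beta."""
    tp = np.sum(y_true * y_prob)
    fp = np.sum((1 - y_true) * y_prob)
    fn = np.sum(y_true * (1 - y_prob))
    f = (1 + beta**2) * tp / ((1 + beta**2) * tp + beta**2 * fn + fp + eps)
    return 1.0 - f

y_true = np.array([1.0, 0.0, 1.0, 0.0])
print(soft_f_beta_loss(y_true, np.array([0.9, 0.1, 0.8, 0.2]), beta=2.0))
```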
arXiv Detail & Related papers (2021-04-03T18:36:23Z)
- A First Step Towards Distribution Invariant Regression Metrics [1.370633147306388]
In classification, it has been stated repeatedly that performance metrics like the F-Measure and Accuracy are highly dependent on the class distribution.
We show that the same problem exists in regression. The distribution of odometry parameters in robotic applications, for example, can vary widely between different recording sessions.
Here, we need regression algorithms that either perform equally well for all function values, or that focus on certain boundary regions like high speed.
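A minimal example of removing the distribution dependence: average the error uniformly over bins of the target instead of over samples, so dense label regions cannot dominate. The binning scheme is an illustrative assumption, not the paper's metric.

```python
import numpy as np

def binned_mae(y_true, y_pred, n_bins=10):
    """MAE averaged uniformly over bins of the target value, so dense
    regions of the label distribution cannot dominate the metric."""
    edges = np.linspace(y_true.min(), y_true.max(), n_bins + 1)
    idx = np.clip(np.digitize(y_true, edges) - 1, 0, n_bins - 1)
    errs = np.abs(y_true - y_pred)
    per_bin = [errs[idx == b].mean() for b in range(n_bins) if np.any(idx == b)]
    return float(np.mean(per_bin))

rng = np.random.default_rng(0)
y = np.concatenate([rng.uniform(0, 1, 900), rng.uniform(9, 10, 100)])
print(binned_mae(y, y + rng.normal(0, 0.1, 1000)))
```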
arXiv Detail & Related papers (2020-09-10T23:40:46Z)
- A Precise High-Dimensional Asymptotic Theory for Boosting and Minimum-$\ell_1$-Norm Interpolated Classifiers [3.167685495996986]
This paper establishes a precise high-dimensional theory for boosting on separable data.
Under a class of statistical models, we provide an exact analysis of the generalization error of boosting.
We also explicitly pin down the relation between the boosting test error and the optimal Bayes error.
arXiv Detail & Related papers (2020-02-05T00:24:53Z)
- The Real-World-Weight Cross-Entropy Loss Function: Modeling the Costs of Mislabeling [0.0]
We introduce the Real-World-Weight Cross-Entropy loss function, in both binary and single-label classification variants.
Both variants allow direct input of real world costs as weights.
For single-label, multicategory classification, our loss function also allows direct penalization of probabilistic false positives, weighted by label, during the training of a machine learning model.
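For the binary variant, the idea reduces to scaling the two cross-entropy terms by user-supplied misclassification costs; a sketch with illustrative cost values:

```python
import numpy as np

def real_world_weighted_bce(y_true, y_prob, cost_fn=5.0, cost_fp=1.0, eps=1e-7):
    """Binary cross-entropy whose two terms are scaled by the real-world
    costs of a missed positive (cost_fn) and a false alarm (cost_fp)."""
    y_prob = np.clip(y_prob, eps, 1 - eps)
    loss = -(cost_fn * y_true * np.log(y_prob)
             + cost_fp * (1 - y_true) * np.log(1 - y_prob))
    return float(loss.mean())

y_true = np.array([1.0, 0.0, 1.0, 0.0])
print(real_world_weighted_bce(y_true, np.array([0.7, 0.2, 0.9, 0.4])))
```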
arXiv Detail & Related papers (2020-01-03T08:54:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.