A comprehensive theoretical framework for the optimization of neural
networks classification performance with respect to weighted metrics
- URL: http://arxiv.org/abs/2305.13472v1
- Date: Mon, 22 May 2023 20:33:29 GMT
- Title: A comprehensive theoretical framework for the optimization of neural
networks classification performance with respect to weighted metrics
- Authors: Francesco Marchetti, Sabrina Guastavino, Cristina Campi, Federico
Benvenuto, Michele Piana
- Abstract summary: In many contexts, customized and weighted classification scores are designed in order to evaluate the goodness of predictions carried out by neural networks.
We provide a complete setting that formalizes weighted classification metrics and allows the construction of losses that drive the model to optimize these metrics of interest.
- Score: 1.0499611180329804
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In many contexts, customized and weighted classification scores are designed
in order to evaluate the goodness of the predictions carried out by neural
networks. However, there exists a discrepancy between the maximization of such
scores and the minimization of the loss function in the training phase. In this
paper, we provide a complete theoretical setting that formalizes weighted
classification metrics and then allows the construction of losses that drive
the model to optimize these metrics of interest. After a detailed theoretical
analysis, we show that our framework includes as particular instances
well-established approaches such as classical cost-sensitive learning, weighted
cross entropy loss functions and value-weighted skill scores.
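As a concrete illustration of one of the particular instances named in the abstract, the sketch below shows a class-weighted cross-entropy loss via the standard PyTorch API; the class weights are illustrative assumptions, not values from the paper, and the construction is the library's standard one rather than the paper's general framework.
```python
# Minimal sketch of one "particular instance" named in the abstract: a
# class-weighted cross-entropy loss, here via the standard PyTorch API.
# The class weights are illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn

class_weights = torch.tensor([1.0, 1.0, 5.0])   # hypothetical: class 2 is costlier to miss
criterion = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(8, 3, requires_grad=True)  # model outputs for a batch of 8, 3 classes
targets = torch.randint(0, 3, (8,))             # ground-truth labels

loss = criterion(logits, targets)   # each sample's CE term is scaled by the weight of its true class
loss.backward()                     # gradients now reflect the weighted objective
```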
Related papers
- Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries [10.209740962369453]
Sparse regularization techniques are well-established in machine learning, yet their application in neural networks remains challenging.
A promising alternative is shallow weight factorization, where weights are factorized into two factors, allowing for smooth optimization of $L_1$-penalized neural networks.
In this work, we introduce deep weight factorization, extending previous shallow approaches to more than two factors.
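As background on the shallow two-factor case only (the paper's deep, multi-factor method is not reproduced), the sketch below checks numerically that an L2 penalty on two factors u, v with w = u * v matches an L1 penalty on w at the balanced factorization, which is why factorization turns sparse regularization into a smooth problem.
```python
# Background sketch of *shallow* weight factorization: with w = u * v, an L2
# penalty on the factors equals an L1 penalty on w at the balanced
# factorization. This is a toy check, not the paper's deep (multi-factor) method.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=5)                      # original weights

u = np.sign(w) * np.sqrt(np.abs(w))         # balanced factorization of w
v = np.sqrt(np.abs(w))
assert np.allclose(u * v, w)

l2_of_factors = 0.5 * ((u ** 2).sum() + (v ** 2).sum())
l1_of_weights = np.abs(w).sum()
print(l2_of_factors, l1_of_weights)         # the two penalties coincide
```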
arXiv Detail & Related papers (2025-02-04T17:12:56Z) - Optimizing Decentralized Online Learning for Supervised Regression and Classification Problems [0.0]
Decentralized learning networks aim to synthesize a single network inference from a set of raw inferences provided by multiple participants.
Despite the increased prevalence of decentralized learning networks, there exists no systematic study that performs a calibration of the associated free parameters.
Here we present an optimization framework for key parameters governing decentralized online learning in supervised regression and classification problems.
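The paper's calibration framework is not reproduced here; as a generic illustration of the underlying task of fusing participants' raw inferences into one network inference, the sketch below uses a simple exponentially weighted average, where the learning rate `eta` stands in for the kind of free parameter such a calibration would tune.
```python
# Generic illustration only (not the paper's framework): fuse raw inferences
# from K participants into a single network inference with exponentially
# weighted averaging. The learning rate `eta` is an assumed free parameter of
# the kind whose calibration the paper studies.
import numpy as np

def combine_online(participant_preds, targets, eta=0.5):
    """participant_preds: (T, K) per-round predictions; targets: (T,) ground truth."""
    T, K = participant_preds.shape
    log_w = np.zeros(K)                              # start from uniform weights
    combined = np.empty(T)
    for t in range(T):
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        combined[t] = w @ participant_preds[t]       # network inference for round t
        losses = (participant_preds[t] - targets[t]) ** 2
        log_w -= eta * losses                        # down-weight poorly performing participants
    return combined
```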
arXiv Detail & Related papers (2025-01-27T21:36:54Z) - Component-based Sketching for Deep ReLU Nets [55.404661149594375]
We develop a sketching scheme based on deep net components for various tasks.
We transform deep net training into a linear empirical risk minimization problem.
We show that the proposed component-based sketching provides almost optimal rates in approximating saturated functions.
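As a simplified stand-in for the "deep net training as linear empirical risk minimization" idea above (the paper's component-based sketching scheme is more refined and is not reproduced), the sketch below freezes a randomly initialized ReLU feature map and fits only a linear readout in closed form.
```python
# Simplified stand-in for reducing deep net training to linear empirical risk
# minimization: freeze a random ReLU feature map, then solve a ridge-regularized
# least-squares problem for the readout. Not the paper's construction.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))                     # toy inputs
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)   # toy targets

W1 = rng.normal(size=(10, 256))                    # fixed (untrained) components
b1 = rng.normal(size=256)
features = np.maximum(X @ W1 + b1, 0.0)            # ReLU features

lam = 1e-2                                         # ridge regularization
A = features.T @ features + lam * np.eye(256)
readout = np.linalg.solve(A, features.T @ y)       # linear ERM in closed form
print("train MSE:", np.mean((features @ readout - y) ** 2))
```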
arXiv Detail & Related papers (2024-09-21T15:30:43Z) - On the Generalization Ability of Unsupervised Pretraining [53.06175754026037]
Recent advances in unsupervised learning have shown that unsupervised pre-training, followed by fine-tuning, can improve model generalization.
This paper introduces a novel theoretical framework that illuminates the critical factor influencing the transferability of knowledge acquired during unsupervised pre-training to the subsequent fine-tuning phase.
Our results contribute to a better understanding of the unsupervised pre-training and fine-tuning paradigm, and can shed light on the design of more effective pre-training algorithms.
arXiv Detail & Related papers (2024-03-11T16:23:42Z) - On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities for analyzing closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
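For context, one common binary-classification form of an unhinged (purely linear) loss is 1 - y*f(x) for labels y in {-1, +1}; whether the paper uses exactly this form is an assumption, and the sketch below only shows why its linearity keeps the dynamics tractable.
```python
# Assumed binary form of an "unhinged" (purely linear) loss: 1 - y * f(x) for
# labels in {-1, +1}. The exact definition used in the paper may differ; the
# point here is that the loss never saturates, so its gradient is constant.
import torch

def unhinged_loss(scores, labels):
    """scores: raw model outputs f(x); labels: +1/-1 tensor of the same shape."""
    return (1.0 - labels * scores).mean()

scores = torch.randn(16, requires_grad=True)
labels = (torch.randint(0, 2, (16,)) * 2 - 1).float()

loss = unhinged_loss(scores, labels)
loss.backward()
print(scores.grad)   # simply -labels / 16: no clipping, no saturation
```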
arXiv Detail & Related papers (2023-12-13T02:11:07Z) - Random Linear Projections Loss for Hyperplane-Based Optimization in Neural Networks [22.348887008547653]
This work introduces Random Linear Projections (RLP) loss, a novel approach that enhances training efficiency by leveraging geometric relationships within the data.
Our empirical evaluations, conducted across benchmark datasets and synthetic examples, demonstrate that neural networks trained with RLP loss outperform those trained with traditional loss functions.
arXiv Detail & Related papers (2023-11-21T05:22:39Z) - A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error of overparameterized models that achieve effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z) - Unleashing the power of Neural Collapse for Transferability Estimation [42.09673383041276]
Well-trained models exhibit the phenomenon of Neural Collapse.
We propose a novel method termed Fair Collapse (FaCe) for transferability estimation.
FaCe yields state-of-the-art performance on different tasks including image classification, semantic segmentation, and text classification.
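FaCe itself is not reproduced here; as background on the Neural Collapse phenomenon it builds on, the sketch below computes a standard within-class variability diagnostic on penultimate-layer features (values near zero indicate collapse onto the class means).
```python
# Background sketch (not the FaCe score): a standard Neural Collapse
# diagnostic, tr(Sigma_W pinv(Sigma_B)) / K, measuring within-class feature
# variability relative to between-class spread on penultimate-layer features.
import numpy as np

def within_class_collapse(features, labels):
    """features: (N, D) penultimate-layer activations; labels: (N,) class ids."""
    N, D = features.shape
    global_mean = features.mean(axis=0)
    Sw = np.zeros((D, D))
    Sb = np.zeros((D, D))
    classes = np.unique(labels)
    for c in classes:
        Fc = features[labels == c]
        mu_c = Fc.mean(axis=0)
        diff = Fc - mu_c
        Sw += diff.T @ diff / N                          # within-class scatter
        d = (mu_c - global_mean)[:, None]
        Sb += (len(Fc) / N) * (d @ d.T)                  # between-class scatter
    return np.trace(Sw @ np.linalg.pinv(Sb)) / len(classes)
```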
arXiv Detail & Related papers (2023-10-09T14:30:10Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting non-stationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate the underlying tessellation and approximate the multiple-hypotheses target distribution.
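The structured, multiple-hypotheses construction is not reproduced here; as background on the base model, the sketch below fits a plain Gaussian radial basis function network for regression, with the centres and bandwidth chosen as illustrative assumptions.
```python
# Background sketch of a plain Gaussian RBF network for regression (the
# paper's structured multi-hypothesis ensemble is not reproduced). Centre
# locations and the bandwidth `gamma` are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)

centres = np.linspace(-3, 3, 20)[:, None]    # fixed RBF centres
gamma = 2.0                                  # kernel bandwidth

def rbf_features(X):
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

Phi = rbf_features(X)                                    # (300, 20) design matrix
weights = np.linalg.lstsq(Phi, y, rcond=None)[0]         # linear readout
print("train MSE:", np.mean((Phi @ weights - y) ** 2))
```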
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Regularization, early-stopping and dreaming: a Hopfield-like setup to
address generalization and overfitting [0.0]
We look for optimal network parameters by applying a gradient descent over a regularized loss function.
Within this framework, the optimal neuron-interaction matrices correspond to Hebbian kernels revised by a reiterated unlearning protocol.
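The paper's specific loss and dreaming (unlearning) protocol are not reproduced; the sketch below only illustrates the generic setup of gradient descent over a regularized loss for a Hopfield-style coupling matrix initialized at the Hebbian kernel.
```python
# Illustrative setup only (the paper's loss and unlearning protocol are not
# reproduced): gradient descent over a regularized quadratic loss for a
# Hopfield-style coupling matrix J, starting from the Hebbian kernel
# J = (1/N) * sum_mu xi_mu xi_mu^T, with stored patterns xi_mu in {-1, +1}^N.
import numpy as np

rng = np.random.default_rng(3)
N, P = 50, 5
xi = rng.choice([-1.0, 1.0], size=(P, N))     # stored patterns

J = xi.T @ xi / N                             # Hebbian initialization
lam, lr = 0.1, 0.01                           # regularization strength, learning rate

for _ in range(200):
    residual = xi @ J.T - xi                  # local-field error at each pattern
    grad = 2 * residual.T @ xi / P + 2 * lam * J
    J -= lr * grad                            # descend the regularized loss

print("mean squared fixed-point error:", np.mean((xi @ J.T - xi) ** 2))
```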
arXiv Detail & Related papers (2023-08-01T15:04:30Z) - Analytically Tractable Inference in Deep Neural Networks [0.0]
The Tractable Approximate Gaussian Inference (TAGI) algorithm was shown to be a viable and scalable alternative to backpropagation for shallow fully-connected neural networks.
We demonstrate how TAGI matches or exceeds the performance of backpropagation for training classic deep neural network architectures.
arXiv Detail & Related papers (2021-03-09T14:51:34Z) - Margin-Based Transfer Bounds for Meta Learning with Deep Feature
Embedding [67.09827634481712]
We leverage margin theory and statistical learning theory to establish three margin-based transfer bounds for meta-learning-based multiclass classification (MLMC).
These bounds reveal that the expected error of a given classification algorithm for a future task can be estimated with the average empirical error on a finite number of previous tasks.
Experiments on three benchmarks show that these margin-based models still achieve competitive performance.
arXiv Detail & Related papers (2020-12-02T23:50:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.