Large Margin Discriminative Loss for Classification
- URL: http://arxiv.org/abs/2405.18499v1
- Date: Tue, 28 May 2024 18:10:45 GMT
- Title: Large Margin Discriminative Loss for Classification
- Authors: Hai-Vy Nguyen, Fabrice Gamboa, Sixin Zhang, Reda Chhaibi, Serge Gratton, Thierry Giaccone,
- Abstract summary: We introduce a novel discriminative loss function with large margin in the context of Deep Learning.
This loss boosts the discriminative power of neural nets, represented by intra-class compactness and inter-class separability.
- Score: 3.3975558777609915
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we introduce a novel discriminative loss function with large margin in the context of Deep Learning. This loss boosts the discriminative power of neural nets, represented by intra-class compactness and inter-class separability. On the one hand, the class compactness is ensured by close distance of samples of the same class to each other. On the other hand, the inter-class separability is boosted by a margin loss that ensures the minimum distance of each class to its closest boundary. All the terms in our loss have an explicit meaning, giving a direct view of the feature space obtained. We analyze mathematically the relation between compactness and margin term, giving a guideline about the impact of the hyper-parameters on the learned features. Moreover, we also analyze properties of the gradient of the loss with respect to the parameters of the neural net. Based on this, we design a strategy called partial momentum updating that enjoys simultaneously stability and consistency in training. Furthermore, we also investigate generalization errors to have better theoretical insights. Our loss function systematically boosts the test accuracy of models compared to the standard softmax loss in our experiments.
Related papers
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function, that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-vary learning rates and feature normalization.
arXiv Detail & Related papers (2023-12-13T02:11:07Z) - Learning Towards the Largest Margins [83.7763875464011]
Loss function should promote the largest possible margins for both classes and samples.
Not only does this principled framework offer new perspectives to understand and interpret existing margin-based losses, but it can guide the design of new tools.
arXiv Detail & Related papers (2022-06-23T10:03:03Z) - On the Benefits of Large Learning Rates for Kernel Methods [110.03020563291788]
We show that a phenomenon can be precisely characterized in the context of kernel methods.
We consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution.
arXiv Detail & Related papers (2022-02-28T13:01:04Z) - Discriminability-enforcing loss to improve representation learning [20.4701676109641]
We introduce a new loss term inspired by the Gini impurity to minimize the entropy of individual high-level features.
Although our Gini loss induces highly-discriminative features, it does not ensure that the distribution of the high-level features matches the distribution of the classes.
Our empirical results show that integrating our novel loss terms into the training objective consistently outperforms the models trained with cross-entropy alone.
arXiv Detail & Related papers (2022-02-14T22:31:37Z) - Understanding Square Loss in Training Overparametrized Neural Network
Classifiers [31.319145959402462]
We contribute to the theoretical understanding of square loss in classification by systematically investigating how it performs for overparametrized neural networks.
We consider two cases, according to whether classes are separable or not. In the general non-separable case, fast convergence rate is established for both misclassification rate and calibration error.
The resulting margin is proven to be lower bounded away from zero, providing theoretical guarantees for robustness.
arXiv Detail & Related papers (2021-12-07T12:12:30Z) - Distribution of Classification Margins: Are All Data Equal? [61.16681488656473]
We motivate theoretically and show empirically that the area under the curve of the margin distribution on the training set is in fact a good measure of generalization.
The resulting subset of "high capacity" features is not consistent across different training runs.
arXiv Detail & Related papers (2021-07-21T16:41:57Z) - Fundamental Limits and Tradeoffs in Invariant Representation Learning [99.2368462915979]
Many machine learning applications involve learning representations that achieve two competing goals.
Minimax game-theoretic formulation represents a fundamental tradeoff between accuracy and invariance.
We provide an information-theoretic analysis of this general and important problem under both classification and regression settings.
arXiv Detail & Related papers (2020-12-19T15:24:04Z) - $\sigma^2$R Loss: a Weighted Loss by Multiplicative Factors using
Sigmoidal Functions [0.9569316316728905]
We introduce a new loss function called squared reduction loss ($sigma2$R loss), which is regulated by a sigmoid function to inflate/deflate the error per instance.
Our loss has clear intuition and geometric interpretation, we demonstrate by experiments the effectiveness of our proposal.
arXiv Detail & Related papers (2020-09-18T12:34:40Z) - Adversarially Robust Learning via Entropic Regularization [31.6158163883893]
We propose a new family of algorithms, ATENT, for training adversarially robust deep neural networks.
Our approach achieves competitive (or better) performance in terms of robust classification accuracy.
arXiv Detail & Related papers (2020-08-27T18:54:43Z) - Towards Certified Robustness of Distance Metric Learning [53.96113074344632]
We advocate imposing an adversarial margin in the input space so as to improve the generalization and robustness of metric learning algorithms.
We show that the enlarged margin is beneficial to the generalization ability by using the theoretical technique of algorithmic robustness.
arXiv Detail & Related papers (2020-06-10T16:51:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.