Shadow: A Novel Loss Function for Efficient Training in Siamese Networks
- URL: http://arxiv.org/abs/2311.14012v1
- Date: Thu, 23 Nov 2023 14:07:35 GMT
- Title: Shadow: A Novel Loss Function for Efficient Training in Siamese Networks
- Authors: Alif Elham Khan, Mohammad Junayed Hasan, Humayra Anjum, Nabeel
Mohammed
- Abstract summary: We present a novel loss function called Shadow Loss that compresses the dimensions of an embedding space during loss calculation without loss of performance.
Projecting onto a lower-dimensional projection space, our loss function converges faster, and the resulting classified image clusters have larger inter-class and smaller intra-class distances.
Shadow Loss consistently outperforms the state-of-the-art Triplet Margin Loss by 5%-10% in accuracy across diverse datasets.
- Score: 2.2189125306342
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite significant recent advances in similarity detection tasks, existing
approaches pose substantial challenges under memory constraints. One of the
primary reasons for this is the use of computationally expensive metric
learning loss functions such as Triplet Loss in Siamese networks. In this
paper, we present a novel loss function called Shadow Loss that compresses the
dimensions of an embedding space during loss calculation without loss of
performance. The distance between the projections of the embeddings is learned
from inputs on a compact projection space where distances directly correspond
to a measure of class similarity. Projecting onto a lower-dimensional projection space, our loss function converges faster, and the resulting classified image clusters have larger inter-class and smaller intra-class distances. Shadow Loss not only reduces embedding dimensions, favoring memory-constrained devices, but also consistently outperforms the state-of-the-art Triplet Margin Loss by 5%-10% in accuracy across diverse datasets. The proposed loss function is also model agnostic, upholding its performance across several tested models. Its effectiveness and robustness across balanced, imbalanced, medical, and non-medical image datasets suggest that it is not specific to a particular model or dataset but demonstrates consistently superior performance while using less memory and computation.
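The abstract describes the mechanism only at a high level: embeddings are projected onto a compact, lower-dimensional space, and the metric-learning objective is computed on those projections. A minimal sketch of that idea, assuming a triplet-style margin objective and a fixed random projection (the paper's exact projection and loss formulation are not given here, so every name and constant below is illustrative):

```python
# Hypothetical sketch of a "shadow"-style loss: a triplet margin loss computed on
# low-dimensional projections of the embeddings rather than on the full embeddings.
# The fixed random projection and the margin value are illustrative choices only.
import torch
import torch.nn.functional as F

def shadow_style_triplet_loss(anchor, positive, negative, proj, margin=1.0):
    """anchor/positive/negative: (B, D) embeddings; proj: (D, d) with d << D."""
    # Project the embeddings onto the compact "shadow" space before measuring distances.
    a, p, n = anchor @ proj, positive @ proj, negative @ proj
    d_ap = F.pairwise_distance(a, p)   # intra-class distance in projection space
    d_an = F.pairwise_distance(a, n)   # inter-class distance in projection space
    return F.relu(d_ap - d_an + margin).mean()

# Example usage with a frozen random projection (it could equally be learned).
D, d, B = 512, 32, 16
proj = torch.randn(D, d) / d ** 0.5
anchor, positive, negative = (torch.randn(B, D) for _ in range(3))
loss = shadow_style_triplet_loss(anchor, positive, negative, proj)
```

Because the pairwise distances are computed in the d-dimensional projection space rather than the full D-dimensional embedding space, the per-batch distance computation and its memory footprint scale with d, which is consistent with the memory and compute savings the abstract claims.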
Related papers
- Reduced Jeffries-Matusita distance: A Novel Loss Function to Improve
Generalization Performance of Deep Classification Models [0.0]
We introduce a distance called Reduced Jeffries-Matusita as a loss function for training deep classification models to reduce the over-fitting issue.
The results show that the new distance measure stabilizes the training process significantly, enhances the generalization ability, and improves the performance of the models in the Accuracy and F1-score metrics.
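The summary does not spell out the "Reduced" formulation. For orientation only, the standard Jeffries-Matusita distance between the predicted class distribution and a one-hot target can be turned into a loss roughly as follows; the function name, the epsilon for numerical stability, and the mean reduction are assumptions, and the paper's reduced variant may differ:

```python
# Illustrative loss based on the standard Jeffries-Matusita (JM) distance between the
# predicted class distribution p and a one-hot target q:
#   JM(p, q) = sqrt(2 * (1 - sum_i sqrt(p_i * q_i)))
# With a one-hot target the Bhattacharyya coefficient reduces to sqrt(p_y).
# The paper's "Reduced" variant may differ from this plain form.
import torch
import torch.nn.functional as F

def jm_distance_loss(logits, targets, eps=1e-12):
    probs = F.softmax(logits, dim=1)                         # (B, C) predicted distribution
    p_y = probs.gather(1, targets.unsqueeze(1)).squeeze(1)   # probability of the true class
    jm = torch.sqrt(2.0 * (1.0 - torch.sqrt(p_y + eps)) + eps)
    return jm.mean()

logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
loss = jm_distance_loss(logits, targets)
```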
arXiv Detail & Related papers (2024-03-13T10:51:38Z)
- Associative Memories in the Feature Space [68.1903319310263]
We propose a class of memory models that store only low-dimensional semantic embeddings and use them to retrieve similar, but not identical, memories.
We demonstrate a proof of concept of this method on a simple task on the MNIST dataset.
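The summary only indicates that low-dimensional semantic embeddings are stored and queried for similar, but not identical, items. A generic cosine-similarity retrieval over such a store (not the paper's specific model; the class and method names are made up for illustration) might look like:

```python
# Generic sketch of retrieval from a store of low-dimensional embeddings by cosine
# similarity; this illustrates the general idea only, not the paper's memory model.
import torch
import torch.nn.functional as F

class EmbeddingMemory:
    def __init__(self, embeddings):
        # embeddings: (N, d) matrix of stored semantic embeddings
        self.keys = F.normalize(embeddings, dim=1)

    def retrieve(self, query, k=1):
        # Return the indices of the k stored embeddings most similar to the query.
        q = F.normalize(query, dim=0)
        scores = self.keys @ q          # cosine similarities, shape (N,)
        return scores.topk(k).indices

memory = EmbeddingMemory(torch.randn(1000, 64))
nearest = memory.retrieve(torch.randn(64), k=5)
```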
arXiv Detail & Related papers (2024-02-16T16:37:48Z)
- FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC).
Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
- Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks [82.03459331544737]
We study the performance of ConvResNeXts trained with weight decay, from the perspective of nonparametric classification.
Our analysis allows for infinitely many building blocks in ConvResNeXts, and shows that weight decay implicitly enforces sparsity on these blocks.
arXiv Detail & Related papers (2023-07-04T11:08:03Z)
- Noise-Robust Loss Functions: Enhancing Bounded Losses for Large-Scale Noisy Data Learning [0.0]
Large annotated datasets inevitably contain noisy labels, which pose a major challenge for training deep neural networks because such networks easily memorize the labels.
Noise-robust loss functions have emerged as a notable strategy to counteract this issue, but it remains challenging to create a robust loss function which is not susceptible to underfitting.
We propose a novel method denoted as logit bias, which adds a real number $\epsilon$ to the logit at the position of the correct class.
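A rough sketch of the described logit-bias idea, paired here with a bounded MAE-style loss purely for illustration (the paper's choice of base loss and the value of $\epsilon$ are not given in this summary):

```python
# Sketch of the described "logit bias": add a constant epsilon to the logit of the
# correct class before computing a bounded loss. MAE on the probabilities is used
# here only as an example of a bounded base loss; the epsilon value is arbitrary.
import torch
import torch.nn.functional as F

def logit_bias_mae_loss(logits, targets, epsilon=2.0):
    biased = logits.clone()
    # Add epsilon only at the position of the correct class of each sample.
    biased[torch.arange(logits.size(0)), targets] += epsilon
    probs = F.softmax(biased, dim=1)
    one_hot = F.one_hot(targets, num_classes=logits.size(1)).float()
    return (probs - one_hot).abs().sum(dim=1).mean()   # bounded, MAE-style

logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
loss = logit_bias_mae_loss(logits, targets)
```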
arXiv Detail & Related papers (2023-06-08T18:38:55Z)
- SuSana Distancia is all you need: Enforcing class separability in metric learning via two novel distance-based loss functions for few-shot image classification [0.9236074230806579]
We propose two loss functions which consider the importance of the embedding vectors by looking at the intra-class and inter-class distances between the few available samples.
Our results show a significant improvement in accuracy on the miniImageNet benchmark compared to other metric-based few-shot learning methods, by a margin of 2%.
arXiv Detail & Related papers (2023-05-15T23:12:09Z)
- Instance-Variant Loss with Gaussian RBF Kernel for 3D Cross-modal Retrieval [52.41252219453429]
Existing methods treat all instances equally, applying the same penalty strength to instances with varying degrees of difficulty.
This can result in ambiguous convergence or local optima, severely compromising the separability of the feature space.
We propose an Instance-Variant loss to assign different penalty strengths to different instances, improving the space separability.
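The summary gives only the general idea of instance-dependent penalty strengths; one purely illustrative way to realize it with a Gaussian RBF kernel is to weight each instance's loss by its dissimilarity to its own class center (the weighting scheme and base loss below are assumptions, not the paper's exact Instance-Variant loss):

```python
# Illustrative sketch only: give harder instances a larger penalty by weighting each
# sample's loss with a Gaussian-RBF-based dissimilarity to its own class center.
# This is an assumed realization, not the paper's exact formulation.
import torch
import torch.nn.functional as F

def instance_weighted_loss(embeddings, targets, centers, sigma=1.0):
    # RBF similarity of every embedding to every class center, shape (B, C).
    sim = torch.exp(-torch.cdist(embeddings, centers).pow(2) / (2 * sigma ** 2))
    # Harder instances (low similarity to their own class center) get larger weights.
    own_sim = sim.gather(1, targets.unsqueeze(1)).squeeze(1)
    weights = (1.0 - own_sim).detach()
    per_instance = F.cross_entropy(sim, targets, reduction="none")
    return (weights * per_instance).mean()

embeddings = torch.randn(8, 64)
centers = torch.randn(5, 64)              # one center per class
targets = torch.randint(0, 5, (8,))
loss = instance_weighted_loss(embeddings, targets, centers)
```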
arXiv Detail & Related papers (2023-05-07T10:12:14Z)
- SIoU Loss: More Powerful Learning for Bounding Box Regression [0.0]
The SIoU loss function was proposed, with penalty metrics redefined to consider the angle of the vector between the desired regressions.
Applied to conventional neural networks and datasets, SIoU is shown to improve both the speed of training and the accuracy of inference.
arXiv Detail & Related papers (2022-05-25T12:46:21Z)
- Why Do Better Loss Functions Lead to Less Transferable Features? [93.47297944685114]
This paper studies how the choice of training objective affects the transferability of the hidden representations of convolutional neural networks trained on ImageNet.
We show that many objectives lead to statistically significant improvements in ImageNet accuracy over vanilla softmax cross-entropy, but the resulting fixed feature extractors transfer substantially worse to downstream tasks.
arXiv Detail & Related papers (2020-10-30T17:50:31Z)
- Towards Certified Robustness of Distance Metric Learning [53.96113074344632]
We advocate imposing an adversarial margin in the input space so as to improve the generalization and robustness of metric learning algorithms.
We show that the enlarged margin is beneficial to the generalization ability by using the theoretical technique of algorithmic robustness.
arXiv Detail & Related papers (2020-06-10T16:51:53Z)
- A Comparison of Metric Learning Loss Functions for End-To-End Speaker Verification [4.617249742207066]
We compare several metric learning loss functions in a systematic manner on the VoxCeleb dataset.
We show that the additive angular margin loss function outperforms all other loss functions in the study.
Based on a combination of SincNet trainable features and the x-vector architecture, the network used in this paper brings us a step closer to a truly end-to-end speaker verification system.
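The additive angular margin (AAM) softmax loss referenced here is a standard formulation; a compact sketch follows, with the scale and margin chosen as common defaults rather than the paper's settings:

```python
# Compact sketch of an additive angular margin (AAM / ArcFace-style) softmax loss:
# the target-class logit is s * cos(theta + m) and all others are s * cos(theta),
# where theta is the angle between the L2-normalized embedding and class weight.
# The scale s and margin m below are common defaults, not the paper's settings.
import torch
import torch.nn.functional as F

def additive_angular_margin_loss(embeddings, weights, targets, s=30.0, m=0.2):
    cos = F.normalize(embeddings, dim=1) @ F.normalize(weights, dim=1).t()   # (B, C)
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    target_mask = F.one_hot(targets, num_classes=weights.size(0)).bool()
    logits = torch.where(target_mask, torch.cos(theta + m), cos) * s
    return F.cross_entropy(logits, targets)

embeddings = torch.randn(8, 192)     # speaker embeddings (e.g., x-vector style)
weights = torch.randn(100, 192)      # one weight vector per training speaker
targets = torch.randint(0, 100, (8,))
loss = additive_angular_margin_loss(embeddings, weights, targets)
```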
arXiv Detail & Related papers (2020-03-31T08:36:07Z)