Redesigning the classification layer by randomizing the class
representation vectors
- URL: http://arxiv.org/abs/2011.08704v2
- Date: Sun, 29 Nov 2020 08:32:23 GMT
- Title: Redesigning the classification layer by randomizing the class
representation vectors
- Authors: Gabi Shalev and Gal-Lev Shalev and Joseph Keshet
- Abstract summary: We analyze how simple design choices for the classification layer affect the learning dynamics.
We show that the standard cross-entropy training implicitly captures visual similarities between different classes.
We propose to draw the class vectors randomly and set them as fixed during training, thus invalidating the visual similarities encoded in these vectors.
- Score: 12.953517767147998
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural image classification models typically consist of two components. The
first is an image encoder, which is responsible for encoding a given raw image
into a representative vector. The second is the classification component, which
is often implemented by projecting the representative vector onto target class
vectors. The target class vectors, along with the rest of the model parameters,
are estimated so as to minimize the loss function. In this paper, we analyze
how simple design choices for the classification layer affect the learning
dynamics. We show that the standard cross-entropy training implicitly captures
visual similarities between different classes, which might degrade accuracy
or even prevent some models from converging. We propose to draw the class
vectors randomly and set them as fixed during training, thus invalidating the
visual similarities encoded in these vectors. We analyze the effects of keeping
the class vectors fixed and show that it can increase the inter-class
separability, intra-class compactness, and the overall model accuracy, while
maintaining the robustness to image corruptions and the generalization of the
learned concepts.
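The core idea can be sketched in a few lines of NumPy: draw the class vectors once at random, keep them fixed, and train only the encoder against them. This is a minimal illustration of the abstract's description, not the authors' implementation; the unit normalization of the class vectors and all dimensions here are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, feat_dim = 10, 64

# Class vectors are drawn once at random and then frozen: they receive no
# gradient updates during training (unit normalization is an assumption).
class_vectors = rng.standard_normal((num_classes, feat_dim))
class_vectors /= np.linalg.norm(class_vectors, axis=1, keepdims=True)

def logits(features):
    # Project the encoder's representative vectors onto the fixed class vectors.
    return features @ class_vectors.T

def cross_entropy(features, labels):
    z = logits(features)
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Stand-in for encoder outputs on a mini-batch of four images; in practice
# only the encoder producing `features` would be optimized.
features = rng.standard_normal((4, feat_dim))
labels = np.array([0, 1, 2, 3])
loss = cross_entropy(features, labels)
```

Because the class vectors never move, any visual similarity between classes cannot be encoded in them, which is exactly the effect the paper sets out to study.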
Related papers
- Knowledge Composition using Task Vectors with Learned Anisotropic Scaling [51.4661186662329]
We introduce aTLAS, an algorithm that linearly combines parameter blocks with different learned coefficients, resulting in anisotropic scaling at the task vector level.
We show that such linear combinations explicitly exploit the low intrinsic dimensionality of pre-trained models, with only a few coefficients being the learnable parameters.
We demonstrate the effectiveness of our method in task arithmetic, few-shot recognition and test-time adaptation, with supervised or unsupervised objectives.
arXiv Detail & Related papers (2024-07-03T07:54:08Z)
- Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive Learning for Text Classification [11.072083437769093]
We propose a novel model named SharpReCL for imbalanced text classification tasks.
Our model even outperforms popular large language models across several datasets.
arXiv Detail & Related papers (2024-05-19T11:33:49Z)
- Ablation Study to Clarify the Mechanism of Object Segmentation in Multi-Object Representation Learning [3.921076451326107]
Multi-object representation learning aims to represent complex real-world visual input using the composition of multiple objects.
It is not clear how previous methods have achieved the appropriate segmentation of individual objects.
Most of the previous methods regularize the latent vectors using a Variational Autoencoder (VAE)
arXiv Detail & Related papers (2023-10-05T02:59:48Z)
- Unicom: Universal and Compact Representation Learning for Image Retrieval [65.96296089560421]
We cluster the large-scale LAION400M into one million pseudo classes based on the joint textual and visual features extracted by the CLIP model.
To alleviate the inter-class conflict that arises within these pseudo classes, we randomly select partial inter-class prototypes to construct the margin-based softmax loss.
Our method significantly outperforms state-of-the-art unsupervised and supervised image retrieval approaches on multiple benchmarks.
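One plausible reading of "randomly select partial inter-class prototypes" is a softmax loss computed over the positive class prototype plus a random subset of negative prototypes. The sketch below illustrates that reading only; the sampling ratio, margin, and scale values are assumptions, not taken from the Unicom paper.

```python
import numpy as np

rng = np.random.default_rng(1)
num_classes, feat_dim, sample_ratio = 1000, 32, 0.1

# Hypothetical class prototypes, unit-normalized so dot products are cosines.
prototypes = rng.standard_normal((num_classes, feat_dim))
prototypes /= np.linalg.norm(prototypes, axis=1, keepdims=True)

def partial_margin_softmax_loss(feature, label, margin=0.3, scale=16.0):
    # Keep the positive prototype and a random subset of the negatives,
    # and apply an additive cosine margin to the positive logit.
    feature = feature / np.linalg.norm(feature)
    negatives = np.delete(np.arange(num_classes), label)
    keep = rng.choice(negatives, int(sample_ratio * num_classes), replace=False)
    cos_pos = prototypes[label] @ feature
    cos_neg = prototypes[keep] @ feature
    z = scale * np.concatenate([[cos_pos - margin], cos_neg])
    z -= z.max()  # shift for numerical stability
    return -(z[0] - np.log(np.exp(z).sum()))

loss = partial_margin_softmax_loss(rng.standard_normal(feat_dim), label=3)
```

Sampling only a fraction of the negatives keeps the loss tractable when the number of pseudo classes runs into the millions.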
arXiv Detail & Related papers (2023-04-12T14:25:52Z)
- Neural Representations Reveal Distinct Modes of Class Fitting in Residual Convolutional Networks [5.1271832547387115]
We leverage probabilistic models of neural representations to investigate how residual networks fit classes.
We find that classes in the investigated models are not fitted in a uniform way.
We show that the uncovered structure in neural representations correlates with the robustness of training examples and with adversarial memorization.
arXiv Detail & Related papers (2022-12-01T18:55:58Z)
- Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
The experimental results on texture and histopathologic image datasets have shown that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence when compared to equivalent CNNs.
arXiv Detail & Related papers (2022-06-17T04:07:45Z)
- On the rate of convergence of a classifier based on a Transformer encoder [55.41148606254641]
The rate of convergence of the misclassification probability of the classifier towards the optimal misclassification probability is analyzed.
It is shown that this classifier is able to circumvent the curse of dimensionality provided the a posteriori probability satisfies a suitable hierarchical composition model.
arXiv Detail & Related papers (2021-11-29T14:58:29Z)
- GAN for Vision, KG for Relation: a Two-stage Deep Network for Zero-shot Action Recognition [33.23662792742078]
We propose a two-stage deep neural network for zero-shot action recognition.
In the sampling stage, we utilize a generative adversarial network (GAN) trained on action features and word vectors of seen classes.
In the classification stage, we construct a knowledge graph based on the relationship between word vectors of action classes and related objects.
arXiv Detail & Related papers (2021-05-25T09:34:42Z)
- Counterfactual Generative Networks [59.080843365828756]
We propose to decompose the image generation process into independent causal mechanisms that we train without direct supervision.
By exploiting appropriate inductive biases, these mechanisms disentangle object shape, object texture, and background.
We show that the counterfactual images can improve out-of-distribution robustness with only a marginal drop in performance on the original classification task.
arXiv Detail & Related papers (2021-01-15T10:23:12Z)
- Learning and Evaluating Representations for Deep One-class Classification [59.095144932794646]
We present a two-stage framework for deep one-class classification.
We first learn self-supervised representations from one-class data, and then build one-class classifiers on learned representations.
In experiments, we demonstrate state-of-the-art performance on visual domain one-class classification benchmarks.
arXiv Detail & Related papers (2020-11-04T23:33:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.