Do We Really Need a Learnable Classifier at the End of Deep Neural
Network?
- URL: http://arxiv.org/abs/2203.09081v1
- Date: Thu, 17 Mar 2022 04:34:28 GMT
- Title: Do We Really Need a Learnable Classifier at the End of Deep Neural
Network?
- Authors: Yibo Yang, Liang Xie, Shixiang Chen, Xiangtai Li, Zhouchen Lin,
Dacheng Tao
- Abstract summary: We study the potential of learning a neural network for classification with the classifier randomly initialized as an ETF and fixed during training.
Our experimental results show that our method achieves comparable performance on image classification for balanced datasets.
- Score: 118.18554882199676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern deep neural networks for classification usually jointly learn a
backbone for representation and a linear classifier to output the logit of each
class. A recent study has shown a phenomenon called neural collapse, in which the within-class means of features and the classifier vectors converge to the vertices of a simplex equiangular tight frame (ETF) at the terminal phase of training on a balanced dataset. Since the ETF geometric structure maximally separates the pair-wise angles of all classes in the classifier, it is natural to ask: why spend effort learning a classifier when we already know its optimal geometric structure? In this paper, we study the potential of
learning a neural network for classification with the classifier randomly
initialized as an ETF and fixed during training. Our analytical work based on
the layer-peeled model indicates that feature learning with a fixed ETF
classifier naturally leads to the neural collapse state even when the dataset
is imbalanced among classes. We further show that in this case the cross
entropy (CE) loss is not necessary and can be replaced by a simple squared loss
that shares the same global optimality but enjoys a more accurate gradient and
better convergence properties. Our experimental results show that our method achieves comparable performance on image classification for balanced datasets and brings significant improvements on long-tailed and fine-grained classification tasks.
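To make the idea concrete, here is a minimal PyTorch sketch (not the authors' released code) of the two ingredients the abstract describes: a classifier whose weight matrix is a randomly rotated simplex ETF, kept frozen throughout training, and a squared loss on the target-class logit. The helper names are ours, and the exact squared-loss form is an illustrative assumption; the abstract only states that CE can be replaced by a simple squared loss with the same global optimality.
```python
import torch
import torch.nn.functional as F

def simplex_etf(num_classes: int, feat_dim: int) -> torch.Tensor:
    """Random simplex ETF: K unit vectors in R^d with pairwise cosine -1/(K-1).

    Standard construction M = sqrt(K/(K-1)) * U @ (I_K - (1/K) * 1 1^T),
    with U a random feat_dim x K matrix of orthonormal columns.
    """
    K = num_classes
    assert feat_dim >= K, "orthonormal U needs feat_dim >= num_classes"
    U, _ = torch.linalg.qr(torch.randn(feat_dim, K))  # U^T U = I_K
    M = (K / (K - 1)) ** 0.5 * U @ (torch.eye(K) - torch.ones(K, K) / K)
    return M  # feat_dim x K; columns have unit norm

class FixedETFClassifier(torch.nn.Module):
    """Linear classifier fixed to ETF vertices; it has no trainable parameters."""

    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        # A buffer is saved with the model but excluded from parameters(),
        # so optimizers leave the classifier untouched during training.
        self.register_buffer("weight", simplex_etf(num_classes, feat_dim))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Normalize features so each logit is the cosine to a class vertex.
        return F.normalize(features, dim=1) @ self.weight

def squared_target_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Illustrative 'simple squared loss': pull the target-class cosine to 1.

    (An assumption for this sketch; the paper's exact form may differ.)
    """
    target_logit = logits.gather(1, targets.unsqueeze(1)).squeeze(1)
    return 0.5 * ((target_logit - 1.0) ** 2).mean()
```
Because `weight` is registered as a buffer rather than a parameter, `model.parameters()` exposes only the backbone weights, so a standard optimizer trains the representation while the classifier geometry stays fixed.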
Related papers
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud
Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and classifier learning in an alternating optimization manner to shift the biased decision boundary effectively.
arXiv Detail & Related papers (2024-01-13T04:16:40Z)
- Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Feature Model [25.61363481391964]
We show that when the training dataset is class-imbalanced, some Neural Collapse (NC) properties no longer hold.
In this paper, we generalize NC to the imbalanced regime for cross-entropy loss under the unconstrained ReLU feature model.
We find that the weights are aligned to the scaled and centered class-means, with scaling factors that depend on the number of training samples of each class.
arXiv Detail & Related papers (2024-01-04T04:53:31Z)
- No Fear of Classifier Biases: Neural Collapse Inspired Federated Learning with Synthetic and Fixed Classifier [10.491645205483051]
We propose a solution to the classifier bias problem in federated learning (FL) by utilizing a synthetic and fixed ETF classifier during training.
We devise several effective modules to better adapt the ETF structure in FL, achieving both high generalization and personalization.
Our method achieves state-of-the-art performance on CIFAR-10, CIFAR-100, and Tiny-ImageNet.
arXiv Detail & Related papers (2023-03-17T15:38:39Z)
- Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing Mistake Severity [0.0]
We propose to fix the linear classifier of a deep neural network to a Hierarchy-Aware Frame (HAFrame).
We demonstrate that our approach reduces the mistake severity of the model's predictions while maintaining its top-1 accuracy on several datasets.
arXiv Detail & Related papers (2023-03-10T03:44:01Z)
- Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class-Incremental Learning [120.53458753007851]
Few-shot class-incremental learning (FSCIL) has been a challenging problem as only a few training samples are accessible for each novel class in the new sessions.
We deal with this misalignment dilemma in FSCIL inspired by the recently discovered phenomenon named neural collapse.
We propose a neural collapse inspired framework for FSCIL. Experiments on the miniImageNet, CUB-200, and CIFAR-100 datasets demonstrate that our proposed framework outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-02-06T18:39:40Z)
- Understanding Imbalanced Semantic Segmentation Through Neural Collapse [81.89121711426951]
We show that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes.
We introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure.
Our method ranks 1st and sets a new record on the ScanNet200 test leaderboard.
arXiv Detail & Related papers (2023-01-03T13:51:51Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between a test image and its top-k neighbors in the training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
- A Geometric Analysis of Neural Collapse with Unconstrained Features [40.66585948844492]
We provide the first global optimization landscape analysis of Neural Collapse.
This phenomenon arises in the last-layer classifiers and features of neural networks during the terminal phase of training.
arXiv Detail & Related papers (2021-05-06T00:00:50Z)