Do We Really Need a Learnable Classifier at the End of Deep Neural
Network?
- URL: http://arxiv.org/abs/2203.09081v1
- Date: Thu, 17 Mar 2022 04:34:28 GMT
- Title: Do We Really Need a Learnable Classifier at the End of Deep Neural
Network?
- Authors: Yibo Yang, Liang Xie, Shixiang Chen, Xiangtai Li, Zhouchen Lin,
Dacheng Tao
- Abstract summary: We study the potential of learning a neural network for classification with the classifier randomly initialized as an ETF and fixed during training.
Our experimental results show that our method achieves comparable performance on image classification for balanced datasets.
- Score: 118.18554882199676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern deep neural networks for classification usually jointly learn a
backbone for representation and a linear classifier to output the logit of each
class. A recent study has shown a phenomenon called neural collapse, in which the within-class means of features and the classifier vectors converge to the vertices of a simplex equiangular tight frame (ETF) at the terminal phase of training on a balanced dataset. Since the ETF geometric structure maximally separates the pair-wise angles of all classes in the classifier, it is natural to ask: why spend effort learning a classifier when we already know its optimal geometric structure? In this paper, we study the potential of
learning a neural network for classification with the classifier randomly
initialized as an ETF and fixed during training. Our analytical work based on
the layer-peeled model indicates that feature learning with a fixed ETF
classifier naturally leads to the neural collapse state even when the dataset
is imbalanced among classes. We further show that in this case the cross
entropy (CE) loss is not necessary and can be replaced by a simple squared loss
that shares the same global optimality but enjoys a more accurate gradient and
better convergence properties. Our experimental results show that our method achieves comparable performance on image classification for balanced datasets and brings significant improvements on long-tailed and fine-grained classification tasks.
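To make the idea concrete, here is a minimal PyTorch sketch (not the authors' released code) of the two ingredients the abstract describes: a classifier whose weight matrix is a randomly rotated simplex ETF, kept frozen throughout training, and a squared loss on the target-class logit. The helper names are ours, and the exact squared-loss form is an illustrative assumption; the abstract only states that CE can be replaced by a simple squared loss with the same global optimality.
```python
import torch
import torch.nn.functional as F

def simplex_etf(num_classes: int, feat_dim: int) -> torch.Tensor:
    """Random simplex ETF: K unit vectors in R^d with pairwise cosine -1/(K-1).

    Standard construction M = sqrt(K/(K-1)) * U @ (I_K - (1/K) * 1 1^T),
    with U a random feat_dim x K matrix of orthonormal columns.
    """
    K = num_classes
    assert feat_dim >= K, "orthonormal U needs feat_dim >= num_classes"
    U, _ = torch.linalg.qr(torch.randn(feat_dim, K))  # U^T U = I_K
    M = (K / (K - 1)) ** 0.5 * U @ (torch.eye(K) - torch.ones(K, K) / K)
    return M  # feat_dim x K; columns have unit norm

class FixedETFClassifier(torch.nn.Module):
    """Linear classifier fixed to ETF vertices; it has no trainable parameters."""

    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        # A buffer is saved with the model but excluded from parameters(),
        # so optimizers leave the classifier untouched during training.
        self.register_buffer("weight", simplex_etf(num_classes, feat_dim))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Normalize features so each logit is the cosine to a class vertex.
        return F.normalize(features, dim=1) @ self.weight

def squared_target_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Illustrative 'simple squared loss': pull the target-class cosine to 1.

    (An assumption for this sketch; the paper's exact form may differ.)
    """
    target_logit = logits.gather(1, targets.unsqueeze(1)).squeeze(1)
    return 0.5 * ((target_logit - 1.0) ** 2).mean()
```
Because `weight` is registered as a buffer rather than a parameter, `model.parameters()` exposes only the backbone weights, so a standard optimizer trains the representation while the classifier geometry stays fixed.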
Related papers
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud
Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and classifier learning in an alternating optimization manner to shift the biased decision boundary effectively.
arXiv Detail & Related papers (2024-01-13T04:16:40Z)
- Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Feature Model [25.61363481391964]
We show that when the training dataset is class-imbalanced, some Neural Collapse (NC) properties no longer hold.
In this paper, we generalize NC to the imbalanced regime for cross-entropy loss under the unconstrained ReLU feature model.
We find that the weights are aligned to the scaled and centered class-means, with scaling factors that depend on the number of training samples of each class.
arXiv Detail & Related papers (2024-01-04T04:53:31Z)
- No Fear of Classifier Biases: Neural Collapse Inspired Federated Learning with Synthetic and Fixed Classifier [10.491645205483051]
We propose a solution to the classifier bias problem in federated learning (FL) by utilizing a synthetic and fixed ETF classifier during training.
We devise several effective modules to better adapt the ETF structure in FL, achieving both high generalization and personalization.
Our method achieves state-of-the-art performance on CIFAR-10, CIFAR-100, and Tiny-ImageNet.
arXiv Detail & Related papers (2023-03-17T15:38:39Z)
- Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing Mistake Severity [0.0]
We propose to fix the linear classifier of a deep neural network to a Hierarchy-Aware Frame (HAFrame).
We demonstrate that our approach reduces the mistake severity of the model's predictions while maintaining its top-1 accuracy on several datasets.
arXiv Detail & Related papers (2023-03-10T03:44:01Z)
- Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class-Incremental Learning [120.53458753007851]
Few-shot class-incremental learning (FSCIL) has been a challenging problem as only a few training samples are accessible for each novel class in the new sessions.
We deal with this misalignment dilemma in FSCIL inspired by the recently discovered phenomenon named neural collapse.
We propose a neural collapse inspired framework for FSCIL. Experiments on the miniImageNet, CUB-200, and CIFAR-100 datasets demonstrate that our proposed framework outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-02-06T18:39:40Z)
- Understanding Imbalanced Semantic Segmentation Through Neural Collapse [81.89121711426951]
We show that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes.
We introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure.
Our method ranks 1st and sets a new record on the ScanNet200 test leaderboard.
arXiv Detail & Related papers (2023-01-03T13:51:51Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between a test image and its top-k neighbors in the training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
- A Geometric Analysis of Neural Collapse with Unconstrained Features [40.66585948844492]
We provide the first global optimization landscape analysis of Neural Collapse.
This phenomenon arises in the last-layer classifiers and features of neural networks during the terminal phase of training.
arXiv Detail & Related papers (2021-05-06T00:00:50Z)