Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing
Mistake Severity
- URL: http://arxiv.org/abs/2303.05689v2
- Date: Wed, 9 Aug 2023 17:31:20 GMT
- Title: Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing
Mistake Severity
- Authors: Tong Liang and Jim Davis
- Abstract summary: We propose to fix the linear classifier of a deep neural network to a Hierarchy-Aware Frame (HAFrame).
We demonstrate that our approach reduces the mistake severity of the model's predictions while maintaining its top-1 accuracy on several datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: There is a recently discovered and intriguing phenomenon called Neural
Collapse: at the terminal phase of training a deep neural network for
classification, the within-class penultimate feature means and the associated
classifier vectors of all flat classes collapse to the vertices of a simplex
Equiangular Tight Frame (ETF). Recent work has tried to exploit this phenomenon
by fixing the related classifier weights to a pre-computed ETF to induce neural
collapse and maximize the separation of the learned features when training with
imbalanced data. In this work, we propose to fix the linear classifier of a
deep neural network to a Hierarchy-Aware Frame (HAFrame), instead of an ETF,
and use a cosine similarity-based auxiliary loss to learn hierarchy-aware
penultimate features that collapse to the HAFrame. We demonstrate that our
approach reduces the mistake severity of the model's predictions while
maintaining its top-1 accuracy on several datasets of varying scales with
hierarchies of heights ranging from 3 to 12. Code:
https://github.com/ltong1130ztr/HAFrame
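A minimal sketch of the general recipe the abstract describes, assuming a PyTorch-style setup: precompute a fixed frame of class vectors, copy it into the final linear layer, freeze that layer, and add a cosine-similarity auxiliary loss that pulls each penultimate feature toward its class's frame vector. The actual HAFrame is built from the label hierarchy as described in the paper; a standard simplex ETF is used below only for illustration, and the toy network, the (1 - cos) form of the auxiliary term, and the weight alpha are hypothetical choices rather than the authors' exact implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def simplex_etf(num_classes, feat_dim):
    # Standard simplex ETF: M = sqrt(K/(K-1)) * U (I_K - 11^T / K), with U having
    # orthonormal columns; each row of the returned matrix is one classifier vector.
    K = num_classes
    U, _ = torch.linalg.qr(torch.randn(feat_dim, K))   # assumes feat_dim >= K
    M = (K / (K - 1)) ** 0.5 * U @ (torch.eye(K) - torch.ones(K, K) / K)
    return M.T                                         # shape (K, feat_dim)

class Net(nn.Module):
    # Hypothetical toy backbone; any feature extractor ending in a linear classifier works.
    def __init__(self, feat_dim=512, num_classes=100):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, feat_dim), nn.ReLU())
        self.fc = nn.Linear(feat_dim, num_classes, bias=False)
    def forward(self, x):
        feats = self.backbone(x)
        return feats, self.fc(feats)

model = Net()
frame = simplex_etf(num_classes=100, feat_dim=512)
with torch.no_grad():
    model.fc.weight.copy_(frame)                       # fix the classifier to the frame
model.fc.weight.requires_grad_(False)                  # keep it frozen during training

def loss_fn(feats, logits, targets, alpha=1.0):
    # Cross-entropy on the fixed-classifier logits plus a cosine-similarity auxiliary
    # term pulling each penultimate feature toward its class's frame vector.
    ce = F.cross_entropy(logits, targets)
    cos = F.cosine_similarity(feats, frame.to(feats.device)[targets], dim=1)
    return ce + alpha * (1.0 - cos).mean()

# Example training step on random data:
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 100, (8,))
feats, logits = model(x)
loss_fn(feats, logits, y).backward()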
Related papers
- Guiding Neural Collapse: Optimising Towards the Nearest Simplex Equiangular Tight Frame [34.309687104447114]
Neural Collapse (NC) is a recently observed phenomenon in neural networks that characterises the solution space of the final classifier layer when trained until zero training loss.
We introduce the notion of nearest simplex ETF geometry for the penultimate layer features at any given training iteration.
At each iteration, the classifier weights are implicitly set to the nearest simplex ETF by solving this inner-optimisation.
Our experiments on synthetic and real-world architectures for classification tasks demonstrate that our approach accelerates convergence and enhances training stability.
arXiv Detail & Related papers (2024-11-02T13:54:31Z)
- Neural Metamorphosis [72.88137795439407]
This paper introduces a new learning paradigm termed Neural Metamorphosis (NeuMeta), which aims to build self-morphable neural networks.
NeuMeta directly learns the continuous weight manifold of neural networks.
It sustains full-size performance even at a 75% compression rate.
arXiv Detail & Related papers (2024-10-10T14:49:58Z)
- Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse [32.06666853127924]
Deep neural networks (DNNs) at convergence consistently represent the training data in the last layer via a symmetric geometric structure referred to as neural collapse.
In the commonly studied unconstrained features model, the features of the penultimate layer are free variables, which makes the model data-agnostic and, hence, puts into question its ability to capture DNN training.
We first prove generic guarantees on neural collapse that assume (i) low training error and balancedness of the linear layers, and (ii) bounded conditioning of the features before the linear part.
arXiv Detail & Related papers (2024-10-07T10:16:40Z)
- Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Feature Model [25.61363481391964]
We show that when the training dataset is class-imbalanced, some Neural Collapse (NC) properties will no longer be true.
In this paper, we generalize NC to the imbalanced regime for cross-entropy loss under the unconstrained ReLU feature model.
We find that the weights are aligned to the scaled and centered class means, with scaling factors that depend on the number of training samples in each class.
arXiv Detail & Related papers (2024-01-04T04:53:31Z)
- Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class Incremental Learning [120.53458753007851]
Few-shot class-incremental learning (FSCIL) has been a challenging problem as only a few training samples are accessible for each novel class in the new sessions.
We deal with this misalignment dilemma in FSCIL inspired by the recently discovered phenomenon named neural collapse.
We propose a neural collapse inspired framework for FSCIL. Experiments on the miniImageNet, CUB-200, and CIFAR-100 datasets demonstrate that our proposed framework outperforms the state-of-the-art performances.
arXiv Detail & Related papers (2023-02-06T18:39:40Z)
- Understanding Imbalanced Semantic Segmentation Through Neural Collapse [81.89121711426951]
We show that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes.
We introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure.
Our method ranks 1st and sets a new record on the ScanNet200 test leaderboard.
arXiv Detail & Related papers (2023-01-03T13:51:51Z)
- Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with a quadratic loss function, fully connected feedforward architecture, ReLU activations, Gaussian data instances, and adversarial labels.
These results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the "NTK regime".
arXiv Detail & Related papers (2022-12-05T14:47:52Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of learning a neural network for classification with the classifier randomly initialized as an ETF and fixed during training.
Our experimental results show that our method achieves similar performance on image classification for balanced datasets.
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- Extended Unconstrained Features Model for Exploring Deep Neural Collapse [59.59039125375527]
Recently, a phenomenon termed "neural collapse" (NC) has been empirically observed in deep neural networks.
Recent papers have shown that minimizers with this structure emerge when optimizing a simplified "unconstrained features model" (UFM).
In this paper, we study the UFM for the regularized MSE loss, and show that the minimizers' features can be more structured than in the cross-entropy case.
arXiv Detail & Related papers (2022-02-16T14:17:37Z)
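For reference, the unconstrained features model (UFM) appearing in several of the entries above treats the penultimate features of the N training samples as free variables optimized jointly with the last-layer classifier. A commonly used regularized formulation (the notation here is assumed, not taken from these summaries) is

\[
\min_{W,\,H,\,b}\; \mathcal{L}\big(W H + b\,\mathbf{1}_N^{\top},\, Y\big)
\,+\, \frac{\lambda_W}{2}\lVert W\rVert_F^2
\,+\, \frac{\lambda_H}{2}\lVert H\rVert_F^2
\,+\, \frac{\lambda_b}{2}\lVert b\rVert_2^2,
\]

where H = [h_1, ..., h_N] stacks the free features, W and b are the classifier weights and bias, Y holds the labels (one-hot targets in the MSE case), and \mathcal{L} is the training loss (cross-entropy or regularized MSE in the papers above). Neural collapse results for the UFM characterize the global minimizers of this surrogate problem rather than of end-to-end network training.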