Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing
Mistake Severity
- URL: http://arxiv.org/abs/2303.05689v2
- Date: Wed, 9 Aug 2023 17:31:20 GMT
- Title: Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing
Mistake Severity
- Authors: Tong Liang and Jim Davis
- Abstract summary: We propose to fix the linear classifier of a deep neural network to a Hierarchy-Aware Frame (HAFrame).
We demonstrate that our approach reduces the mistake severity of the model's predictions while maintaining its top-1 accuracy on several datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: There is a recently discovered and intriguing phenomenon called Neural
Collapse: at the terminal phase of training a deep neural network for
classification, the within-class penultimate feature means and the associated
classifier vectors of all flat classes collapse to the vertices of a simplex
Equiangular Tight Frame (ETF). Recent work has tried to exploit this phenomenon
by fixing the related classifier weights to a pre-computed ETF to induce neural
collapse and maximize the separation of the learned features when training with
imbalanced data. In this work, we propose to fix the linear classifier of a
deep neural network to a Hierarchy-Aware Frame (HAFrame), instead of an ETF,
and use a cosine similarity-based auxiliary loss to learn hierarchy-aware
penultimate features that collapse to the HAFrame. We demonstrate that our
approach reduces the mistake severity of the model's predictions while
maintaining its top-1 accuracy on several datasets of varying scales with
hierarchies of heights ranging from 3 to 12. Code:
https://github.com/ltong1130ztr/HAFrame
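A minimal sketch of the general recipe the abstract describes, assuming a PyTorch-style setup: precompute a fixed frame of class vectors, copy it into the final linear layer, freeze that layer, and add a cosine-similarity auxiliary loss that pulls each penultimate feature toward its class's frame vector. The actual HAFrame is built from the label hierarchy as described in the paper; a standard simplex ETF is used below only for illustration, and the toy network, the (1 - cos) form of the auxiliary term, and the weight alpha are hypothetical choices rather than the authors' exact implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def simplex_etf(num_classes, feat_dim):
    # Standard simplex ETF: M = sqrt(K/(K-1)) * U (I_K - 11^T / K), with U having
    # orthonormal columns; each row of the returned matrix is one classifier vector.
    K = num_classes
    U, _ = torch.linalg.qr(torch.randn(feat_dim, K))   # assumes feat_dim >= K
    M = (K / (K - 1)) ** 0.5 * U @ (torch.eye(K) - torch.ones(K, K) / K)
    return M.T                                         # shape (K, feat_dim)

class Net(nn.Module):
    # Hypothetical toy backbone; any feature extractor ending in a linear classifier works.
    def __init__(self, feat_dim=512, num_classes=100):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, feat_dim), nn.ReLU())
        self.fc = nn.Linear(feat_dim, num_classes, bias=False)
    def forward(self, x):
        feats = self.backbone(x)
        return feats, self.fc(feats)

model = Net()
frame = simplex_etf(num_classes=100, feat_dim=512)
with torch.no_grad():
    model.fc.weight.copy_(frame)                       # fix the classifier to the frame
model.fc.weight.requires_grad_(False)                  # keep it frozen during training

def loss_fn(feats, logits, targets, alpha=1.0):
    # Cross-entropy on the fixed-classifier logits plus a cosine-similarity auxiliary
    # term pulling each penultimate feature toward its class's frame vector.
    ce = F.cross_entropy(logits, targets)
    cos = F.cosine_similarity(feats, frame.to(feats.device)[targets], dim=1)
    return ce + alpha * (1.0 - cos).mean()

# Example training step on random data:
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 100, (8,))
feats, logits = model(x)
loss_fn(feats, logits, y).backward()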
Related papers
- Guiding Neural Collapse: Optimising Towards the Nearest Simplex Equiangular Tight Frame [34.309687104447114]
Neural Collapse (NC) is a recently observed phenomenon in neural networks that characterises the solution space of the final classifier layer when trained until zero training loss.
We introduce the notion of nearest simplex ETF geometry for the penultimate layer features at any given training iteration.
At each iteration, the classifier weights are implicitly set to the nearest simplex ETF by solving this inner-optimisation.
Our experiments on synthetic and real-world architectures for classification tasks demonstrate that our approach accelerates convergence and enhances training stability.
arXiv Detail & Related papers (2024-11-02T13:54:31Z)
- Neural Metamorphosis [72.88137795439407]
This paper introduces a new learning paradigm termed Neural Metamorphosis (NeuMeta), which aims to build self-morphable neural networks.
NeuMeta directly learns the continuous weight manifold of neural networks.
It sustains full-size performance even at a 75% compression rate.
arXiv Detail & Related papers (2024-10-10T14:49:58Z)
- Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse [32.06666853127924]
Deep neural networks (DNNs) at convergence consistently represent the training data in the last layer via a symmetric geometric structure referred to as neural collapse.
In the commonly studied unconstrained features model, the features of the penultimate layer are free variables, which makes the model data-agnostic and, hence, puts into question its ability to capture DNN training.
We first prove generic guarantees on neural collapse that assume (i) low training error and balancedness of the linear layers, and (ii) bounded conditioning of the features before the linear part.
arXiv Detail & Related papers (2024-10-07T10:16:40Z)
- Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Feature Model [25.61363481391964]
We show that when the training dataset is class-imbalanced, some Neural Collapse (NC) properties will no longer be true.
In this paper, we generalize NC to the imbalanced regime for cross-entropy loss under the unconstrained ReLU feature model.
We find that the weights are aligned to the scaled and centered class means, with scaling factors that depend on the number of training samples in each class.
arXiv Detail & Related papers (2024-01-04T04:53:31Z)
- Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class Incremental Learning [120.53458753007851]
Few-shot class-incremental learning (FSCIL) has been a challenging problem as only a few training samples are accessible for each novel class in the new sessions.
We deal with this misalignment dilemma in FSCIL inspired by the recently discovered phenomenon named neural collapse.
We propose a neural collapse inspired framework for FSCIL. Experiments on the miniImageNet, CUB-200, and CIFAR-100 datasets demonstrate that our proposed framework outperforms the state-of-the-art performances.
arXiv Detail & Related papers (2023-02-06T18:39:40Z)
- Understanding Imbalanced Semantic Segmentation Through Neural Collapse [81.89121711426951]
We show that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes.
We introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure.
Our method ranks 1st and sets a new record on the ScanNet200 test leaderboard.
arXiv Detail & Related papers (2023-01-03T13:51:51Z)
- Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with a quadratic loss function, fully connected feedforward architecture, ReLU activations, Gaussian data instances, and adversarial labels.
These results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the "NTK regime".
arXiv Detail & Related papers (2022-12-05T14:47:52Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of learning a neural network for classification with the classifier randomly initialized as an ETF and fixed during training.
Our experimental results show that our method achieves similar performance on image classification for balanced datasets.
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- Extended Unconstrained Features Model for Exploring Deep Neural Collapse [59.59039125375527]
Recently, a phenomenon termed "neural collapse" (NC) has been empirically observed in deep neural networks.
Recent papers have shown that minimizers with this structure emerge when optimizing a simplified "unconstrained features model" (UFM).
In this paper, we study the UFM for the regularized MSE loss, and show that the minimizers' features can be more structured than in the cross-entropy case.
arXiv Detail & Related papers (2022-02-16T14:17:37Z)
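For reference, the unconstrained features model (UFM) appearing in several of the entries above treats the penultimate features of the N training samples as free variables optimized jointly with the last-layer classifier. A commonly used regularized formulation (the notation here is assumed, not taken from these summaries) is

\[
\min_{W,\,H,\,b}\; \mathcal{L}\big(W H + b\,\mathbf{1}_N^{\top},\, Y\big)
\,+\, \frac{\lambda_W}{2}\lVert W\rVert_F^2
\,+\, \frac{\lambda_H}{2}\lVert H\rVert_F^2
\,+\, \frac{\lambda_b}{2}\lVert b\rVert_2^2,
\]

where H = [h_1, ..., h_N] stacks the free features, W and b are the classifier weights and bias, Y holds the labels (one-hot targets in the MSE case), and \mathcal{L} is the training loss (cross-entropy or regularized MSE in the papers above). Neural collapse results for the UFM characterize the global minimizers of this surrogate problem rather than of end-to-end network training.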