Hypernym Bias: Unraveling Deep Classifier Training Dynamics through the Lens of Class Hierarchy
- URL: http://arxiv.org/abs/2502.12125v1
- Date: Mon, 17 Feb 2025 18:47:01 GMT
- Title: Hypernym Bias: Unraveling Deep Classifier Training Dynamics through the Lens of Class Hierarchy
- Authors: Roman Malashin, Valeria Yachnaya, Alexander Mullin
- Abstract summary: We argue that the learning process in classification problems can be understood through the lens of label clustering. Specifically, we observe that networks tend to distinguish higher-level (hypernym) categories in the early stages of training. We introduce a novel framework to track the evolution of the feature manifold during training, revealing how the hierarchy of class relations emerges.
- Score: 44.99833362998488
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We investigate the training dynamics of deep classifiers by examining how hierarchical relationships between classes evolve during training. Through extensive experiments, we argue that the learning process in classification problems can be understood through the lens of label clustering. Specifically, we observe that networks tend to distinguish higher-level (hypernym) categories in the early stages of training, and learn more specific (hyponym) categories later. We introduce a novel framework to track the evolution of the feature manifold during training, revealing how the hierarchy of class relations emerges and refines across the network layers. Our analysis demonstrates that the learned representations closely align with the semantic structure of the dataset, providing a quantitative description of the clustering process. Notably, we show that in the hypernym label space, certain properties of neural collapse appear earlier than in the hyponym label space, helping to bridge the gap between the initial and terminal phases of learning. We believe our findings offer new insights into the mechanisms driving hierarchical learning in deep networks, paving the way for future advancements in understanding deep learning dynamics.
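One way to make the hypernym-vs.-hyponym claim concrete is to measure a neural-collapse-style within-/between-class variability ratio in both label spaces at successive training checkpoints. The sketch below is a minimal illustration under stated assumptions, not the authors' framework: it presumes you already have penultimate-layer features, fine (hyponym) labels, and a fine-to-hypernym mapping (e.g., derived from WordNet superclasses); the function name `variability_ratio` and the toy data are hypothetical.

```python
# Minimal sketch (not the authors' code): a neural-collapse-style
# within-/between-class variability probe evaluated at hypernym and
# hyponym granularity. All names and the toy data are hypothetical.

import numpy as np


def variability_ratio(features: np.ndarray, labels: np.ndarray) -> float:
    """Within-class scatter divided by between-class scatter.

    Lower values mean features are more tightly collapsed onto class means
    relative to how far the class means are spread apart.
    """
    global_mean = features.mean(axis=0)
    within, between = 0.0, 0.0
    for c in np.unique(labels):
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        within += ((class_feats - class_mean) ** 2).sum()
        between += len(class_feats) * ((class_mean - global_mean) ** 2).sum()
    return within / between


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy setup: 10 fine (hyponym) classes grouped into 2 hypernyms.
    fine_labels = rng.integers(0, 10, size=2000)
    hypernym_of = np.array([0] * 5 + [1] * 5)   # fine class -> hypernym index
    coarse_labels = hypernym_of[fine_labels]
    # Synthetic features with a strong hypernym signal and a weak hyponym
    # signal, mimicking the early-training regime described in the abstract.
    feats = rng.normal(size=(2000, 64))
    feats[:, 0] += 5.0 * coarse_labels
    feats[:, 1] += 0.5 * fine_labels
    print("hyponym  ratio:", variability_ratio(feats, fine_labels))
    print("hypernym ratio:", variability_ratio(feats, coarse_labels))
```

On the toy data, the hypernym-level ratio comes out lower than the hyponym-level one, mimicking the early-training regime in which coarse categories are already well separated while fine categories are not; tracking such a statistic over checkpoints is one plausible way to observe the dynamics the abstract describes.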
Related papers
- Semantic Depth Matters: Explaining Errors of Deep Vision Networks through Perceived Class Similarities [0.0]
We introduce a novel framework that investigates the relationship between the semantic hierarchy depth perceived by a network and its real-data misclassification patterns.
We propose a graph-based visualization of model semantic relationships and misperceptions.
Our approach reveals that deep vision networks encode specific semantic hierarchies and that greater semantic depth improves the alignment between perceived class similarities and actual errors.
arXiv Detail & Related papers (2025-04-14T07:44:34Z)
- Leveraging Hierarchical Taxonomies in Prompt-based Continual Learning [41.13568563835089]
We find that applying human habits of organizing and connecting information can serve as an efficient strategy when training deep learning models.
We propose a novel regularization loss function that encourages models to focus more on challenging knowledge areas.
arXiv Detail & Related papers (2024-10-06T01:30:40Z)
- Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination [33.273226655730326]
We show that each layer of a deep linear network progressively compresses within-class features at a geometric rate and discriminates between-class features at a linear rate.
This is the first quantitative characterization of feature evolution in hierarchical representations of deep linear networks.
arXiv Detail & Related papers (2023-11-06T09:00:38Z)
- How Deep Neural Networks Learn Compositional Data: The Random Hierarchy Model [47.617093812158366]
We introduce the Random Hierarchy Model: a family of synthetic tasks inspired by the hierarchical structure of language and images.
We find that deep networks learn the task by developing internal representations invariant to exchanging equivalent groups.
Our results indicate how deep networks overcome the curse of dimensionality by building invariant representations.
arXiv Detail & Related papers (2023-07-05T09:11:09Z)
- A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions [48.97008907275482]
Clustering is a fundamental machine learning task which has been widely studied in the literature.
Deep Clustering, i.e., jointly optimizing representation learning and clustering, has been proposed and has attracted growing attention in the community.
We summarize the essential components of deep clustering and categorize existing methods by the ways they design interactions between deep representation learning and clustering.
arXiv Detail & Related papers (2022-06-15T15:05:13Z)
- Towards understanding deep learning with the natural clustering prior [3.8073142980733]
This thesis investigates the implicit integration of a natural clustering prior composed of three statements.
The decomposition of classes into multiple clusters implies that supervised deep learning systems could benefit from unsupervised clustering to define appropriate decision boundaries.
We do so through an extensive empirical study of the training dynamics as well as the neuron- and layer-level representations of deep neural networks.
arXiv Detail & Related papers (2022-03-15T18:07:37Z)
- Nearest Class-Center Simplification through Intermediate Layers [0.0]
Recent advances in theoretical Deep Learning have introduced geometric properties that occur during training, past the Interpolation Threshold.
We inquire into the phenomenon known as Neural Collapse in the intermediate layers of networks, and examine the inner workings of Nearest Class-Center Mismatch inside the deep network.
arXiv Detail & Related papers (2022-01-21T23:21:26Z)
- Encoding Hierarchical Information in Neural Networks helps in Subpopulation Shift [8.01009207457926]
Deep neural networks have proven to be adept in image classification tasks, often surpassing humans in terms of accuracy.
In this work, we study the aforementioned problems through the lens of a novel conditional supervised training framework.
We show that learning in this structured hierarchical manner results in networks that are more robust against subpopulation shifts.
arXiv Detail & Related papers (2021-12-20T20:26:26Z)
- Long-tail Recognition via Compositional Knowledge Transfer [60.03764547406601]
We introduce a novel strategy for long-tail recognition that addresses the tail classes' few-shot problem.
Our objective is to transfer knowledge acquired from information-rich common classes to semantically similar, and yet data-hungry, rare classes.
Experiments show that our approach can achieve significant performance boosts on rare classes while maintaining robust common class performance.
arXiv Detail & Related papers (2021-12-13T15:48:59Z)
- A Comprehensive Survey on Community Detection with Deep Learning [93.40332347374712]
A community reveals features and connections of its members that distinguish it from other communities in a network.
This survey devises and proposes a new taxonomy covering different categories of the state-of-the-art methods.
The main category, i.e., deep neural networks, is further divided into convolutional networks, graph attention networks, generative adversarial networks and autoencoders.
arXiv Detail & Related papers (2021-05-26T14:37:07Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and offers adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- The large learning rate phase of deep learning: the catapult mechanism [50.23041928811575]
We present a class of neural networks with solvable training dynamics.
We find good agreement between our model's predictions and training dynamics in realistic deep learning settings.
We believe our results shed light on characteristics of models trained at different learning rates.
arXiv Detail & Related papers (2020-03-04T17:52:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.