Encoding Hierarchical Information in Neural Networks helps in Subpopulation Shift
- URL: http://arxiv.org/abs/2112.10844v1
- Date: Mon, 20 Dec 2021 20:26:26 GMT
- Title: Encoding Hierarchical Information in Neural Networks helps in Subpopulation Shift
- Authors: Amitangshu Mukherjee, Isha Garg and Kaushik Roy
- Abstract summary: Deep neural networks have proven to be adept in image classification tasks, often surpassing humans in terms of accuracy.
In this work, we study the aforementioned problems through the lens of a novel conditional supervised training framework.
We show that learning in this structured hierarchical manner results in networks that are more robust against subpopulation shifts.
- Score: 8.01009207457926
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the past decade, deep neural networks have proven to be adept in image
classification tasks, often surpassing humans in terms of accuracy. However,
standard neural networks often fail to understand the concept of hierarchical
structures and dependencies among different classes for vision related tasks.
Humans, on the other hand, seem to learn categories conceptually, progressively
working from an understanding of high-level concepts down to granular levels of
categories. One of the issues arising from the inability of neural networks to
encode such dependencies within their learned structure is that of subpopulation
shift -- where models are queried with novel unseen classes taken from a
shifted population of the training set categories. Since the neural network
treats each class as independent from all others, it struggles to categorize
shifting populations that are dependent at higher levels of the hierarchy. In
this work, we study the aforementioned problems through the lens of a novel
conditional supervised training framework. We tackle subpopulation shift by a
structured learning procedure that incorporates hierarchical information
conditionally through labels. Furthermore, we introduce a notion of graphical
distance to model the catastrophic effect of mispredictions. We show that
learning in this structured hierarchical manner results in networks that are
more robust against subpopulation shifts, with an improvement of around 2% in
terms of accuracy and around 8.5% in terms of graphical distance over standard
models on subpopulation shift benchmarks.
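The abstract describes the framework only at a high level. As a rough illustration, here is a minimal PyTorch sketch of one plausible reading, assuming a two-level hierarchy (superclass to fine class): the fine-grained head is conditioned on the superclass label, and graphical distance is taken to be hop distance in the label tree. The module names, the concatenation-based conditioning, and the distance definition are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch, not the authors' code: conditional supervision over a
# two-level label hierarchy, plus a hop-distance "graphical distance".
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalNet(nn.Module):
    def __init__(self, feat_dim=512, n_coarse=20, n_fine=100):
        super().__init__()
        # stand-in for any convolutional feature extractor
        self.backbone = nn.Sequential(nn.Flatten(), nn.LazyLinear(feat_dim), nn.ReLU())
        self.coarse_head = nn.Linear(feat_dim, n_coarse)
        # the fine head also receives a one-hot encoding of the coarse label,
        # so fine-grained predictions are conditioned on the hierarchy
        self.fine_head = nn.Linear(feat_dim + n_coarse, n_fine)

    def forward(self, x, y_coarse):
        h = self.backbone(x)
        coarse_logits = self.coarse_head(h)
        cond = F.one_hot(y_coarse, coarse_logits.size(1)).float()
        fine_logits = self.fine_head(torch.cat([h, cond], dim=1))
        return coarse_logits, fine_logits

def hierarchical_loss(coarse_logits, fine_logits, y_coarse, y_fine):
    # supervise both levels; conditioning discourages fine-grained errors
    # that cross superclass boundaries
    return (F.cross_entropy(coarse_logits, y_coarse)
            + F.cross_entropy(fine_logits, y_fine))

def graphical_distance(pred_fine, true_fine, fine_to_coarse):
    # hop distance in a two-level label tree: 0 for a correct prediction,
    # 2 for a sibling leaf (same superclass), 4 otherwise
    if pred_fine == true_fine:
        return 0
    return 2 if fine_to_coarse[pred_fine] == fine_to_coarse[true_fine] else 4
```

Under such a distance, mistaking one dog breed for another is penalized less than mistaking a dog for a truck, which is the catastrophic-misprediction effect the abstract alludes to.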
Related papers
- Federated Graph Semantic and Structural Learning [54.97668931176513]
This paper reveals that local client distortion arises from both node-level semantics and graph-level structure.
We postulate that a well-structured graph neural network produces similar representations for neighboring nodes due to their inherent adjacency relationships.
We transform the adjacency relationships into the similarity distribution and leverage the global model to distill the relation knowledge into the local model.
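As a hedged sketch (one plausible reading of this summary, not the paper's implementation), turning adjacency into a similarity distribution and distilling it from the global into the local model might look like the following; the function name and per-node aggregation are assumptions:

```python
# Illustrative sketch: KL-distill the global model's neighbor-similarity
# distribution into the local model, one node at a time.
import torch
import torch.nn.functional as F

def relation_distill_loss(local_emb, global_emb, neighbors):
    # local_emb, global_emb: (N, d) node embeddings from the two models;
    # neighbors[i] is a LongTensor holding node i's neighbor indices
    loss, used = 0.0, 0
    for i, nbrs in enumerate(neighbors):
        if len(nbrs) < 2:
            continue
        # adjacency recast as a similarity distribution over i's neighbors
        p_global = F.softmax(global_emb[nbrs] @ global_emb[i], dim=0)
        log_q_local = F.log_softmax(local_emb[nbrs] @ local_emb[i], dim=0)
        # distill the global relation knowledge into the local model
        loss = loss + F.kl_div(log_q_local, p_global, reduction='sum')
        used += 1
    return loss / max(used, 1)
```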
arXiv Detail & Related papers (2024-06-27T07:08:28Z)
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- How Deep Neural Networks Learn Compositional Data: The Random Hierarchy Model [47.617093812158366]
We introduce the Random Hierarchy Model: a family of synthetic tasks inspired by the hierarchical structure of language and images.
We find that deep networks learn the task by developing internal representations invariant to exchanging equivalent groups.
Our results indicate how deep networks overcome the curse of dimensionality by building invariant representations.
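To make the idea concrete, here is a toy generator in the spirit of a random hierarchy task (an illustrative assumption, not the paper's exact construction): each symbol at a level expands into one of m interchangeable ("synonymic") tuples of s lower-level symbols, so the class label determines the input only through the hierarchy.

```python
# Toy hierarchical-task generator (illustrative, not the paper's model).
import random

def make_rules(vocab, m=2, s=2, levels=3, seed=0):
    rng = random.Random(seed)
    # m interchangeable expansions of each symbol into s lower-level symbols
    return {(level, sym): [tuple(rng.choice(vocab) for _ in range(s))
                           for _ in range(m)]
            for level in range(levels) for sym in vocab}

def sample_input(label, rules, levels, rng):
    seq = [label]
    for level in reversed(range(levels)):
        seq = [t for sym in seq for t in rng.choice(rules[(level, sym)])]
    return seq  # the visible leaves; `label` is the class to predict

rng = random.Random(1)
rules = make_rules(vocab=list(range(8)))
x = sample_input(label=3, rules=rules, levels=3, rng=rng)  # length 2**3 = 8
```

A network that becomes invariant to swapping the m equivalent expansions of a symbol has, in effect, learned the hierarchy.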
arXiv Detail & Related papers (2023-07-05T09:11:09Z)
- Isometric Representations in Neural Networks Improve Robustness [0.0]
We train neural networks to perform classification while simultaneously maintaining within-class metric structure.
We verify that isometric regularization improves the robustness to adversarial attacks on MNIST.
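A minimal sketch of such a regularizer (hedged: the paper's exact formulation may differ) penalizes the mismatch between input-space and representation-space pairwise distances within each class:

```python
# Illustrative isometry-style penalty; combine with cross-entropy for training.
import torch

def isometric_penalty(x, z, y):
    # x: flattened inputs (N, d_in); z: representations (N, d_z); y: labels (N,)
    penalty, groups = x.new_tensor(0.0), 0
    for c in y.unique():
        xc, zc = x[y == c], z[y == c]
        if len(xc) < 2:
            continue
        d_in = torch.cdist(xc, xc)  # within-class distances in input space
        d_z = torch.cdist(zc, zc)   # within-class distances in feature space
        penalty = penalty + ((d_z - d_in) ** 2).mean()
        groups += 1
    return penalty / max(groups, 1)

# total loss (weight 0.1 is an illustrative hyperparameter):
# loss = F.cross_entropy(logits, y) + 0.1 * isometric_penalty(x.flatten(1), z, y)
```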
arXiv Detail & Related papers (2022-11-02T16:18:18Z)
- Rank Diminishing in Deep Neural Networks [71.03777954670323]
The rank of a neural network measures the information flowing across its layers.
It is an instance of a key structural condition that applies across broad domains of machine learning.
For neural networks, however, the intrinsic mechanism that yields low-rank structures remains unclear.
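The phenomenon is easy to observe empirically; the following self-contained check (layer sizes and tolerance are arbitrary choices for the sketch) measures the numerical rank of the feature matrix after each layer of a random MLP:

```python
# Observe rank diminishing across layers of a randomly initialized MLP.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(*[nn.Sequential(nn.Linear(128, 128), nn.ReLU())
                      for _ in range(6)])
h = torch.randn(256, 128)  # a batch of 256 random inputs

for i, layer in enumerate(net):
    h = layer(h)
    s = torch.linalg.svdvals(h)              # singular values of the features
    rank = int((s > 1e-4 * s.max()).sum())   # numerical rank at tolerance 1e-4
    print(f"layer {i}: numerical rank of features = {rank}")
```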
arXiv Detail & Related papers (2022-06-13T12:03:32Z)
- Interpretable part-whole hierarchies and conceptual-semantic relationships in neural networks [4.153804257347222]
We present Agglomerator, a framework capable of providing a representation of part-whole hierarchies from visual cues.
We evaluate our method on common datasets, such as SmallNORB, MNIST, FashionMNIST, CIFAR-10, and CIFAR-100.
arXiv Detail & Related papers (2022-03-07T10:56:13Z)
- Discriminability-enforcing loss to improve representation learning [20.4701676109641]
We introduce a new loss term inspired by the Gini impurity to minimize the entropy of individual high-level features.
Although our Gini loss induces highly discriminative features, it does not ensure that the distribution of the high-level features matches the distribution of the classes.
Our empirical results show that integrating our novel loss terms into the training objective consistently outperforms the models trained with cross-entropy alone.
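One plausible reading of the loss (hedged: the aggregation below is an assumption for illustration) scores each high-level feature by the Gini impurity of its class-conditional activation profile, assuming non-negative (e.g., post-ReLU) features:

```python
# Illustrative Gini-impurity-style loss over high-level features.
import torch
import torch.nn.functional as F

def gini_loss(features, labels, n_classes):
    # features: (N, D) non-negative activations; labels: (N,) class indices
    onehot = F.one_hot(labels, n_classes).float()                # (N, C)
    # mean activation of each feature per class -> (D, C)
    per_class = features.t() @ onehot / onehot.sum(0).clamp(min=1)
    p = per_class / per_class.sum(1, keepdim=True).clamp(min=1e-8)
    # Gini impurity 1 - sum(p^2); zero when a feature fires for a single class
    return (1.0 - (p ** 2).sum(1)).mean()
```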
arXiv Detail & Related papers (2022-02-14T22:31:37Z)
- Computing Class Hierarchies from Classifiers [12.631679928202516]
We propose a novel algorithm for automatically acquiring a class hierarchy from a neural network.
Our algorithm produces surprisingly good hierarchies for some well-known deep neural network models.
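The summary does not spell out the algorithm; a common heuristic in this vein (plainly not the paper's method) is to agglomeratively cluster the rows of a trained classifier's final-layer weight matrix, one weight vector per class:

```python
# Heuristic class-hierarchy recovery via agglomerative clustering (not the
# paper's algorithm; the random matrix stands in for trained class weights).
import numpy as np
from scipy.cluster.hierarchy import linkage

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 64))  # stand-in for final-layer class weight vectors
Z = linkage(W, method='average', metric='cosine')
for a, b, dist, _ in Z:        # each row of Z merges two clusters into a tree
    print(f"merge clusters {int(a)} and {int(b)} at cosine distance {dist:.3f}")
```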
arXiv Detail & Related papers (2021-12-02T13:01:04Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization [59.5104563755095]
We introduce a simple but effective approach to improve the generalization capability of deep neural networks in the field of medical imaging classification.
Motivated by the observation that the domain variability of the medical images is to some extent compact, we propose to learn a representative feature space through variational encoding.
arXiv Detail & Related papers (2020-09-27T12:30:30Z)
- Learn Class Hierarchy using Convolutional Neural Networks [0.9569316316728905]
We propose a new architecture for hierarchical classification of images, introducing a stack of deep linear layers with cross-entropy loss functions and center loss combined.
We experimentally show that our hierarchical classifier presents advantages over traditional classification approaches, finding application in computer vision tasks.
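For reference, a minimal sketch of the center-loss component described here (assumed details: the embedding dimension and loss weight are illustrative):

```python
# Center loss: pull each embedding toward a learned per-class center;
# combine with cross-entropy on the classifier logits.
import torch
import torch.nn as nn

class CenterLoss(nn.Module):
    def __init__(self, n_classes, feat_dim):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_classes, feat_dim))

    def forward(self, z, y):
        # mean squared distance from each embedding to its class center
        return ((z - self.centers[y]) ** 2).sum(1).mean()

# combined objective (0.01 is an illustrative weight):
# loss = F.cross_entropy(logits, y) + 0.01 * center_loss(z, y)
```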
arXiv Detail & Related papers (2020-05-18T12:06:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.