A Law of Data Separation in Deep Learning
- URL: http://arxiv.org/abs/2210.17020v2
- Date: Fri, 11 Aug 2023 00:47:30 GMT
- Title: A Law of Data Separation in Deep Learning
- Authors: Hangfeng He and Weijie J. Su
- Abstract summary: We study the fundamental question of how deep neural networks process data in the intermediate layers.
Our finding is a simple and quantitative law that governs how deep neural networks separate data according to class membership.
- Score: 41.58856318262069
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: While deep learning has enabled significant advances in many areas of
science, its black-box nature hinders architecture design for future artificial
intelligence applications and interpretation for high-stakes decision making.
We address this issue by studying the fundamental question of how deep neural
networks process data in the intermediate layers. Our finding is a simple and
quantitative law that governs how deep neural networks separate data according
to class membership throughout all layers for classification. This law shows
that each layer improves data separation at a constant geometric rate, and its
emergence is observed across a collection of network architectures and datasets
during training. The law offers practical guidelines for designing
architectures, improving model robustness and out-of-sample performance, and
interpreting predictions.
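To make the stated law concrete, the following is a minimal Python sketch of how one might measure class separation layer by layer and estimate the per-layer geometric improvement rate. It assumes a scatter-matrix "separation fuzziness" D = trace(Sigma_w @ pinv(Sigma_b)), where Sigma_w and Sigma_b are the within-class and between-class covariance matrices of a layer's features; the function names and exact normalization are illustrative rather than taken verbatim from the paper. Under the law, log D decreases roughly linearly with depth, i.e., D(l) ~ rho^l * D(0) for some rho in (0, 1).

import numpy as np

def separation_fuzziness(features, labels):
    # Illustrative measure: D = trace(Sigma_w @ pinv(Sigma_b)).
    # Smaller D means the classes are better separated at this layer.
    # The exact measure and normalization used in the paper may differ.
    n, d = features.shape
    global_mean = features.mean(axis=0)
    sigma_w = np.zeros((d, d))
    sigma_b = np.zeros((d, d))
    for c in np.unique(labels):
        x_c = features[labels == c]
        mu_c = x_c.mean(axis=0)
        centered = x_c - mu_c
        sigma_w += centered.T @ centered / n            # within-class scatter
        diff = (mu_c - global_mean).reshape(-1, 1)
        sigma_b += (len(x_c) / n) * (diff @ diff.T)     # between-class scatter
    return np.trace(sigma_w @ np.linalg.pinv(sigma_b))

def separation_rate(per_layer_fuzziness):
    # Fit log D(l) = log D(0) + l * log(rho); the law predicts this fit
    # is nearly exact, and rho is the per-layer improvement rate.
    layers = np.arange(len(per_layer_fuzziness))
    slope, _ = np.polyfit(layers, np.log(per_layer_fuzziness), 1)
    return np.exp(slope)

# Hypothetical usage: extract features for a labelled batch at every layer,
# then compute the fuzziness profile and the fitted rate:
#   fuzziness = [separation_fuzziness(f, y) for f in layer_features]
#   rho = separation_rate(fuzziness)   # expected to lie in (0, 1)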
Related papers
- Data Augmentations in Deep Weight Spaces [89.45272760013928]
We introduce a novel augmentation scheme based on the Mixup method.
We evaluate the performance of these techniques on existing benchmarks as well as new benchmarks we generate.
arXiv Detail & Related papers (2023-11-15T10:43:13Z) - Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination [33.273226655730326]
We show that each layer of a deep linear network progressively compresses within-class features at a geometric rate and discriminates between-class features at a linear rate.
This is the first quantitative characterization of feature evolution in hierarchical representations of deep linear networks.
arXiv Detail & Related papers (2023-11-06T09:00:38Z) - Homological Convolutional Neural Networks [4.615338063719135]
We propose a novel deep learning architecture that exploits the data structural organization through topologically constrained network representations.
We test our model on 18 benchmark datasets against 5 classic machine learning and 3 deep learning models.
arXiv Detail & Related papers (2023-08-26T08:48:51Z) - Hidden Classification Layers: Enhancing linear separability between classes in neural networks layers [0.0]
We investigate the impact of a training approach on deep network performance.
We propose a neural network architecture which induces an error function involving the outputs of all the network layers.
arXiv Detail & Related papers (2023-06-09T10:52:49Z) - Deep neural networks architectures from the perspective of manifold learning [0.0]
This paper is a comprehensive comparison and description of neural network architectures in terms of geometry and topology.
We focus on the internal representation of neural networks and on the dynamics of changes in the topology and geometry of a data manifold on different layers.
arXiv Detail & Related papers (2023-06-06T04:57:39Z) - Neural Architecture Search for Dense Prediction Tasks in Computer Vision [74.9839082859151]
Deep learning has led to a rising demand for neural network architecture engineering.
Neural architecture search (NAS) aims to design neural network architectures automatically, in a data-driven manner rather than manually.
NAS has become applicable to a much wider range of problems in computer vision.
arXiv Detail & Related papers (2022-02-15T08:06:50Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z) - A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z) - Introducing Fuzzy Layers for Deep Learning [5.209583609264815]
We introduce a new layer to deep learning: the fuzzy layer.
Traditionally, a neural network architecture is composed of an input layer, some combination of hidden layers, and an output layer.
We propose the introduction of fuzzy layers into the deep learning architecture to exploit the powerful aggregation properties expressed through fuzzy methodologies.
arXiv Detail & Related papers (2020-02-21T19:33:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.