Complexity of Representations in Deep Learning
- URL: http://arxiv.org/abs/2209.00525v1
- Date: Thu, 1 Sep 2022 15:20:21 GMT
- Title: Complexity of Representations in Deep Learning
- Authors: Tin Kam Ho
- Abstract summary: We analyze the effectiveness of the learned representations in separating the classes from a data complexity perspective.
We show how the data complexity evolves through the network, how it changes during training, and how it is impacted by the network design and the availability of training samples.
- Score: 2.0219767626075438
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks use multiple layers of functions to map an object
represented by an input vector progressively to different representations, and
with sufficient training, eventually to a single score for each class that is
the output of the final decision function. Ideally, in this output space, the
objects of different classes achieve maximum separation. Motivated by the need
to better understand the inner workings of a deep neural network, we analyze the
effectiveness of the learned representations in separating the classes from a
data complexity perspective. Using a simple complexity measure, a popular
benchmarking task, and a well-known architecture design, we show how the data
complexity evolves through the network, how it changes during training, and how
it is impacted by the network design and the availability of training samples.
We discuss the implications of the observations and the potential for further
study.
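The abstract does not name the complexity measure it uses, so the following is only a minimal sketch under stated assumptions. It takes the leave-one-out 1-nearest-neighbor error rate as a stand-in for a simple class-separability measure and evaluates it on the representation after each layer of a hypothetical ReLU network. The function names `loo_1nn_error` and `layer_complexities`, and the bias-free layer form, are illustrative and not taken from the paper.

```python
import numpy as np


def loo_1nn_error(features, labels):
    """Leave-one-out 1-NN error rate (assumed stand-in complexity measure).

    Returns the fraction of points whose nearest *other* point carries a
    different class label. Lower values mean better-separated classes.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances via the expansion of ||a - b||^2.
    sq_norms = (features ** 2).sum(axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (features @ features.T)
    # A point must not be selected as its own nearest neighbor.
    np.fill_diagonal(d2, np.inf)
    nearest = d2.argmin(axis=1)
    return float((labels[nearest] != labels).mean())


def layer_complexities(x, weights, labels):
    """Measure the input and the output of each layer of a hypothetical MLP.

    `weights` is a list of weight matrices; each layer computes
    relu(h @ w) with no bias term (an assumption made for brevity).
    """
    h = np.asarray(x, dtype=float)
    scores = [loo_1nn_error(h, labels)]
    for w in weights:
        h = np.maximum(h @ w, 0.0)
        scores.append(loo_1nn_error(h, labels))
    return scores
```

Calling `layer_complexities(x, weights, labels)` at several training checkpoints would give one curve per checkpoint, which is one way to look at how separability changes both across layers and over training.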
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- When Representations Align: Universality in Representation Learning Dynamics [8.188549368578704]
We derive an effective theory of representation learning under the assumption that the encoding map from input to hidden representation and the decoding map from representation to output are arbitrary smooth functions.
We show through experiments that the effective theory describes aspects of representation learning dynamics across a range of deep networks with different activation functions and architectures.
arXiv Detail & Related papers (2024-02-14T12:48:17Z)
- On Characterizing the Evolution of Embedding Space of Neural Networks using Algebraic Topology [9.537910170141467]
We study how the topology of feature embedding space changes as it passes through the layers of a well-trained deep neural network (DNN) through Betti numbers.
We demonstrate that as depth increases, a topologically complicated dataset is transformed into a simple one, resulting in Betti numbers attaining their lowest possible value.
arXiv Detail & Related papers (2023-11-08T10:45:12Z)
- Visual Analytics of Multivariate Networks with Representation Learning and Composite Variable Construction [19.265502727154473]
This paper presents a visual analytics workflow for studying multivariate networks.
It consists of a neural-network-based learning phase to classify the data, a dimensionality reduction and optimization phase, and an interpreting phase conducted by the user.
A key part of our design is a composite variable construction step that remodels nonlinear features obtained by neural networks into linear features that are intuitive to interpret.
arXiv Detail & Related papers (2023-03-16T18:31:18Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- Network Comparison Study of Deep Activation Feature Discriminability with Novel Objects [0.5076419064097732]
State-of-the-art computer vision algorithms have incorporated Deep Neural Networks (DNN) in feature-extracting roles, creating Deep Convolutional Activation Features (DeCAF).
This study analyzes the general discriminability of novel object visual appearances encoded into the DeCAF space of six of the leading visual recognition DNN architectures.
arXiv Detail & Related papers (2022-02-08T07:40:53Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup allowing for a neural network to learn both its size and topology during the course of a gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, also known as counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.