Statistical signatures of abstraction in deep neural networks
- URL: http://arxiv.org/abs/2407.01656v2
- Date: Tue, 01 Oct 2024 12:39:15 GMT
- Title: Statistical signatures of abstraction in deep neural networks
- Authors: Carlo Orientale Caputo, Matteo Marsili
- Abstract summary: We study how abstract representations emerge in a Deep Belief Network (DBN) trained on benchmark datasets.
We show that the representation approaches a universal model determined by the principle of maximal relevance.
We also show that plasticity increases with depth, in a similar way as it does in the brain.
- Abstract: We study how abstract representations emerge in a Deep Belief Network (DBN) trained on benchmark datasets. Our analysis targets the principles of learning in the early stages of information processing, starting from the "primordial soup" of the under-sampling regime. As the data is processed by deeper and deeper layers, features are detected and removed, transferring more and more "context-invariant" information to deeper layers. We show that the representation approaches a universal model -- the Hierarchical Feature Model (HFM) -- determined by the principle of maximal relevance. Relevance quantifies the uncertainty on the model of the data, thus suggesting that "meaning" -- i.e. syntactic information -- is that part of the data which is not yet captured by a model. Our analysis shows that shallow layers are well described by pairwise Ising models, which provide a representation of the data in terms of generic, low order features. We also show that plasticity increases with depth, in a similar way as it does in the brain. These findings suggest that DBNs are capable of extracting a hierarchy of features from the data which is consistent with the principle of maximal relevance.
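The abstract refers to relevance without giving a formula. As a rough illustration only, the sketch below computes the sample-based resolution H[s] and relevance H[k] in the form commonly used in the maximal relevance literature: k_s counts how often state s appears in a sample of size N, and m_k counts how many distinct states appear exactly k times. The function name `resolution_and_relevance` and the choice of natural logarithms are assumptions made for this example; they are not taken from the paper, and the paper's own estimator may differ.

```python
from collections import Counter
from math import log


def resolution_and_relevance(sample):
    """Return (H[s], H[k]) in nats for a sequence of hashable states.

    Hypothetical helper for illustration; not the authors' code.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    # k_s: number of times each distinct state s was observed
    k_s = Counter(sample)
    # m_k: number of distinct states that were observed exactly k times
    m_k = Counter(k_s.values())
    # Resolution: entropy of the empirical distribution over states
    resolution = -sum((k / n) * log(k / n) for k in k_s.values())
    # Relevance: entropy of the distribution over observed frequencies
    relevance = -sum((k * m / n) * log(k * m / n) for k, m in m_k.items())
    return resolution, relevance


# Example: hidden-layer states encoded as tuples of binary units
states = [(0, 1), (0, 1), (1, 0), (1, 1)]
h_s, h_k = resolution_and_relevance(states)
print(f"H[s] = {h_s:.4f} nats, H[k] = {h_k:.4f} nats")
```

In the deeply under-sampled limit, where every state is observed exactly once, this relevance is zero because the frequencies carry no information that distinguishes one state from another.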
Related papers
- Robust Shape Fitting for 3D Scene Abstraction [33.84212609361491]
In particular, we can describe man-made environments using volumetric primitives such as cuboids or cylinders.
We propose a robust estimator for primitive fitting, which meaningfully abstracts complex real-world environments using cuboids.
Results on the NYU Depth v2 dataset demonstrate that the proposed algorithm successfully abstracts cluttered real-world 3D scene layouts.
arXiv Detail & Related papers (2024-03-15T16:37:43Z)
- Neural Causal Abstractions [63.21695740637627]
We develop a new family of causal abstractions by clustering variables and their domains.
We show that such abstractions are learnable in practical settings through Neural Causal Models.
Our experiments support the theory and illustrate how to scale causal inferences to high-dimensional settings involving image data.
arXiv Detail & Related papers (2024-01-05T02:00:27Z)
- Bayesian Interpolation with Deep Linear Networks [92.1721532941863]
Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory.
We show that linear networks make provably optimal predictions at infinite depth.
We also show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth.
arXiv Detail & Related papers (2022-12-29T20:57:46Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- Primitive-based Shape Abstraction via Nonparametric Bayesian Inference [29.7543198254021]
We propose a novel non-parametric Bayesian statistical method to infer an abstraction, consisting of an unknown number of geometric primitives, from a point cloud.
Our method outperforms the state-of-the-art in terms of accuracy and is generalizable to various types of objects.
arXiv Detail & Related papers (2022-03-28T13:00:06Z)
- Online Deep Learning based on Auto-Encoder [4.128388784932455]
We propose a two-phase Online Deep Learning method based on Auto-Encoder (ODLAE).
Using an auto-encoder trained with a reconstruction loss, we extract abstract hierarchical latent representations of instances.
We devise two fusion strategies: an output-level fusion strategy, which fuses the classification results of each hidden layer, and a feature-level fusion strategy, which leverages a self-attention mechanism to fuse the outputs of every hidden layer.
arXiv Detail & Related papers (2022-01-19T02:14:57Z)
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
- A Light-weight Interpretable Compositional Network for Nuclei Detection and Weakly-supervised Segmentation [10.196621315018884]
Deep neural networks usually require large amounts of annotated data to train their many parameters.
We propose to build a data-efficient model, which only requires partial annotation, specifically on isolated nuclei.
arXiv Detail & Related papers (2021-10-26T16:44:08Z)
- Reasoning-Modulated Representations [85.08205744191078]
We study a common setting where our task is not purely opaque.
Our approach paves the way for a new class of data-efficient representation learning.
arXiv Detail & Related papers (2021-07-19T13:57:13Z)
- Redundant representations help generalization in wide neural networks [71.38860635025907]
We study the last hidden layer representations of various state-of-the-art convolutional neural networks.
We find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information, and differ from each other only by statistically independent noise.
arXiv Detail & Related papers (2021-06-07T10:18:54Z)
- Extracting Semantic Indoor Maps from Occupancy Grids [2.4214518935746185]
We focus on the semantic mapping of indoor environments.
We propose a method to extract an abstracted floor plan from typical grid maps using Bayesian reasoning.
We demonstrate the effectiveness of the approach using real-world data.
arXiv Detail & Related papers (2020-02-19T18:52:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.