Towards understanding deep learning with the natural clustering prior
- URL: http://arxiv.org/abs/2203.08174v1
- Date: Tue, 15 Mar 2022 18:07:37 GMT
- Title: Towards understanding deep learning with the natural clustering prior
- Authors: Simon Carbonnelle
- Abstract summary: This thesis investigates the implicit integration of a natural clustering prior composed of three statements.
The decomposition of classes into multiple clusters implies that supervised deep learning systems could benefit from unsupervised clustering to define appropriate decision boundaries.
We do so through an extensive empirical study of the training dynamics as well as the neuron- and layer-level representations of deep neural networks.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The prior knowledge (a.k.a. priors) integrated into the design of a machine
learning system strongly influences its generalization abilities. In the
specific context of deep learning, some of these priors are poorly understood
as they implicitly emerge from the successful heuristics and tentative
approximations of biological brains involved in deep learning design. Through
the lens of supervised image classification problems, this thesis investigates
the implicit integration of a natural clustering prior composed of three
statements: (i) natural images exhibit a rich clustered structure, (ii) image
classes are composed of multiple clusters and (iii) each cluster contains
examples from a single class. The decomposition of classes into multiple
clusters implies that supervised deep learning systems could benefit from
unsupervised clustering to define appropriate decision boundaries. Hence, this
thesis attempts to identify implicit clustering abilities, mechanisms and
hyperparameters in deep learning systems and evaluate their relevance for
explaining the generalization abilities of these systems. We do so through an
extensive empirical study of the training dynamics as well as the neuron- and
layer-level representations of deep neural networks. The resulting collection
of experiments provides preliminary evidence for the relevance of the natural
clustering prior for understanding deep learning.
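The claim that class decomposition into clusters can improve decision boundaries admits a minimal illustration. The sketch below is not from the thesis; it uses hypothetical synthetic data in which each class is a mixture of two Gaussian clusters arranged in an XOR-like pattern, so that a single centroid per class is uninformative while one centroid per cluster (standing in for an unsupervised clustering step such as k-means) separates the classes well.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: each of 2 classes is a mixture of 2 well-separated
# Gaussian clusters (statement (ii) of the prior), in an XOR layout.
centers = {0: [(-4.0, -4.0), (4.0, 4.0)], 1: [(-4.0, 4.0), (4.0, -4.0)]}
X, y = [], []
for label, cluster_means in centers.items():
    for c in cluster_means:
        X.append(rng.normal(loc=c, scale=0.5, size=(100, 2)))
        y.append(np.full(100, label))
X, y = np.vstack(X), np.concatenate(y)

def nearest_centroid_acc(centroids, labels):
    # Classify each point by the class label of its nearest centroid.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    pred = labels[np.argmin(d, axis=1)]
    return float((pred == y).mean())

# One centroid per class: both class means fall near the origin, so the
# resulting decision boundary ignores the cluster structure entirely.
per_class = np.array([X[y == k].mean(axis=0) for k in (0, 1)])
acc_class = nearest_centroid_acc(per_class, np.array([0, 1]))

# One centroid per cluster (here the known cluster means, standing in
# for centroids recovered by an unsupervised clustering algorithm).
per_cluster = np.array([c for cs in centers.values() for c in cs])
cluster_labels = np.array([k for k, cs in centers.items() for _ in cs])
acc_cluster = nearest_centroid_acc(per_cluster, cluster_labels)

print("per-class accuracy:", acc_class)
print("per-cluster accuracy:", acc_cluster)
```

On this toy layout the per-class centroids yield near-chance accuracy while the per-cluster centroids classify almost perfectly, which is the intuition behind statement (ii): decision boundaries should respect the clusters inside each class, not just the class means.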
Related papers
- Hypernym Bias: Unraveling Deep Classifier Training Dynamics through the Lens of Class Hierarchy [44.99833362998488]
We argue that the learning process in classification problems can be understood through the lens of label clustering.
Specifically, we observe that networks tend to distinguish higher-level (hypernym) categories in the early stages of training.
We introduce a novel framework to track the evolution of the feature manifold during training, revealing how the hierarchy of class relations emerges.
arXiv Detail & Related papers (2025-02-17T18:47:01Z)
- Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition [83.13280812128411]
Recent studies have uncovered intriguing phenomena in deep learning, such as grokking, double descent, and emergent abilities in large language models.
We present a comprehensive framework that provides a unified view of these three phenomena, focusing on the competition between memorization and generalization circuits.
arXiv Detail & Related papers (2024-02-23T08:14:36Z)
- Understanding Distributed Representations of Concepts in Deep Neural Networks without Supervision [25.449397570387802]
We propose an unsupervised method for discovering distributed representations of concepts by selecting a principal subset of neurons.
Our empirical findings demonstrate that instances with similar neuron activation states tend to share coherent concepts.
The method can be used to identify unlabeled subclasses within data and to detect the causes of misclassifications.
arXiv Detail & Related papers (2023-12-28T07:33:51Z)
- An effective theory of collective deep learning [1.3812010983144802]
We introduce a minimal model that condenses several recent decentralized algorithms.
We derive an effective theory for linear networks to show that the coarse-grained behavior of our system is equivalent to a deformed Ginzburg-Landau model.
We validate the theory in coupled ensembles of realistic neural networks trained on the MNIST dataset.
arXiv Detail & Related papers (2023-10-19T14:58:20Z)
- Deep Clustering: A Comprehensive Survey [53.387957674512585]
Clustering analysis plays an indispensable role in machine learning and data mining.
Deep clustering, which can learn clustering-friendly representations using deep neural networks, has been broadly applied in a wide range of clustering tasks.
Existing surveys for deep clustering mainly focus on the single-view fields and the network architectures, ignoring the complex application scenarios of clustering.
arXiv Detail & Related papers (2022-10-09T02:31:32Z)
- A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions [48.97008907275482]
Clustering is a fundamental machine learning task which has been widely studied in the literature.
Deep Clustering, i.e., jointly optimizing the representation learning and clustering, has been proposed and hence attracted growing attention in the community.
We summarize the essential components of deep clustering and categorize existing methods by the ways they design interactions between deep representation learning and clustering.
arXiv Detail & Related papers (2022-06-15T15:05:13Z)
- DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep Neural Networks [53.88811980967342]
This paper presents a Deep Clustering via Ensembles (DeepCluE) approach.
It bridges the gap between deep clustering and ensemble clustering by harnessing the power of multiple layers in deep neural networks.
Experimental results on six image datasets confirm the advantages of DeepCluE over the state-of-the-art deep clustering approaches.
arXiv Detail & Related papers (2022-06-01T09:51:38Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Deep Clustering by Semantic Contrastive Learning [67.28140787010447]
We introduce a novel variant called Semantic Contrastive Learning (SCL).
It explores the characteristics of both conventional contrastive learning and deep clustering.
It can amplify the strengths of contrastive learning and deep clustering in a unified approach.
arXiv Detail & Related papers (2021-03-03T20:20:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.