Towards understanding deep learning with the natural clustering prior
- URL: http://arxiv.org/abs/2203.08174v1
- Date: Tue, 15 Mar 2022 18:07:37 GMT
- Title: Towards understanding deep learning with the natural clustering prior
- Authors: Simon Carbonnelle
- Abstract summary: This thesis investigates the implicit integration of a natural clustering prior composed of three statements.
The decomposition of classes into multiple clusters implies that supervised deep learning systems could benefit from unsupervised clustering to define appropriate decision boundaries.
We do so through an extensive empirical study of the training dynamics as well as the neuron- and layer-level representations of deep neural networks.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The prior knowledge (a.k.a. priors) integrated into the design of a machine
learning system strongly influences its generalization abilities. In the
specific context of deep learning, some of these priors are poorly understood
as they implicitly emerge from the successful heuristics and tentative
approximations of biological brains involved in deep learning design. Through
the lens of supervised image classification problems, this thesis investigates
the implicit integration of a natural clustering prior composed of three
statements: (i) natural images exhibit a rich clustered structure, (ii) image
classes are composed of multiple clusters and (iii) each cluster contains
examples from a single class. The decomposition of classes into multiple
clusters implies that supervised deep learning systems could benefit from
unsupervised clustering to define appropriate decision boundaries. Hence, this
thesis attempts to identify implicit clustering abilities, mechanisms and
hyperparameters in deep learning systems and evaluate their relevance for
explaining the generalization abilities of these systems. We do so through an
extensive empirical study of the training dynamics as well as the neuron- and
layer-level representations of deep neural networks. The resulting collection
of experiments provides preliminary evidence for the relevance of the natural
clustering prior for understanding deep learning.
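The claim that class decomposition into clusters can improve decision boundaries admits a minimal illustration. The sketch below is not from the thesis; it uses hypothetical synthetic data in which each class is a mixture of two Gaussian clusters arranged in an XOR-like pattern, so that a single centroid per class is uninformative while one centroid per cluster (standing in for an unsupervised clustering step such as k-means) separates the classes well.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: each of 2 classes is a mixture of 2 well-separated
# Gaussian clusters (statement (ii) of the prior), in an XOR layout.
centers = {0: [(-4.0, -4.0), (4.0, 4.0)], 1: [(-4.0, 4.0), (4.0, -4.0)]}
X, y = [], []
for label, cluster_means in centers.items():
    for c in cluster_means:
        X.append(rng.normal(loc=c, scale=0.5, size=(100, 2)))
        y.append(np.full(100, label))
X, y = np.vstack(X), np.concatenate(y)

def nearest_centroid_acc(centroids, labels):
    # Classify each point by the class label of its nearest centroid.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    pred = labels[np.argmin(d, axis=1)]
    return float((pred == y).mean())

# One centroid per class: both class means fall near the origin, so the
# resulting decision boundary ignores the cluster structure entirely.
per_class = np.array([X[y == k].mean(axis=0) for k in (0, 1)])
acc_class = nearest_centroid_acc(per_class, np.array([0, 1]))

# One centroid per cluster (here the known cluster means, standing in
# for centroids recovered by an unsupervised clustering algorithm).
per_cluster = np.array([c for cs in centers.values() for c in cs])
cluster_labels = np.array([k for k, cs in centers.items() for _ in cs])
acc_cluster = nearest_centroid_acc(per_cluster, cluster_labels)

print("per-class accuracy:", acc_class)
print("per-cluster accuracy:", acc_cluster)
```

On this toy layout the per-class centroids yield near-chance accuracy while the per-cluster centroids classify almost perfectly, which is the intuition behind statement (ii): decision boundaries should respect the clusters inside each class, not just the class means.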
Related papers
- Hypernym Bias: Unraveling Deep Classifier Training Dynamics through the Lens of Class Hierarchy [44.99833362998488]
We argue that the learning process in classification problems can be understood through the lens of label clustering.
Specifically, we observe that networks tend to distinguish higher-level (hypernym) categories in the early stages of training.
We introduce a novel framework to track the evolution of the feature manifold during training, revealing how the hierarchy of class relations emerges.
arXiv Detail & Related papers (2025-02-17T18:47:01Z)
- Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition [83.13280812128411]
Recent studies have uncovered intriguing phenomena in deep learning, such as grokking, double descent, and emergent abilities in large language models.
We present a comprehensive framework that provides a unified view of these three phenomena, focusing on the competition between memorization and generalization circuits.
arXiv Detail & Related papers (2024-02-23T08:14:36Z)
- Understanding Distributed Representations of Concepts in Deep Neural Networks without Supervision [25.449397570387802]
We propose an unsupervised method for discovering distributed representations of concepts by selecting a principal subset of neurons.
Our empirical findings demonstrate that instances with similar neuron activation states tend to share coherent concepts.
The method can be used to identify unlabeled subclasses within data and to detect the causes of misclassifications.
arXiv Detail & Related papers (2023-12-28T07:33:51Z)
- An effective theory of collective deep learning [1.3812010983144802]
We introduce a minimal model that condenses several recent decentralized algorithms.
We derive an effective theory for linear networks to show that the coarse-grained behavior of our system is equivalent to a deformed Ginzburg-Landau model.
We validate the theory in coupled ensembles of realistic neural networks trained on the MNIST dataset.
arXiv Detail & Related papers (2023-10-19T14:58:20Z)
- Deep Clustering: A Comprehensive Survey [53.387957674512585]
Clustering analysis plays an indispensable role in machine learning and data mining.
Deep clustering, which can learn clustering-friendly representations using deep neural networks, has been broadly applied in a wide range of clustering tasks.
Existing surveys for deep clustering mainly focus on the single-view fields and the network architectures, ignoring the complex application scenarios of clustering.
arXiv Detail & Related papers (2022-10-09T02:31:32Z)
- A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions [48.97008907275482]
Clustering is a fundamental machine learning task which has been widely studied in the literature.
Deep Clustering, i.e., jointly optimizing the representation learning and clustering, has been proposed and hence attracted growing attention in the community.
We summarize the essential components of deep clustering and categorize existing methods by the ways they design interactions between deep representation learning and clustering.
arXiv Detail & Related papers (2022-06-15T15:05:13Z)
- DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep Neural Networks [53.88811980967342]
This paper presents a Deep Clustering via Ensembles (DeepCluE) approach.
It bridges the gap between deep clustering and ensemble clustering by harnessing the power of multiple layers in deep neural networks.
Experimental results on six image datasets confirm the advantages of DeepCluE over the state-of-the-art deep clustering approaches.
arXiv Detail & Related papers (2022-06-01T09:51:38Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Deep Clustering by Semantic Contrastive Learning [67.28140787010447]
We introduce a novel variant called Semantic Contrastive Learning (SCL).
It explores the characteristics of both conventional contrastive learning and deep clustering.
It can amplify the strengths of contrastive learning and deep clustering in a unified approach.
arXiv Detail & Related papers (2021-03-03T20:20:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.