Deep Image Clustering with Contrastive Learning and Multi-scale Graph Convolutional Networks
- URL: http://arxiv.org/abs/2207.07173v3
- Date: Tue, 17 Oct 2023 14:52:13 GMT
- Title: Deep Image Clustering with Contrastive Learning and Multi-scale Graph Convolutional Networks
- Authors: Yuankun Xu, Dong Huang, Chang-Dong Wang, Jian-Huang Lai
- Abstract summary: This paper presents a new deep clustering approach termed image clustering with contrastive learning and multi-scale graph convolutional networks (IcicleGCN).
Experiments on multiple image datasets demonstrate the superior clustering performance of IcicleGCN over the state-of-the-art.
- Score: 58.868899595936476
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep clustering has shown its promising capability in joint representation
learning and clustering via deep neural networks. Despite the significant
progress, the existing deep clustering works mostly utilize some
distribution-based clustering loss, lacking the ability to unify representation
learning and multi-scale structure learning. To address this, this paper
presents a new deep clustering approach termed image clustering with
contrastive learning and multi-scale graph convolutional networks (IcicleGCN),
which bridges the gap between convolutional neural networks (CNNs) and graph
convolutional networks (GCNs), as well as the gap between contrastive learning and
multi-scale structure learning, for the deep clustering task. Our framework
consists of four main modules, namely, the CNN-based backbone, the Instance
Similarity Module (ISM), the Joint Cluster Structure Learning and Instance
Reconstruction Module (JC-SLIM), and the Multi-scale GCN Module (M-GCN).
Specifically, the backbone network with two weight-sharing views is utilized to
learn the representations for the two augmented samples (from each image). The
learned representations are then fed to ISM and JC-SLIM for joint
instance-level and cluster-level contrastive learning, respectively, during
which an auto-encoder in JC-SLIM is also pretrained to serve as a bridge to the
M-GCN module. Further, to enforce multi-scale neighborhood structure learning,
two streams of GCNs and the auto-encoder are simultaneously trained via (i) the
layer-wise interaction with representation fusion and (ii) the joint
self-adaptive learning. Experiments on multiple image datasets demonstrate the
superior clustering performance of IcicleGCN over the state-of-the-art. The
code is available at https://github.com/xuyuankun631/IcicleGCN.
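The abstract outlines a fairly standard dual-view contrastive skeleton around the CNN backbone. As a rough orientation only, here is a minimal PyTorch-style sketch of that skeleton: a weight-sharing backbone encodes two augmentations of each image, an instance-level projection head and a cluster-level softmax head supply the two contrastive objectives, and a small auto-encoder on top of the backbone features stands in for the JC-SLIM bridge towards the GCN module. The module names, sizes, and the NT-Xent-style loss below are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualViewClusterNet(nn.Module):
    """Hypothetical skeleton: shared backbone, instance head, cluster head, AE bridge."""
    def __init__(self, backbone: nn.Module, feat_dim: int = 512,
                 proj_dim: int = 128, num_clusters: int = 10):
        super().__init__()
        self.backbone = backbone                          # shared weights for both views
        self.instance_head = nn.Sequential(               # instance-level projection
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, proj_dim))
        self.cluster_head = nn.Sequential(                # cluster-level soft assignments
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, num_clusters), nn.Softmax(dim=1))
        self.encoder = nn.Linear(feat_dim, proj_dim)      # auto-encoder bridge towards the GCNs
        self.decoder = nn.Linear(proj_dim, feat_dim)

    def forward(self, x1, x2):
        h1, h2 = self.backbone(x1), self.backbone(x2)     # features of the two augmented views
        z1, z2 = self.instance_head(h1), self.instance_head(h2)
        p1, p2 = self.cluster_head(h1), self.cluster_head(h2)
        recon = self.decoder(self.encoder(h1))            # reconstruction of the backbone feature
        return (z1, z2), (p1, p2), (h1, recon)

def nt_xent(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """SimCLR-style contrastive loss between two aligned batches of embeddings."""
    z = F.normalize(torch.cat([a, b], dim=0), dim=1)
    sim = z @ z.t() / temperature
    n = a.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))            # exclude self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```

Under this reading, instance-level contrast would apply nt_xent to (z1, z2), cluster-level contrast to the transposed assignment matrices (p1.t(), p2.t()), and an MSE reconstruction term would keep the auto-encoder usable as the hand-off point to the two multi-scale GCN streams described above.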
Related papers
- Deep Dependency Networks for Multi-Label Classification [24.24496964886951]
We show that the performance of previous approaches that combine Markov Random Fields with neural networks can be modestly improved.
We propose a new modeling framework called deep dependency networks, which augments a dependency network.
Despite its simplicity, jointly learning this new architecture yields significant improvements in performance.
arXiv Detail & Related papers (2023-02-01T17:52:40Z)
- DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep Neural Networks [53.88811980967342]
This paper presents a Deep Clustering via Ensembles (DeepCluE) approach.
It bridges the gap between deep clustering and ensemble clustering by harnessing the power of multiple layers in deep neural networks.
Experimental results on six image datasets confirm the advantages of DeepCluE over the state-of-the-art deep clustering approaches.
arXiv Detail & Related papers (2022-06-01T09:51:38Z)
- Multi-level Second-order Few-shot Learning [111.0648869396828]
We propose a Multi-level Second-order (MlSo) few-shot learning network for supervised or unsupervised few-shot image classification and few-shot action recognition.
We leverage so-called power-normalized second-order base learner streams combined with features that express multiple levels of visual abstraction.
We demonstrate respectable results on standard datasets such as Omniglot, mini-ImageNet, tiered-ImageNet, Open MIC, fine-grained datasets such as CUB Birds, Stanford Dogs and Cars, and action recognition datasets such as HMDB51, UCF101, and mini-MIT.
arXiv Detail & Related papers (2022-01-15T19:49:00Z)
- Attention-driven Graph Clustering Network [49.040136530379094]
We propose a novel deep clustering method named Attention-driven Graph Clustering Network (AGCN).
AGCN exploits a heterogeneous-wise fusion module to dynamically fuse the node attribute feature and the topological graph feature.
AGCN can jointly perform feature learning and cluster assignment in an unsupervised fashion.
arXiv Detail & Related papers (2021-08-12T02:30:38Z)
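The fusion idea in the AGCN entry above (dynamically weighting the auto-encoder's node attribute feature against the GCN's topological feature) can be pictured with a tiny per-node attention gate. The sketch below is a hypothetical illustration of that idea; the names, shapes, and gating form are assumptions, not AGCN's actual code.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Hypothetical per-node gate over two feature sources (attribute vs. topology)."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(2 * dim, 2)  # one logit per feature source

    def forward(self, z_attr: torch.Tensor, z_topo: torch.Tensor) -> torch.Tensor:
        # z_attr, z_topo: (num_nodes, dim) features from the AE and GCN branches
        w = torch.softmax(self.score(torch.cat([z_attr, z_topo], dim=1)), dim=1)
        return w[:, :1] * z_attr + w[:, 1:] * z_topo  # dynamically weighted combination
```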
- Learning Hierarchical Graph Neural Networks for Image Clustering [81.5841862489509]
We propose a hierarchical graph neural network (GNN) model that learns how to cluster a set of images into an unknown number of identities.
Our hierarchical GNN uses a novel approach to merge connected components predicted at each level of the hierarchy to form a new graph at the next level.
arXiv Detail & Related papers (2021-07-03T01:28:42Z)
- Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
- Sparse Coding Driven Deep Decision Tree Ensembles for Nuclear Segmentation in Digital Pathology Images [15.236873250912062]
We propose an easily trained yet powerful representation learning approach with performance highly competitive to deep neural networks in a digital pathology image segmentation task.
The method, called sparse coding driven deep decision tree ensembles that we abbreviate as ScD2TE, provides a new perspective on representation learning.
arXiv Detail & Related papers (2020-08-13T02:59:31Z)
- Structural Deep Clustering Network [45.370272344031285]
We propose a Structural Deep Clustering Network (SDCN) to integrate the structural information into deep clustering.
Specifically, we design a delivery operator to transfer the representations learned by autoencoder to the corresponding GCN layer.
In this way, the multiple structures of data, from low-order to high-order, are naturally combined with the multiple representations learned by autoencoder.
arXiv Detail & Related papers (2020-02-05T04:33:40Z)
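The "delivery operator" mentioned in the SDCN entry above can be read as mixing the auto-encoder's layer-wise representation into the input of the matching GCN layer, so that propagation over the graph also sees the AE features at every depth. The sketch below is a hedged paraphrase of that idea under assumed shapes and a fixed blending coefficient; it is not the SDCN implementation.

```python
import torch

def delivery(h_gcn: torch.Tensor, h_ae: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    """Blend the current GCN hidden state with the same-depth auto-encoder representation."""
    return (1.0 - eps) * h_gcn + eps * h_ae

def gcn_layer(h: torch.Tensor, adj_norm: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    """One plain GCN propagation step over a normalized adjacency matrix."""
    return torch.relu(adj_norm @ h @ weight)

# Usage (hypothetical shapes): h_next = gcn_layer(delivery(h_gcn, h_ae), adj_norm, W_l)
```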
- CSNNs: Unsupervised, Backpropagation-free Convolutional Neural Networks for Representation Learning [0.0]
This work combines Convolutional Neural Networks (CNNs), clustering via Self-Organizing Maps (SOMs), and Hebbian Learning to propose the building blocks of Convolutional Self-Organizing Neural Networks (CSNNs).
Our approach replaces the learning of traditional convolutional layers in CNNs with the competitive learning procedure of SOMs, and it simultaneously learns local masks between those layers with separate Hebbian-like learning rules to overcome the problem of disentangling factors of variation when filters are learned through clustering.
arXiv Detail & Related papers (2020-01-28T14:57:39Z)