Clustering-Induced Generative Incomplete Image-Text Clustering (CIGIT-C)
- URL: http://arxiv.org/abs/2209.13763v1
- Date: Wed, 28 Sep 2022 01:19:52 GMT
- Title: Clustering-Induced Generative Incomplete Image-Text Clustering (CIGIT-C)
- Authors: Dongjin Guo, Xiaoming Su, Jiatai Wang, Limin Liu, Zhiyong Pei, Zhiwei Xu
- Abstract summary: We propose a Clustering-Induced Generative Incomplete Image-Text Clustering (CIGIT-C) network to address the challenges of incomplete image-text clustering.
We first use modality-specific encoders to map original features to more distinctive subspaces.
The latent connections within and between modalities are thoroughly explored.
- Score: 3.2062075983668343
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The target of image-text clustering (ITC) is to find correct clusters by
integrating complementary and consistent information of multi-modalities for
these heterogeneous samples. However, the majority of current studies analyse
ITC on the ideal premise that the samples in every modality are complete. This
presumption is not always valid in real-world situations. Missing
data degrades image-text feature learning and ultimately harms
generalization in ITC tasks. Although a series of
methods have been proposed to address this incomplete image-text clustering
(IITC) problem, the following issues remain: 1) Most existing methods
hardly consider the distinct gap between heterogeneous feature domains. 2) For
missing data, the representations generated by existing methods are rarely
guaranteed to suit clustering tasks. 3) Existing methods do not exploit the
latent connections both within and between modalities. In this paper, we propose a
Clustering-Induced Generative Incomplete Image-Text Clustering (CIGIT-C) network
to address the challenges above. More specifically, we first use
modality-specific encoders to map original features to more distinctive
subspaces. The latent connections within and between modalities are
thoroughly explored by using an adversarial generative network to produce one
modality conditioned on the other. Finally, we update the
corresponding modality-specific encoders using two KL divergence losses.
Experimental results on public image-text datasets demonstrate that the
proposed method is more effective than existing methods on the IITC task.
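The abstract does not state the exact form of the two KL divergence losses. The following is a minimal sketch only, assuming a DEC-style clustering objective (Student's t soft assignments and a sharpened target distribution) applied once per modality. The function names, the toy embeddings, and the shared cluster centers are hypothetical and are not taken from the paper.

```python
import math

def soft_assign(z, centers, alpha=1.0):
    """Student's t soft assignment of one embedding to every cluster center."""
    weights = []
    for c in centers:
        d2 = sum((a - b) ** 2 for a, b in zip(z, c))
        weights.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
    total = sum(weights)
    return [w / total for w in weights]

def target_distribution(q):
    """Sharpened targets: p_ij proportional to q_ij^2 / f_j, with f_j = sum_i q_ij."""
    k = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(k)]
    p = []
    for row in q:
        w = [row[j] ** 2 / freq[j] for j in range(k)]
        s = sum(w)
        p.append([x / s for x in w])
    return p

def kl_loss(p, q, eps=1e-12):
    """KL(P || Q) summed over all samples and clusters."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for p_row, q_row in zip(p, q)
        for pi, qi in zip(p_row, q_row)
    )

# Hypothetical 2-D embeddings, standing in for the outputs of the
# modality-specific encoders (assumption: both modalities share the centers).
image_z = [[0.1, 0.0], [0.2, 0.1], [2.0, 2.1], [1.9, 2.0]]
text_z = [[0.0, 0.2], [0.1, 0.1], [2.2, 1.9], [2.0, 2.0]]
centers = [[0.0, 0.0], [2.0, 2.0]]

q_img = [soft_assign(z, centers) for z in image_z]
q_txt = [soft_assign(z, centers) for z in text_z]

# One KL loss per modality (assumption); each would drive its own encoder update.
loss_img = kl_loss(target_distribution(q_img), q_img)
loss_txt = kl_loss(target_distribution(q_txt), q_txt)
print(loss_img, loss_txt)
```

In a training loop, each loss would be minimized with respect to the parameters of its own encoder, pulling embeddings toward confident cluster assignments. How the paper couples the two losses across modalities is not described in the abstract.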
Related papers
- CDIMC-net: Cognitive Deep Incomplete Multi-view Clustering Network [53.72046586512026]
We propose a novel incomplete multi-view clustering network, called Cognitive Deep Incomplete Multi-view Clustering Network (CDIMC-net).
It captures the high-level features and local structure of each view by incorporating view-specific deep encoders and a graph embedding strategy into one framework.
Based on human cognition, i.e., learning from easy to hard, it introduces a self-paced strategy to select the most confident samples for model training.
arXiv Detail & Related papers (2024-03-28T15:45:03Z) - Stable Cluster Discrimination for Deep Clustering [7.175082696240088]
Deep clustering can optimize representations of instances (i.e., representation learning) and explore the inherent data distribution.
The coupled objective implies a trivial solution that all instances collapse to the uniform features.
In this work, we first show that the prevalent discrimination task in supervised learning is unstable for one-stage clustering.
A novel stable cluster discrimination (SeCu) task is proposed and a new hardness-aware clustering criterion can be obtained accordingly.
arXiv Detail & Related papers (2023-11-24T06:43:26Z) - Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z) - Scalable Incomplete Multi-View Clustering with Structure Alignment [71.62781659121092]
In this paper, we propose a novel incomplete anchor graph learning framework.
We construct the view-specific anchor graph to capture the complementary information from different views.
The time and space complexity of the proposed SIMVC-SA is proven to be linearly correlated with the number of samples.
arXiv Detail & Related papers (2023-08-31T08:30:26Z) - Deep Multi-View Subspace Clustering with Anchor Graph [11.291831842959926]
We propose a novel deep multi-view subspace clustering method with anchor graph (DMCAG)
DMCAG learns the embedded features for each view independently, which are used to obtain the subspace representations.
Our method achieves superior clustering performance over other state-of-the-art methods.
arXiv Detail & Related papers (2023-05-11T16:17:43Z) - Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that does not require data augmentation, and that, differently from existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z) - Self-supervised Image Clustering from Multiple Incomplete Views via Constrastive Complementary Generation [5.314364096882052]
We propose Contrastive Incomplete Multi-View Image Clustering with Generative Adversarial Networks (CIMIC-GAN).
We incorporate autoencoding representation of complete and incomplete data into double contrastive learning to achieve learning consistency.
Experiments conducted on four extensively used datasets show that CIMIC-GAN outperforms state-of-the-art incomplete multi-view clustering methods.
arXiv Detail & Related papers (2022-09-24T05:08:34Z) - Adaptively-weighted Integral Space for Fast Multiview Clustering [54.177846260063966]
We propose an Adaptively-weighted Integral Space for Fast Multiview Clustering (AIMC) with nearly linear complexity.
Specifically, view generation models are designed to reconstruct the view observations from the latent integral space.
Experiments conducted on several real-world datasets confirm the superiority of the proposed AIMC method.
arXiv Detail & Related papers (2022-08-25T05:47:39Z) - Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z) - Unsupervised Visual Representation Learning by Online Constrained K-Means [44.38989920488318]
Cluster discrimination is an effective pretext task for unsupervised representation learning.
We propose a novel clustering-based pretext task with online Constrained K-means (CoKe).
Our online assignment method has a theoretical guarantee to approach the global optimum.
arXiv Detail & Related papers (2021-05-24T20:38:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.