A Framework for Deep Constrained Clustering
- URL: http://arxiv.org/abs/2101.02792v1
- Date: Thu, 7 Jan 2021 22:49:06 GMT
- Title: A Framework for Deep Constrained Clustering
- Authors: Hongjing Zhang, Tianyang Zhan, Sugato Basu, Ian Davidson
- Abstract summary: Constrained clustering formulations exist for popular algorithms such as k-means, mixture models, and spectral clustering but have several limitations.
Here we explore a deep learning framework for constrained clustering and in particular explore how it can extend the field of constrained clustering.
We show that our framework can handle not only standard together/apart constraints (without the well-documented negative effects reported earlier) generated from labeled side information, but also more complex constraints generated from new types of side information.
We also propose an efficient training paradigm that is generally applicable to these four types of constraints.
- Score: 19.07636653413663
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The area of constrained clustering has been extensively explored by
researchers and used by practitioners. Constrained clustering formulations
exist for popular algorithms such as k-means, mixture models, and spectral
clustering but have several limitations. A fundamental strength of deep
learning is its flexibility, and here we explore a deep learning framework for
constrained clustering and in particular explore how it can extend the field of
constrained clustering. We show that our framework can not only handle standard
together/apart constraints (without the well documented negative effects
reported earlier) generated from labeled side information but more complex
constraints generated from new types of side information such as continuous
values and high-level domain knowledge. Furthermore, we propose an efficient
training paradigm that is generally applicable to these four types of
constraints. We validate the effectiveness of our approach by empirical results
on both image and text datasets. We also study the robustness of our framework
when learning with noisy constraints and show how different components of our
framework contribute to the final performance. Our source code is available at
https://github.com/blueocean92/deep_constrained_clustering.
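The together/apart (must-link/cannot-link) constraints discussed in the abstract can be pictured as penalties on paired soft cluster assignments. The following is a minimal illustrative sketch, not the paper's exact loss; the function name and formulation are assumptions made here for clarity:

```python
import math

def pairwise_constraint_loss(q, must_link, cannot_link):
    """Toy penalty over soft cluster assignments.

    q: list of per-sample cluster probability vectors (each row sums to 1).
    must_link / cannot_link: lists of (i, j) index pairs.
    Illustrative only -- not the exact objective of the summarized paper.
    """
    def p_same(i, j):
        # Probability that samples i and j fall in the same cluster,
        # assuming independent draws from their soft assignments.
        return sum(a * b for a, b in zip(q[i], q[j]))

    loss = 0.0
    for i, j in must_link:
        loss -= math.log(p_same(i, j) + 1e-12)        # reward agreement
    for i, j in cannot_link:
        loss -= math.log(1.0 - p_same(i, j) + 1e-12)  # reward disagreement
    return loss
```

Minimizing such a term pushes must-link pairs toward matching assignments and cannot-link pairs apart, without requiring hard labels.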
Related papers
- Spectral Clustering in Convex and Constrained Settings [0.0]
We introduce a novel framework for seamlessly integrating pairwise constraints into semidefinite spectral clustering.
Our methodology systematically extends the capabilities of semidefinite spectral clustering to capture complex data structures.
arXiv Detail & Related papers (2024-04-03T18:50:14Z)
- Contrastive Continual Multi-view Clustering with Filtered Structural Fusion [57.193645780552565]
Multi-view clustering thrives in applications where views are collected in advance, but it overlooks scenarios where data views are collected sequentially, i.e., real-time data.
Some methods are proposed to handle it but are trapped in a stability-plasticity dilemma.
We propose Contrastive Continual Multi-view Clustering with Filtered Structural Fusion.
arXiv Detail & Related papers (2023-09-26T14:18:29Z)
- DivClust: Controlling Diversity in Deep Clustering [47.85350249697335]
DivClust produces consensus clustering solutions that consistently outperform single-clustering baselines.
Our method effectively controls diversity across frameworks and datasets with very small additional computational cost.
arXiv Detail & Related papers (2023-04-03T14:45:43Z)
- Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that does not require data augmentation, and that, differently from existing methods, regularizes the hard assignments.
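A quick way to see the collapsed solution described above is to measure the entropy of cluster usage: near-zero entropy means almost every input landed in one cluster. This is a hypothetical diagnostic sketched here, not the regularizer proposed in the paper:

```python
import math
from collections import Counter

def cluster_usage_entropy(labels):
    """Entropy (in nats) of the empirical cluster-usage distribution.

    A value near zero signals collapse: the encoder has mapped (almost)
    all inputs to a single cluster. Illustrative diagnostic only.
    """
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log(c / n) for c in counts.values())
```

A balanced assignment over k clusters attains the maximum entropy log(k), so monitoring this quantity during online training flags degenerate solutions early.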
arXiv Detail & Related papers (2023-03-29T08:23:26Z)
- Deep Clustering with a Constraint for Topological Invariance based on Symmetric InfoNCE [10.912082501425944]
Few existing state-of-the-art deep clustering methods perform well on both simple-topology and complex-topology datasets.
We propose a constraint utilizing symmetric InfoNCE, which helps the objective of a deep clustering method train the model in this scenario.
To confirm the effectiveness of the proposed constraint, we introduce a deep clustering method named MIST, which is a combination of an existing deep clustering method and our constraint.
arXiv Detail & Related papers (2023-03-06T11:05:21Z)
- Semi-Supervised Constrained Clustering: An In-Depth Overview, Ranked Taxonomy and Future Research Directions [2.5957372084704238]
The research area of constrained clustering has grown significantly over the years.
No unifying overview is available to easily understand the wide variety of available methods, constraints and benchmarks.
This study presents in-detail the background of constrained clustering and provides a novel ranked taxonomy of the types of constraints that can be used in constrained clustering.
arXiv Detail & Related papers (2023-02-28T17:46:31Z)
- Deep Clustering: A Comprehensive Survey [53.387957674512585]
Clustering analysis plays an indispensable role in machine learning and data mining.
Deep clustering, which can learn clustering-friendly representations using deep neural networks, has been broadly applied in a wide range of clustering tasks.
Existing surveys for deep clustering mainly focus on the single-view fields and the network architectures, ignoring the complex application scenarios of clustering.
arXiv Detail & Related papers (2022-10-09T02:31:32Z)
- Deep Conditional Gaussian Mixture Model for Constrained Clustering [7.070883800886882]
Constrained clustering can leverage prior information on a growing amount of only partially labeled data.
We propose a novel framework for constrained clustering that is intuitive, interpretable, and can be trained efficiently in the framework of stochastic gradient variational inference.
arXiv Detail & Related papers (2021-06-11T13:38:09Z)
- Fairness, Semi-Supervised Learning, and More: A General Framework for Clustering with Stochastic Pairwise Constraints [32.19047459493177]
We introduce a novel family of stochastic pairwise constraints, which we incorporate into several essential clustering objectives.
We show that these constraints can succinctly model an intriguing collection of applications, including Individual Fairness in clustering and Must-link constraints in semi-supervised learning.
arXiv Detail & Related papers (2021-03-02T20:27:58Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- An Integer Linear Programming Framework for Mining Constraints from Data [81.60135973848125]
We present a general framework for mining constraints from data.
In particular, we consider the inference in structured output prediction as an integer linear programming (ILP) problem.
We show that our approach can learn to solve 9x9 Sudoku puzzles and minimum spanning tree problems from examples without providing the underlying rules.
arXiv Detail & Related papers (2020-06-18T20:09:53Z)
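The constraint-mining idea in the last entry can be caricatured without a solver: keep exactly the candidate constraints that every observed example satisfies. The toy sketch below handles only pairwise not-equal constraints; the names are illustrative, and the paper's actual formulation is an ILP over structured outputs:

```python
from itertools import combinations

def mine_not_equal_constraints(examples):
    """Return (i, j) position pairs whose values differ in every example.

    A toy stand-in for constraint mining: a candidate constraint is kept
    only if no observed valid output violates it. Illustrative only.
    """
    n = len(examples[0])
    return [
        (i, j)
        for i, j in combinations(range(n), 2)
        if all(ex[i] != ex[j] for ex in examples)
    ]
```

In a Sudoku-like setting, the surviving not-equal pairs recover row, column, and box constraints once enough valid grids have been observed.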
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.