Memetic Differential Evolution Methods for Semi-Supervised Clustering
- URL: http://arxiv.org/abs/2403.04322v1
- Date: Thu, 7 Mar 2024 08:37:36 GMT
- Title: Memetic Differential Evolution Methods for Semi-Supervised Clustering
- Authors: Pierluigi Mansueto, Fabio Schoen
- Abstract summary: We deal with semi-supervised Minimum Sum-of-Squares Clustering (MSSC) problems where background knowledge is given in the form of instance-level constraints.
We propose a novel memetic strategy based on the Differential Evolution paradigm, directly extending a state-of-the-art framework recently proposed in the unsupervised clustering literature.
- Score: 1.0256438517258686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we deal with semi-supervised Minimum Sum-of-Squares Clustering
(MSSC) problems where background knowledge is given in the form of
instance-level constraints. In particular, we take into account "must-link" and
"cannot-link" constraints, each of which indicates if two dataset points should
be assigned to the same cluster or to different clusters. The presence of such
constraints makes the problem at least as hard as its unsupervised version: it
is no longer true that each point is assigned to its nearest cluster center,
which requires modifications to crucial operations, such as the assignment
step. In this scenario, we propose a novel memetic strategy based on the
Differential Evolution paradigm, directly extending a state-of-the-art
framework recently proposed in the unsupervised clustering literature. As far
as we know, our contribution represents the first attempt to define a memetic
methodology designed to generate a (hopefully) optimal feasible solution for
the semi-supervised MSSC problem. The proposal is compared with some
state-of-the-art algorithms from the literature on a set of well-known
datasets, highlighting its effectiveness and efficiency in finding good quality
clustering solutions.
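To illustrate why the assignment step changes under "must-link" and "cannot-link" constraints, here is a minimal Python sketch. It is not the authors' memetic method: it assumes a simple greedy, order-dependent assignment rule (in the spirit of COP-KMeans), and the function names `sum_of_squares` and `greedy_constrained_assignment` are hypothetical, introduced only for this example.

```python
import numpy as np


def sum_of_squares(points, centers, labels):
    """MSSC objective: total squared distance from each point to its assigned center."""
    diffs = points - centers[labels]
    return float(np.sum(diffs ** 2))


def greedy_constrained_assignment(points, centers, must_link, cannot_link):
    """Greedy sketch (an assumption, not the paper's procedure).

    Points are visited in index order; each one goes to the nearest center
    that violates no constraint with the points assigned so far.
    Returns None if some point has no feasible cluster.
    """
    n = len(points)
    labels = np.full(n, -1)
    # Squared Euclidean distances, shape (n_points, n_centers).
    dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    for i in range(n):
        for c in np.argsort(dists[i]):
            feasible = True
            for a, b in must_link:
                if i in (a, b):
                    j = b if a == i else a
                    # A must-link partner already placed elsewhere rules out c.
                    if labels[j] != -1 and labels[j] != c:
                        feasible = False
            for a, b in cannot_link:
                if i in (a, b):
                    j = b if a == i else a
                    # A cannot-link partner already placed in c rules out c.
                    if labels[j] == c:
                        feasible = False
            if feasible:
                labels[i] = c
                break
        else:
            return None  # no center is feasible for point i
    return labels
```

With points 0, 1, 4, 5 on a line and centers at 0.5 and 4.5, the unconstrained assignment puts the first two points in one cluster. Adding a cannot-link constraint between points 0 and 1 forces point 1 onto the farther center, so the objective grows even though the centers are unchanged. A greedy rule like this can also fail on instances that do have a feasible assignment, which is one reason the constrained assignment step is harder than its unsupervised counterpart.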
Related papers
- A column generation algorithm with dynamic constraint aggregation for minimum sum-of-squares clustering [0.30693357740321775]
The minimum sum-of-squares clustering problem (MSSC) refers to the problem of partitioning $n$ data points into $k$ clusters.
We propose an efficient algorithm for solving large-scale MSSC instances, which combines column generation (CG) with dynamic constraint aggregation (DCA).
arXiv Detail & Related papers (2024-10-08T16:51:28Z)
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID [56.573905143954015]
We propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters.
Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to align features jointly at a cluster-level.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-05-22T03:27:46Z)
- Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
- Gradient Based Clustering [72.15857783681658]
We propose a general approach for distance based clustering, using the gradient of the cost function that measures clustering quality.
The approach is an iterative two step procedure (alternating between cluster assignment and cluster center updates) and is applicable to a wide range of functions.
arXiv Detail & Related papers (2022-02-01T19:31:15Z)
- An Exact Algorithm for Semi-supervised Minimum Sum-of-Squares Clustering [0.5801044612920815]
We present a new branch-and-bound algorithm for semi-supervised MSSC.
Background knowledge is incorporated as pairwise must-link and cannot-link constraints.
For the first time, the proposed global optimization algorithm efficiently manages to solve real-world instances with up to 800 data points.
arXiv Detail & Related papers (2021-11-30T17:08:53Z)
- Clustering to the Fewest Clusters Under Intra-Cluster Dissimilarity Constraints [0.0]
Equiwide clustering relies neither on density nor on a predefined number of expected classes, but on a dissimilarity threshold.
We review and evaluate suitable clustering algorithms to identify trade-offs between the various practical solutions for this clustering problem.
arXiv Detail & Related papers (2021-09-28T12:02:18Z)
- You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data subjected to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
- Unsupervised Multi-view Clustering by Squeezing Hybrid Knowledge from Cross View and Each View [68.88732535086338]
This paper proposes a new multi-view clustering method, low-rank subspace multi-view clustering based on adaptive graph regularization.
Experimental results for five widely used multi-view benchmarks show that our proposed algorithm surpasses other state-of-the-art methods by a clear margin.
arXiv Detail & Related papers (2020-08-23T08:25:06Z)
- A Classification-Based Approach to Semi-Supervised Clustering with Pairwise Constraints [5.639904484784126]
We introduce a network framework for semi-supervised clustering with pairwise constraints.
In contrast to existing approaches, we decompose SSC into two simpler classification tasks/stages.
The proposed approach, S3C2, is motivated by the observation that binary classification is usually easier than multi-class clustering.
arXiv Detail & Related papers (2020-01-18T20:13:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.