Counterfactual Explanations for Clustering Models
- URL: http://arxiv.org/abs/2409.12632v1
- Date: Thu, 19 Sep 2024 10:05:58 GMT
- Title: Counterfactual Explanations for Clustering Models
- Authors: Aurora Spagnol, Kacper Sokol, Pietro Barbiero, Marc Langheinrich, Martin Gjoreski
- Abstract summary: Clustering algorithms rely on complex optimisation processes that may be difficult to comprehend.
We propose a new, model-agnostic technique for explaining clustering algorithms with counterfactual statements.
- Score: 11.40145394568897
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clustering algorithms rely on complex optimisation processes that may be difficult to comprehend, especially for individuals who lack technical expertise. While many explainable artificial intelligence techniques exist for supervised machine learning, unsupervised learning -- and clustering in particular -- has been largely neglected. To complicate matters further, the notion of a "true" cluster is inherently challenging to define. These facets of unsupervised learning and its explainability make it difficult to foster trust in such methods and curtail their adoption. To address these challenges, we propose a new, model-agnostic technique for explaining clustering algorithms with counterfactual statements. Our approach relies on a novel soft-scoring method that captures the spatial information utilised by clustering models. It builds upon a state-of-the-art Bayesian counterfactual generator for supervised learning to deliver high-quality explanations. We evaluate its performance on five datasets and two clustering algorithms, and demonstrate that introducing soft scores to guide counterfactual search significantly improves the results.
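The idea of a soft score guiding a counterfactual search can be illustrated with a small sketch. This is not the authors' method: the abstract does not give the scoring formula, so the softmax over negative centroid distances, the `temperature` parameter, and the function names `soft_scores` and `counterfactual` are all assumptions made for illustration. The straight-line search toward the target centroid is likewise a simple stand-in for the Bayesian counterfactual generator the paper builds on.

```python
import math


def soft_scores(point, centroids, temperature=1.0):
    """Assumed soft score: softmax over negative Euclidean distances to centroids."""
    dists = [math.dist(point, c) for c in centroids]
    d_min = min(dists)  # shift by the minimum distance for numerical stability
    weights = [math.exp(-(d - d_min) / temperature) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


def counterfactual(point, centroids, target, steps=100):
    """Walk from `point` toward the target centroid in equal steps and return
    the first candidate whose highest soft score belongs to `target`.

    A straight-line search is an illustrative stand-in, not the paper's
    Bayesian generator.
    """
    for i in range(steps + 1):
        t = i / steps
        candidate = [p + t * (c - p) for p, c in zip(point, centroids[target])]
        scores = soft_scores(candidate, centroids)
        best = max(range(len(scores)), key=scores.__getitem__)
        if best == target:
            return candidate, scores[target]
    return None, None


# Hypothetical two-cluster example: centroids at x=0 and x=4.
centroids = [[0.0, 0.0], [4.0, 0.0]]
x = [1.0, 0.0]  # closest to cluster 0
cf, score = counterfactual(x, centroids, target=1)
```

Here `cf` is the first point on the line from `x` to the second centroid that lands in cluster 1, and `score` reports how strongly it belongs there. A hard cluster label alone would only say whether the boundary was crossed; the soft score also indicates how close a candidate is to crossing it, which is the kind of spatial signal the abstract describes.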
Related papers
- GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Calibration Clustering (GCC) method to incorporate feature learning and augmentation into the clustering procedure.
First, we develop a discriminative feature alignment mechanism to discover the intrinsic relationship across real and generated samples.
Second, we design a self-supervised metric learning to generate more reliable cluster assignment.
arXiv Detail & Related papers (2024-04-14T01:51:11Z) - Unsupervised Temporal Action Localization via Self-paced Incremental
Learning [57.55765505856969]
We present a novel self-paced incremental learning model to enhance clustering and localization training simultaneously.
We design two (constant- and variable-speed) incremental instance learning strategies for easy-to-hard model training, thus ensuring the reliability of these video pseudolabels.
arXiv Detail & Related papers (2023-12-12T16:00:55Z) - Rethinking Clustering-Based Pseudo-Labeling for Unsupervised
Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z) - Seeking the Truth Beyond the Data. An Unsupervised Machine Learning
Approach [0.0]
Clustering is an unsupervised machine learning methodology where unlabeled elements/objects are grouped together.
This article provides an in-depth description of the most widely used clustering methodologies.
It emphasizes the comparison of these algorithms' clustering efficiency based on 3 datasets.
arXiv Detail & Related papers (2022-07-14T14:22:36Z) - A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and
Future Directions [48.97008907275482]
Clustering is a fundamental machine learning task which has been widely studied in the literature.
Deep Clustering, i.e., jointly optimizing the representation learning and clustering, has been proposed and hence attracted growing attention in the community.
We summarize the essential components of deep clustering and categorize existing methods by the ways they design interactions between deep representation learning and clustering.
arXiv Detail & Related papers (2022-06-15T15:05:13Z) - KnAC: an approach for enhancing cluster analysis with background
knowledge and explanations [0.20999222360659603]
We present Knowledge Augmented Clustering (KnAC), whose main goal is to confront expert-based labelling with automated clustering.
KnAC can serve as an augmentation of an arbitrary clustering algorithm, making the approach robust and model-agnostic.
arXiv Detail & Related papers (2021-12-16T10:13:47Z) - Learning the Precise Feature for Cluster Assignment [39.320210567860485]
We propose a framework which integrates representation learning and clustering into a single pipeline for the first time.
The proposed framework exploits the powerful ability of recently developed generative models for learning intrinsic features.
Experimental results show that the performance of the proposed method is superior, or at least comparable to, the state-of-the-art methods.
arXiv Detail & Related papers (2021-06-11T04:08:54Z) - Algorithm-Agnostic Explainability for Unsupervised Clustering [19.375627480270627]
We present two novel algorithm-agnostic explainability methods, global permutation percent change (G2PC) feature importance and local perturbation percent change (L2PC) feature importance.
We demonstrate the utility of the methods for explaining five popular clustering algorithms on low-dimensional, ground-truth synthetic datasets.
arXiv Detail & Related papers (2021-05-17T17:58:55Z) - Deep Clustering by Semantic Contrastive Learning [67.28140787010447]
We introduce a novel variant called Semantic Contrastive Learning (SCL).
It explores the characteristics of both conventional contrastive learning and deep clustering.
It can amplify the strengths of contrastive learning and deep clustering in a unified approach.
arXiv Detail & Related papers (2021-03-03T20:20:48Z) - Provably Efficient Exploration for Reinforcement Learning Using
Unsupervised Learning [96.78504087416654]
Motivated by the prevailing paradigm of using unsupervised learning for efficient exploration in reinforcement learning (RL) problems, we investigate when this paradigm is provably efficient.
We present a general algorithmic framework that is built upon two components: an unsupervised learning algorithm and a no-regret tabular RL algorithm.
arXiv Detail & Related papers (2020-03-15T19:23:59Z) - EXPLAIN-IT: Towards Explainable AI for Unsupervised Network Traffic
Analysis [7.447122949368314]
We introduce EXPLAIN-IT, a methodology which deals with unlabeled data, creates meaningful clusters, and suggests an explanation to the clustering results for the end-user.
We apply EXPLAIN-IT to the problem of YouTube video quality classification under encrypted traffic scenarios, showing promising results.
arXiv Detail & Related papers (2020-03-03T17:54:41Z)