Explaining dimensionality reduction results using Shapley values
- URL: http://arxiv.org/abs/2103.05678v1
- Date: Tue, 9 Mar 2021 19:28:10 GMT
- Title: Explaining dimensionality reduction results using Shapley values
- Authors: Wilson Estécio Marcílio Júnior and Danilo Medeiros Eler
- Abstract summary: Dimensionality reduction (DR) techniques have been consistently supporting high-dimensional data analysis in various applications.
Current literature approaches designed to interpret DR techniques do not explain the features' contributions well since they focus only on the low-dimensional representation or do not consider the relationship among features.
This paper presents ClusterShapley to address these problems, using Shapley values to generate explanations of dimensionality reduction techniques and interpret these algorithms using a cluster-oriented analysis.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dimensionality reduction (DR) techniques have been consistently supporting
high-dimensional data analysis in various applications. Besides the patterns
uncovered by these techniques, the interpretation of DR results based on each
feature's contribution to the low-dimensional representation supports new findings
through exploratory analysis. Current literature approaches designed to
interpret DR techniques do not explain the features' contributions well since
they focus only on the low-dimensional representation or do not consider the
relationship among features. This paper presents ClusterShapley to address
these problems, using Shapley values to generate explanations of dimensionality
reduction techniques and interpret these algorithms using a cluster-oriented
analysis. ClusterShapley explains the formation of clusters and the meaning of
their relationship, which is useful for exploratory data analysis in various
domains. We propose novel visualization techniques to guide the interpretation
of features' contributions on clustering formation and validate our methodology
through case studies of publicly available datasets. The results demonstrate
our approach's interpretability and analysis power to generate insights about
pathologies and patients in different conditions using DR results.
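The core idea of a cluster-oriented Shapley analysis, attributing a point's cluster assignment to its original features, can be illustrated with a self-contained sketch. The example below is not the authors' ClusterShapley implementation: it uses a toy soft cluster-assignment score as the value function and computes exact Shapley values by enumerating coalitions, which is feasible only for a handful of features.

```python
import numpy as np
from itertools import combinations
from math import factorial

# Toy data: two Gaussian clusters in 3 features, separated along feature 0.
rng = np.random.default_rng(0)
A = rng.normal(0.0, 0.3, size=(20, 3))
B = rng.normal(0.0, 0.3, size=(20, 3)) + np.array([3.0, 0.0, 0.0])
X = np.vstack([A, B])
centroids = np.array([A.mean(axis=0), B.mean(axis=0)])
baseline = X.mean(axis=0)  # features outside a coalition are set to the global mean

def v(x, S):
    """Value of coalition S: soft score that x belongs to cluster 1,
    with features outside S replaced by the baseline."""
    z = baseline.copy()
    z[list(S)] = x[list(S)]
    d = np.linalg.norm(centroids - z, axis=1)
    return float(np.exp(-d[1]) / np.exp(-d).sum())

def shapley(x):
    """Exact Shapley values of each feature for the cluster-1 score of x."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += w * (v(x, S + (i,)) - v(x, S))
    return phi

phi = shapley(X[25])  # a point from cluster B
# Feature 0 separates the clusters, so it dominates the explanation.
print(np.argmax(np.abs(phi)))  # -> 0
```

In practice one would use an approximate explainer (e.g. sampling-based KernelSHAP) rather than the exponential enumeration above, since real datasets have far more than three features.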
Related papers
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
- The Multiverse of Dynamic Mode Decomposition Algorithms [0.0]
Dynamic Mode Decomposition (DMD) is a popular data-driven analysis technique used to decompose complex, nonlinear systems into modes.
This review emphasizes the role of Koopman operators in transforming complex nonlinear dynamics into a linear framework.
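As a reminder of what exact DMD computes, here is a minimal numpy sketch (illustrative only, not taken from the reviewed paper): the best-fit linear operator between paired snapshot matrices is formed via the SVD, and its eigendecomposition yields the modes and their growth/oscillation rates.

```python
import numpy as np

# Toy linear system: snapshots of x_{k+1} = A x_k; DMD recovers A's eigenvalues.
rng = np.random.default_rng(1)
A_true = np.array([[0.9, -0.2], [0.2, 0.9]])  # damped rotation, eigenvalues 0.9 +/- 0.2i
x = rng.normal(size=2)
snaps = [x]
for _ in range(30):
    x = A_true @ x
    snaps.append(x)
S = np.array(snaps).T            # columns are snapshots
X1, X2 = S[:, :-1], S[:, 1:]     # paired snapshot matrices

# Exact DMD: least-squares best-fit operator X2 ~ A_dmd X1, via the SVD of X1.
U, s, Vh = np.linalg.svd(X1, full_matrices=False)
A_dmd = U.conj().T @ X2 @ Vh.conj().T @ np.diag(1.0 / s)
eigvals, modes = np.linalg.eig(A_dmd)

print(np.sort_complex(eigvals))  # matches the eigenvalues of A_true
```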
arXiv Detail & Related papers (2023-11-30T19:00:50Z)
- Enhancing Deep Learning Models through Tensorization: A Comprehensive Survey and Framework [0.0]
This paper explores the steps involved in working with multidimensional data sources, the various multiway analysis methods employed, and the benefits of these approaches.
A small example of Blind Source Separation (BSS) is presented comparing 2-dimensional algorithms and a multiway algorithm in Python.
Results indicate that multiway analysis is more expressive.
arXiv Detail & Related papers (2023-09-05T17:56:22Z)
- Towards a mathematical understanding of learning from few examples with nonlinear feature maps [68.8204255655161]
We consider the problem of data classification where the training set consists of just a few data points.
We reveal key relationships between the geometry of an AI model's feature space, the structure of the underlying data distributions, and the model's generalisation capabilities.
arXiv Detail & Related papers (2022-11-07T14:52:58Z)
- An Experimental Study of Dimension Reduction Methods on Machine Learning Algorithms with Applications to Psychometrics [77.34726150561087]
We show that dimension reduction can decrease, increase, or provide the same accuracy as no reduction of variables.
Our tentative results find that dimension reduction tends to lead to better performance when used for classification tasks.
arXiv Detail & Related papers (2022-10-19T22:07:13Z)
- Interactive Dimensionality Reduction for Comparative Analysis [28.52130400665133]
We introduce an interactive DR framework where we integrate our new DR method, called ULCA, with an interactive visual interface.
ULCA unifies two DR schemes, discriminant analysis and contrastive learning, to support various comparative analysis tasks.
We develop an optimization algorithm that enables analysts to interactively refine ULCA results.
arXiv Detail & Related papers (2021-06-29T15:05:36Z)
- HUMAP: Hierarchical Uniform Manifold Approximation and Projection [42.50219822975012]
This work presents HUMAP, a novel hierarchical dimensionality reduction technique designed to be flexible on preserving local and global structures.
We provide empirical evidence of our technique's superiority compared with current hierarchical approaches and show a case study applying HUMAP for dataset labelling.
arXiv Detail & Related papers (2021-06-14T19:27:54Z)
- Transforming Feature Space to Interpret Machine Learning Models [91.62936410696409]
This contribution proposes a novel approach that interprets machine-learning models through the lens of feature space transformations.
It can be used to enhance unconditional as well as conditional post-hoc diagnostic tools.
A case study on remote-sensing landcover classification with 46 features is used to demonstrate the potential of the proposed approach.
arXiv Detail & Related papers (2021-04-09T10:48:11Z)
- Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z)
- Contrastive analysis for scatter plot-based representations of dimensionality reduction [0.0]
This paper introduces a methodology to explore multidimensional datasets and interpret clusters' formation.
We also introduce a bipartite graph to visually interpret and explore the relationships among statistical variables, helping analysts understand how the attributes influenced cluster formation.
arXiv Detail & Related papers (2021-01-26T01:16:31Z)
- A Diagnostic Study of Explainability Techniques for Text Classification [52.879658637466605]
We develop a list of diagnostic properties for evaluating existing explainability techniques.
We compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones.
arXiv Detail & Related papers (2020-09-25T12:01:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.