Interactive Dimensionality Reduction for Comparative Analysis
- URL: http://arxiv.org/abs/2106.15481v1
- Date: Tue, 29 Jun 2021 15:05:36 GMT
- Title: Interactive Dimensionality Reduction for Comparative Analysis
- Authors: Takanori Fujiwara, Xinhai Wei, Jian Zhao, Kwan-Liu Ma
- Abstract summary: We introduce an interactive DR framework where we integrate our new DR method, called ULCA, with an interactive visual interface.
ULCA unifies two DR schemes, discriminant analysis and contrastive learning, to support various comparative analysis tasks.
We develop an optimization algorithm that enables analysts to interactively refine ULCA results.
- Score: 28.52130400665133
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Finding the similarities and differences between two or more groups of
datasets is a fundamental analysis task. For high-dimensional data,
dimensionality reduction (DR) methods are often used to find the
characteristics of each group. However, existing DR methods provide limited
capability and flexibility for such comparative analysis as each method is
designed only for a narrow analysis target, such as identifying factors that
most differentiate groups. In this work, we introduce an interactive DR
framework where we integrate our new DR method, called ULCA (unified linear
comparative analysis), with an interactive visual interface. ULCA unifies two
DR schemes, discriminant analysis and contrastive learning, to support various
comparative analysis tasks. To provide flexibility for comparative analysis, we
develop an optimization algorithm that enables analysts to interactively refine
ULCA results. Additionally, we provide an interactive visualization interface
to examine ULCA results with a rich set of analysis libraries. We evaluate ULCA
and the optimization algorithm to show their efficiency as well as present
multiple case studies using real-world datasets to demonstrate the usefulness
of our framework.
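The idea of combining discriminant analysis with contrastive learning in one linear projection can be illustrated with a small sketch. The code below is not the authors' ULCA formulation or optimization algorithm; it is a simplified stand-in that assumes a ratio objective solved as a generalized eigenproblem. The function names (`group_covariance`, `between_scatter`, `comparative_axes`) and the weight parameters (`w_target`, `w_background`, `w_between`, `ridge`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def group_covariance(X):
    """Covariance of one group's rows, centered on the group's own mean."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / max(len(X) - 1, 1)


def between_scatter(groups):
    """Scatter of the group means around the overall mean, weighted by group size."""
    all_rows = np.vstack(groups)
    overall_mean = all_rows.mean(axis=0)
    dim = all_rows.shape[1]
    S = np.zeros((dim, dim))
    for G in groups:
        d = (G.mean(axis=0) - overall_mean)[:, None]
        S += len(G) * (d @ d.T)
    return S / len(all_rows)


def comparative_axes(groups, w_target, w_background, w_between,
                     n_components=2, ridge=1e-6):
    """Return a (dim, n_components) projection matrix.

    Assumed objective (illustrative only): favor directions where the
    weighted "target" variance plus between-group separation is large
    relative to the weighted "background" variance.
    """
    dim = groups[0].shape[1]
    covs = [group_covariance(G) for G in groups]
    # Numerator: what the projection should preserve or emphasize.
    A = sum(w * C for w, C in zip(w_target, covs))
    A = A + w_between * between_scatter(groups)
    # Denominator: what the projection should suppress (ridge keeps it invertible).
    B = sum(w * C for w, C in zip(w_background, covs)) + ridge * np.eye(dim)
    # Whiten B so the generalized problem becomes an ordinary symmetric one.
    evals_b, evecs_b = np.linalg.eigh(B)
    W = evecs_b / np.sqrt(evals_b)  # W.T @ B @ W is the identity
    evals, evecs = np.linalg.eigh(W.T @ A @ W)
    order = np.argsort(evals)[::-1][:n_components]
    return W @ evecs[:, order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=(200, 3))
    b = rng.normal(size=(200, 3)) + np.array([4.0, 0.0, 0.0])
    # Discriminant-style weights: separate groups, shrink within-group spread.
    M = comparative_axes([a, b], [0, 0], [1, 1], w_between=1.0)
    print((a @ M).shape)
```

In this sketch, setting `w_target=[0, 0]`, `w_background=[1, 1]`, and `w_between=1` behaves like a discriminant analysis, while `w_target=[1, 0]`, `w_background=[0, 1]`, and `w_between=0` behaves like a contrastive analysis of the first group against the second. Changing the weights is a rough analogue of the interactive refinement the abstract describes.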
Related papers
- Binary Code Similarity Detection via Graph Contrastive Learning on Intermediate Representations [52.34030226129628]
Binary Code Similarity Detection (BCSD) plays a crucial role in numerous fields, including vulnerability detection, malware analysis, and code reuse identification.
In this paper, we propose IRBinDiff, which mitigates compilation differences by leveraging LLVM-IR with higher-level semantic abstraction.
Our extensive experiments, conducted under varied compilation settings, demonstrate that IRBinDiff outperforms other leading BCSD methods in both One-to-one comparison and One-to-many search scenarios.
arXiv Detail & Related papers (2024-10-24T09:09:20Z) - RepMatch: Quantifying Cross-Instance Similarities in Representation Space [15.215985417763472]
We introduce RepMatch, a novel method that characterizes data through the lens of similarity.
RepMatch quantifies the similarity between subsets of training instances by comparing the knowledge encoded in models trained on them.
We validate the effectiveness of RepMatch across multiple NLP tasks, datasets, and models.
arXiv Detail & Related papers (2024-10-12T20:42:28Z) - reAnalyst: Scalable Analysis of Reverse Engineering Activities [3.0083213208912865]
reAnalyst is a scalable analysis framework designed to facilitate the study of reverse engineering (RE) practices.
By integrating tool-agnostic data collection of screenshots, keystrokes, and active processes, reAnalyst aims to overcome the limitations of traditional RE studies.
arXiv Detail & Related papers (2024-06-06T18:14:14Z) - A structured regression approach for evaluating model performance across intersectional subgroups [53.91682617836498]
Disaggregated evaluation is a central task in AI fairness assessment, where the goal is to measure an AI system's performance across different subgroups.
We introduce a structured regression approach to disaggregated evaluation that we demonstrate can yield reliable system performance estimates even for very small subgroups.
arXiv Detail & Related papers (2024-01-26T14:21:45Z) - Enhancing Deep Learning Models through Tensorization: A Comprehensive
Survey and Framework [0.0]
This paper explores the steps involved in multidimensional data sources, various multiway analysis methods employed, and the benefits of these approaches.
A small example of Blind Source Separation (BSS) is presented comparing 2-dimensional algorithms and a multiway algorithm in Python.
Results indicate that multiway analysis is more expressive.
arXiv Detail & Related papers (2023-09-05T17:56:22Z) - A Comparative Visual Analytics Framework for Evaluating Evolutionary
Processes in Multi-objective Optimization [7.906582204901926]
We present a visual analytics framework that enables the exploration and comparison of evolutionary processes in EMO algorithms.
We demonstrate the effectiveness of our framework through case studies on benchmarking and real-world multi-objective optimization problems.
arXiv Detail & Related papers (2023-08-10T15:32:46Z) - Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z) - Multi-task Learning of Order-Consistent Causal Graphs [59.9575145128345]
We consider the problem of discovering $K$ related Gaussian directed acyclic graphs (DAGs).
Under the multi-task learning setting, we propose an $l_1/l_2$-regularized maximum likelihood estimator (MLE) for learning $K$ linear structural equation models.
We theoretically show that the joint estimator, by leveraging data across related tasks, can achieve a better sample complexity for recovering the causal order.
arXiv Detail & Related papers (2021-11-03T22:10:18Z) - Explaining dimensionality reduction results using Shapley values [0.0]
Dimensionality reduction (DR) techniques have been consistently supporting high-dimensional data analysis in various applications.
Current literature approaches designed to interpret DR techniques do not explain the features' contributions well since they focus only on the low-dimensional representation or do not consider the relationship among features.
This paper presents ClusterShapley to address these problems, using Shapley values to generate explanations of dimensionality reduction techniques and interpret these algorithms using a cluster-oriented analysis.
arXiv Detail & Related papers (2021-03-09T19:28:10Z) - Shared Space Transfer Learning for analyzing multi-site fMRI data [83.41324371491774]
Multi-voxel pattern analysis (MVPA) learns predictive models from task-based functional magnetic resonance imaging (fMRI) data.
MVPA works best with a well-designed feature set and an adequate sample size.
Most fMRI datasets are noisy, high-dimensional, expensive to collect, and small in sample size.
This paper proposes Shared Space Transfer Learning (SSTL), a novel transfer learning approach.
arXiv Detail & Related papers (2020-10-24T08:50:26Z) - Deep Representational Similarity Learning for analyzing neural signatures in task-based fMRI dataset [81.02949933048332]
This paper develops Deep Representational Similarity Learning (DRSL), a deep extension of Representational Similarity Analysis (RSA).
DRSL is appropriate for analyzing similarities between various cognitive tasks in fMRI datasets with a large number of subjects.
arXiv Detail & Related papers (2020-09-28T18:30:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.