Supervised Visualization for Data Exploration
- URL: http://arxiv.org/abs/2006.08701v1
- Date: Mon, 15 Jun 2020 19:10:17 GMT
- Title: Supervised Visualization for Data Exploration
- Authors: Jake S. Rhodes, Adele Cutler, Guy Wolf, Kevin R. Moon
- Abstract summary: We describe a novel supervised visualization technique based on random forest proximities and diffusion-based dimensionality reduction.
Our approach is robust to noise and parameter tuning, thus making it simple to use while producing reliable visualizations for data exploration.
- Score: 9.742277703732187
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dimensionality reduction is often used as an initial step in data
exploration, either as preprocessing for classification or regression or for
visualization. Most dimensionality reduction techniques to date are
unsupervised; they do not take class labels into account (e.g., PCA, MDS,
t-SNE, Isomap). Such methods require large amounts of data and are often
sensitive to noise that may obfuscate important patterns in the data. Various
attempts at supervised dimensionality reduction methods that take into account
auxiliary annotations (e.g., class labels) have been successfully implemented
with goals of increased classification accuracy or improved data visualization.
Many of these supervised techniques incorporate labels in the loss function in
the form of similarity or dissimilarity matrices, thereby creating
over-emphasized separation between class clusters, which does not realistically
represent the local and global relationships in the data. In addition, these
approaches are often sensitive to parameter tuning, which may be difficult to
configure without an explicit quantitative notion of visual superiority. In
this paper, we describe a novel supervised visualization technique based on
random forest proximities and diffusion-based dimensionality reduction. We
show, both qualitatively and quantitatively, the advantages of our approach in
retaining local and global structures in data, while emphasizing important
variables in the low-dimensional embedding. Importantly, our approach is robust
to noise and parameter tuning, thus making it simple to use while producing
reliable visualizations for data exploration.
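The pipeline the abstract describes — supervised proximities from a random forest, followed by a diffusion-based embedding — can be sketched roughly as below. This is an illustrative reconstruction, not the authors' implementation: the leaf-co-occurrence proximity and the plain diffusion map are standard stand-ins for the paper's proximity and diffusion constructions, and the dataset choice is arbitrary.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# 1. Random-forest proximities: the fraction of trees in which two
#    samples land in the same leaf. The forest is trained on the labels,
#    which is what makes the resulting visualization supervised.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
leaves = rf.apply(X)  # (n_samples, n_trees) leaf index per tree
prox = np.mean(leaves[:, None, :] == leaves[None, :, :], axis=2)

# 2. Diffusion map on the proximity matrix: row-normalize it into a
#    Markov transition matrix, then embed with the leading non-trivial
#    eigenvectors scaled by their eigenvalues.
P = prox / prox.sum(axis=1, keepdims=True)
vals, vecs = np.linalg.eig(P)
order = np.argsort(-vals.real)
t = 1  # diffusion time step
embedding = vecs.real[:, order[1:3]] * (vals.real[order[1:3]] ** t)

print(embedding.shape)  # → (150, 2), 2-D coordinates for plotting
```

Because the proximities come from a trained forest, samples are close in the embedding when the trees treat them as similar for the prediction task, rather than when they are merely close in raw feature space.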
Related papers
- Enhancing Fine-Grained Visual Recognition in the Low-Data Regime Through Feature Magnitude Regularization [23.78498670529746]
We introduce a regularization technique to ensure that the magnitudes of the extracted features are evenly distributed.
Despite its apparent simplicity, our approach has demonstrated significant performance improvements across various fine-grained visual recognition datasets.
arXiv Detail & Related papers (2024-09-03T07:32:46Z)
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
- ShaRP: Shape-Regularized Multidimensional Projections [71.30697308446064]
We present a novel projection technique - ShaRP - that provides users explicit control over the visual signature of the created scatterplot.
ShaRP scales well with dimensionality and dataset size, and generically handles any quantitative dataset.
arXiv Detail & Related papers (2023-06-01T11:16:58Z)
- Domain Adaptive Multiple Instance Learning for Instance-level Prediction of Pathological Images [45.132775668689604]
We propose a new task setting to improve the classification performance of the target dataset without increasing annotation costs.
In order to combine the supervisory information of both methods effectively, we propose a method to create pseudo-labels with high confidence.
arXiv Detail & Related papers (2023-04-07T08:31:06Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Weakly Supervised Change Detection Using Guided Anisotropic Diffusion [97.43170678509478]
We propose original ideas that help us to leverage such datasets in the context of change detection.
First, we propose the guided anisotropic diffusion (GAD) algorithm, which improves semantic segmentation results.
We then show its potential in two weakly-supervised learning strategies tailored for change detection.
arXiv Detail & Related papers (2021-12-31T10:03:47Z)
- Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning [80.20302993614594]
We provide a statistical analysis to overcome drawbacks of Laplacian regularization.
We unveil a large body of spectral filtering methods that exhibit desirable behaviors.
We provide realistic computational guidelines in order to make our method usable with large amounts of data.
arXiv Detail & Related papers (2020-09-09T14:28:54Z)
- Semi-Supervised Learning with Meta-Gradient [123.26748223837802]
We propose a simple yet effective meta-learning algorithm in semi-supervised learning.
We find that the proposed algorithm performs favorably against state-of-the-art methods.
arXiv Detail & Related papers (2020-07-08T08:48:56Z)
- Supervised Discriminative Sparse PCA with Adaptive Neighbors for Dimensionality Reduction [47.1456603605763]
We propose a novel linear dimensionality reduction approach, supervised discriminative sparse PCA with adaptive neighbors (SDSPCAAN).
As a result, both global and local data structures, as well as the label information, are used for better dimensionality reduction.
arXiv Detail & Related papers (2020-01-09T17:02:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.