Joint Characterization of Multiscale Information in High Dimensional
Data
- URL: http://arxiv.org/abs/2102.09669v1
- Date: Thu, 18 Feb 2021 23:33:00 GMT
- Title: Joint Characterization of Multiscale Information in High Dimensional
Data
- Authors: Daniel Sousa, Christopher Small
- Abstract summary: We propose a multiscale joint characterization approach designed to exploit synergies between global and local approaches to dimensionality reduction.
We show that joint characterization is capable of detecting and isolating signals which are not evident from either PCA or t-SNE alone.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: High dimensional data can contain multiple scales of variance. Analysis tools
that preferentially operate at one scale can be ineffective at capturing all
the information present in this cross-scale complexity. We propose a multiscale
joint characterization approach designed to exploit synergies between global
and local approaches to dimensionality reduction. We illustrate this approach
using Principal Components Analysis (PCA) to characterize global variance
structure and t-distributed stochastic neighbor embedding (t-SNE) to characterize local
variance structure. Using both synthetic images and real-world imaging
spectroscopy data, we show that joint characterization is capable of detecting
and isolating signals which are not evident from either PCA or t-SNE alone.
Broadly, t-SNE is effective at rendering a randomly oriented low-dimensional
map of local clusters, and PCA renders this map interpretable by providing
global, physically meaningful structure. This approach is illustrated using
imaging spectroscopy data, and may prove particularly useful for other
geospatial data given robust local variance structure due to spatial
autocorrelation and physical interpretability of global variance structure due
to spectral properties of Earth surface materials. However, the fundamental
premise could easily be extended to other high dimensional datasets, including
image time series and non-image data.
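The pairing of a global and a local method described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn's `PCA` and `TSNE`, and the function name `joint_characterization`, the synthetic data, and all parameter values (number of components, perplexity, noise level) are hypothetical choices made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def joint_characterization(X, n_pcs=3, perplexity=30.0, seed=0):
    """Hypothetical helper: compute a global and a local view of X.

    X is an (n_samples, n_features) array, e.g. pixels by spectral bands.
    Returns (pc_scores, embedding): PCA scores for global variance
    structure and a 2-D t-SNE embedding for local neighborhood structure.
    """
    X = np.asarray(X, dtype=float)
    # Global view: projections onto the leading principal components.
    pc_scores = PCA(n_components=n_pcs).fit_transform(X)
    # Local view: 2-D t-SNE embedding of the same samples.
    embedding = TSNE(
        n_components=2, perplexity=perplexity, init="pca", random_state=seed
    ).fit_transform(X)
    return pc_scores, embedding


# Synthetic stand-in for spectra (an assumption, not data from the paper):
# 150 samples in 20 "bands", drawn around three cluster centers.
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 20))
labels = np.repeat(np.arange(3), 50)
X = centers[labels] + 0.1 * rng.normal(size=(150, 20))

pc_scores, embedding = joint_characterization(X)

# One row per sample: t-SNE coordinates next to PC scores, so clusters
# found in the embedding can be read against the global PC axes.
joint = np.hstack([embedding, pc_scores])
```

In practice one might color the t-SNE scatter plot by each PC score, or select a cluster in the embedding and inspect where its samples fall in PC space; how the two views are actually combined in the paper is not specified in this summary.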
Related papers
- Entropic Optimal Transport Eigenmaps for Nonlinear Alignment and Joint Embedding of High-Dimensional Datasets [11.105392318582677]
We propose a principled approach for aligning and jointly embedding a pair of datasets with theoretical guarantees.
Our approach leverages the leading singular vectors of the EOT plan matrix between two datasets to extract their shared underlying structure.
We show that in a high-dimensional regime, the EOT plan recovers the shared manifold structure by approximating a kernel function evaluated at the locations of the latent variables.
arXiv Detail & Related papers (2024-07-01T18:48:55Z)
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
- Datacube segmentation via Deep Spectral Clustering [76.48544221010424]
Extended Vision techniques often pose a challenge in their interpretation.
The huge dimensionality of data cube spectra poses a complex task in its statistical interpretation.
In this paper, we explore the possibility of applying unsupervised clustering methods in encoded space.
A statistical dimensional reduction is performed by an ad hoc trained (Variational) AutoEncoder, while the clustering process is performed by a (learnable) iterative K-Means clustering algorithm.
arXiv Detail & Related papers (2024-01-31T09:31:28Z)
- Learning transformer-based heterogeneously salient graph representation for multimodal remote sensing image classification [42.15709954199397]
A transformer-based heterogeneously salient graph representation (THSGR) approach is proposed in this paper.
First, a multimodal heterogeneous graph encoder is presented to encode distinctively non-Euclidean structural features from heterogeneous data.
A self-attention-free multi-convolutional modulator is designed for effective and efficient long-term dependency modeling.
arXiv Detail & Related papers (2023-11-17T04:06:20Z)
- Locality-preserving Directions for Interpreting the Latent Space of Satellite Image GANs [20.010911311234718]
We present a locality-aware method for interpreting the latent space of wavelet-based Generative Adversarial Networks (GANs).
By focusing on preserving locality, the proposed method is able to decompose the weight-space of pre-trained GANs and recover interpretable directions.
arXiv Detail & Related papers (2023-09-26T12:29:36Z)
- Preserving local densities in low-dimensional embeddings [37.278617643507815]
State-of-the-art methods, such as tSNE and UMAP, excel in unveiling local structures hidden in high-dimensional data.
We show, however, that these methods fail to reconstruct local properties, such as relative differences in densities.
We suggest dtSNE, which approximately conserves local densities.
arXiv Detail & Related papers (2023-01-31T16:11:54Z)
- Incorporating Texture Information into Dimensionality Reduction for High-Dimensional Images [65.74185962364211]
We present a method for incorporating neighborhood information into distance-based dimensionality reduction methods.
Based on a classification of different methods for comparing image patches, we explore a number of different approaches.
arXiv Detail & Related papers (2022-02-18T13:17:43Z)
- Unsupervised Machine Learning for Exploratory Data Analysis of Exoplanet Transmission Spectra [68.8204255655161]
We focus on unsupervised techniques for analyzing spectral data from transiting exoplanets.
We show that there is a high degree of correlation in the spectral data, which calls for appropriate low-dimensional representations.
We uncover interesting structures in the principal component basis, namely, well-defined branches corresponding to different chemical regimes.
arXiv Detail & Related papers (2022-01-07T22:26:33Z)
- HyperPCA: a Powerful Tool to Extract Elemental Maps from Noisy Data Obtained in LIBS Mapping of Materials [7.648784748888189]
We introduce HyperPCA, a new analysis tool for hyperspectral images based on a sparse representation of the data.
We show that the method presents advantages both in quantity and quality of the information recovered, thus improving the physico-chemical characterisation of analysed surfaces.
arXiv Detail & Related papers (2021-11-30T07:52:44Z)
- Spectral-Spatial Global Graph Reasoning for Hyperspectral Image Classification [50.899576891296235]
Convolutional neural networks have been widely applied to hyperspectral image classification.
Recent methods attempt to address this issue by performing graph convolutions on spatial topologies.
arXiv Detail & Related papers (2021-06-26T06:24:51Z)
- Out-of-distribution Generalization via Partial Feature Decorrelation [72.96261704851683]
We present a novel Partial Feature Decorrelation Learning (PFDL) algorithm, which jointly optimizes a feature decomposition network and the target image classification model.
The experiments on real-world datasets demonstrate that our method can improve the backbone model's accuracy on OOD image classification datasets.
arXiv Detail & Related papers (2020-07-30T05:48:48Z)