Enhancing Supervised Visualization through Autoencoder and Random Forest Proximities for Out-of-Sample Extension
- URL: http://arxiv.org/abs/2406.04421v1
- Date: Thu, 6 Jun 2024 18:06:50 GMT
- Title: Enhancing Supervised Visualization through Autoencoder and Random Forest Proximities for Out-of-Sample Extension
- Authors: Shuang Ni, Adrien Aumon, Guy Wolf, Kevin R. Moon, Jake S. Rhodes
- Abstract summary: The value of supervised dimensionality reduction lies in its ability to uncover meaningful connections between data features and labels.
Common dimensionality reduction methods embed a set of fixed, latent points, but are not capable of generalizing to an unseen test set.
We provide an out-of-sample extension method for the random forest-based supervised dimensionality reduction method, RF-PHATE.
- Score: 10.56452144281148
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The value of supervised dimensionality reduction lies in its ability to uncover meaningful connections between data features and labels. Common dimensionality reduction methods embed a set of fixed, latent points, but are not capable of generalizing to an unseen test set. In this paper, we provide an out-of-sample extension method for the random forest-based supervised dimensionality reduction method, RF-PHATE, combining information learned from the random forest model with the function-learning capabilities of autoencoders. Through quantitative assessment of various autoencoder architectures, we identify that networks that reconstruct random forest proximities are more robust for the embedding extension problem. Furthermore, by leveraging proximity-based prototypes, we achieve a 40% reduction in training time without compromising extension quality. Our method does not require label information for out-of-sample points, thus serving as a semi-supervised method, and can achieve consistent quality using only 10% of the training data.
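The core mechanics the abstract describes — random forest proximities computed from shared leaf membership, and a learned map from proximity vectors to embedding coordinates that needs no labels for new points — can be sketched as follows. This is an illustrative simplification, not the paper's architecture: the proximity definition is the classic shared-leaf fraction, the "precomputed embedding" is stood in by a PCA of the proximities rather than an actual RF-PHATE output, and an `MLPRegressor` stands in for the autoencoder-based extension network.

```python
# Hedged sketch: extend a precomputed 2-D embedding to unseen points by
# (1) computing random-forest proximities and (2) learning a map from
# proximity vectors to embedding coordinates. Out-of-sample points need
# features only, no labels, matching the semi-supervised setting.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, _ = train_test_split(X, y, test_size=0.3, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

def proximities(A, B, forest):
    """Fraction of trees in which points of A and B land in the same leaf."""
    leaves_a = forest.apply(A)          # (n_a, n_trees) leaf indices
    leaves_b = forest.apply(B)          # (n_b, n_trees)
    return (leaves_a[:, None, :] == leaves_b[None, :, :]).mean(axis=2)

P_tr = proximities(X_tr, X_tr, rf)      # train-to-train proximities

# Stand-in for a precomputed supervised embedding (e.g. an RF-PHATE output);
# here just a 2-D PCA of the proximities for illustration.
emb_tr = PCA(n_components=2).fit_transform(P_tr)

# Learn proximity -> embedding coordinates.
net = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000,
                   random_state=0).fit(P_tr, emb_tr)
P_te = proximities(X_te, X_tr, rf)      # test-to-train proximities
emb_te = net.predict(P_te)              # out-of-sample coordinates
print(emb_te.shape)                     # (45, 2)
```

The prototype idea from the abstract would correspond to replacing the full train-to-train proximity columns with proximities to a small set of representative points, shrinking the network input and training time.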
Related papers
- Random Forest Autoencoders for Guided Representation Learning [9.97014316910538]
We introduce RF-AE, a neural-network-based framework for out-of-sample extension of RF-PHATE embeddings.
RF-AE's kernel-based extension outperforms existing methods, including RF-PHATE's standard extension.
RF-AE is robust to the choice of kernels and generalizes to any kernel-based dimensionality reduction method.
arXiv Detail & Related papers (2025-02-18T20:02:29Z) - Probing the Purview of Neural Networks via Gradient Analysis [13.800680101300756]
We analyze the data-dependent capacity of neural networks and assess anomalies in inputs from the perspective of networks during inference.
To probe the purview of a network, we utilize gradients to measure the amount of change required for the model to characterize the given inputs more accurately.
We demonstrate that our gradient-based approach can effectively differentiate inputs that cannot be accurately represented with learned features.
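The general idea — scoring an input by how much the model's parameters would have to change to fit it — can be illustrated with a linear model, where the gradient of the loss against an uninformative target has a closed form. This is a toy illustration of gradient-magnitude probing, not the paper's exact procedure (which operates on deep networks); the uniform 0.5 target here is an assumption made for simplicity.

```python
# Hedged illustration of gradient-based probing: for a trained logistic
# model, the gradient norm of the binary cross-entropy w.r.t. the weights,
# measured against a uniform (0.5) target, is larger for inputs far from
# the learned features.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

X, y = make_blobs(centers=2, n_samples=200, random_state=0)
clf = LogisticRegression().fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

def grad_norm(x):
    """L2 norm of dBCE/dw at x against a uniform 0.5 target."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))   # sigmoid probability
    return np.linalg.norm((p - 0.5) * x)     # (p - target) * dz/dw

in_dist = X[0]                               # a training point
far_out = np.array([50.0, -50.0])            # far from both blobs
print(grad_norm(in_dist), grad_norm(far_out))
```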
arXiv Detail & Related papers (2023-04-06T03:02:05Z) - Semi-Supervised Manifold Learning with Complexity Decoupled Chart Autoencoders [45.29194877564103]
This work introduces a chart autoencoder with an asymmetric encoding-decoding process that can incorporate additional semi-supervised information such as class labels.
We discuss the approximation power of such networks and derive a bound that essentially depends on the intrinsic dimension of the data manifold rather than the dimension of ambient space.
arXiv Detail & Related papers (2022-08-22T19:58:03Z) - Multi-scale and Cross-scale Contrastive Learning for Semantic Segmentation [5.281694565226513]
We apply contrastive learning to enhance the discriminative power of the multi-scale features extracted by semantic segmentation networks.
By first mapping the encoder's multi-scale representations to a common feature space, we instantiate a novel form of supervised local-global constraint.
arXiv Detail & Related papers (2022-03-25T01:24:24Z) - Point Set Self-Embedding [63.23565826873297]
This work presents an innovative method for point set self-embedding, which encodes structural information of a dense point set into its sparser version in a visual but imperceptible form.
The self-embedded point set can function as the ordinary downsampled one and be visualized efficiently on mobile devices.
We can leverage the self-embedded information to fully restore the original point set for detailed analysis on remote servers.
arXiv Detail & Related papers (2022-02-28T07:03:33Z) - SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption [72.35532598131176]
We propose SCARF, a technique for contrastive learning, where views are formed by corrupting a random subset of features.
We show that SCARF complements existing strategies and outperforms alternatives like autoencoders.
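The view-generation step described here — corrupting a random subset of features — can be sketched in a few lines. This shows only the corruption, resampling each corrupted entry from that feature's empirical marginal; the contrastive (InfoNCE) objective over such views is not reproduced here, and the 60% corruption rate is an illustrative choice.

```python
# Hedged sketch of SCARF-style view generation: corrupt a random subset of
# features by resampling each corrupted entry column-wise from the
# empirical marginal distribution of that feature.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))            # toy tabular batch

def scarf_corrupt(X, corruption_rate=0.6, rng=rng):
    n, d = X.shape
    mask = rng.random((n, d)) < corruption_rate   # entries to corrupt
    # Resample corrupted entries from each column's empirical marginal.
    resampled = np.stack([rng.choice(X[:, j], size=n) for j in range(d)],
                         axis=1)
    return np.where(mask, resampled, X)

view = scarf_corrupt(X)
# On average, roughly 60% of the entries differ from the originals.
print((view != X).mean())
```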
arXiv Detail & Related papers (2021-06-29T08:08:33Z) - An Adaptive Framework for Learning Unsupervised Depth Completion [59.17364202590475]
We present a method to infer a dense depth map from a color image and associated sparse depth measurements.
We show that regularization and co-visibility are related via the fitness of the model to data and can be unified into a single framework.
arXiv Detail & Related papers (2021-06-06T02:27:55Z) - Stochastic Mutual Information Gradient Estimation for Dimensionality Reduction Networks [11.634729459989996]
We introduce emerging information theoretic feature transformation protocols as an end-to-end neural network training approach.
We present a dimensionality reduction network (MMINet) training procedure based on the estimate of the mutual information gradient.
We experimentally evaluate our method with applications to high-dimensional biological data sets, and relate it to conventional feature selection algorithms.
arXiv Detail & Related papers (2021-05-01T08:20:04Z) - Mixed-Privacy Forgetting in Deep Networks [114.3840147070712]
We show that the influence of a subset of the training samples can be removed from the weights of a network trained on large-scale image classification tasks.
Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in mixed-privacy setting.
We show that our method allows forgetting without having to trade off the model accuracy.
arXiv Detail & Related papers (2020-12-24T19:34:56Z) - SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization [52.20602782690776]
It is expensive and tedious to obtain large-scale paired sparse-dense point sets for training from real scanned sparse data.
We propose a self-supervised point cloud upsampling network, named SPU-Net, to capture the inherent upsampling patterns of points lying on the underlying object surface.
We conduct various experiments on both synthetic and real-scanned datasets, and the results demonstrate that we achieve comparable performance to the state-of-the-art supervised methods.
arXiv Detail & Related papers (2020-12-08T14:14:09Z) - Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass.
We scale training with a novel loss function and centroid-updating scheme, matching the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
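The distance-to-centroid idea behind this single-forward-pass uncertainty estimate can be illustrated with fixed class centroids and an RBF similarity. This is a deliberately simplified sketch: the actual method learns the feature map and centroids jointly, whereas here the centroids are just class means in input space and the length scale is an arbitrary assumption.

```python
# Hedged sketch of distance-based uncertainty: a point far from every class
# centroid gets a low maximum RBF similarity and can be rejected as
# out-of-distribution, with no sampling or ensembling.
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
centroids = np.stack([X[y == c].mean(axis=0) for c in np.unique(y)])

def rbf_confidence(x, length_scale=1.0):
    """Max RBF similarity to any class centroid; low values signal OOD."""
    d2 = ((centroids - x) ** 2).sum(axis=1)          # squared distances
    return np.exp(-d2 / (2 * length_scale ** 2)).max()

in_dist = X[0]                      # a training point
ood = np.full(4, 100.0)             # far from all centroids
print(rbf_confidence(in_dist), rbf_confidence(ood))
```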
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.