Random Forest Autoencoders for Guided Representation Learning
- URL: http://arxiv.org/abs/2502.13257v1
- Date: Tue, 18 Feb 2025 20:02:29 GMT
- Title: Random Forest Autoencoders for Guided Representation Learning
- Authors: Adrien Aumon, Shuang Ni, Myriam Lizotte, Guy Wolf, Kevin R. Moon, Jake S. Rhodes,
- Abstract summary: We introduce RF-PHATE, a neural-based framework for out-of-sample extension.
RF-PHATE's kernel-based extension outperforms existing methods, including RFPHATE's standard extension.
RF-AE is robust to the choice of kernels and generalizes to any kernel-based dimensionality reduction method.
- Score: 9.97014316910538
- License:
- Abstract: Decades of research have produced robust methods for unsupervised data visualization, yet supervised visualization$\unicode{x2013}$where expert labels guide representations$\unicode{x2013}$remains underexplored, as most supervised approaches prioritize classification over visualization. Recently, RF-PHATE, a diffusion-based manifold learning method leveraging random forests and information geometry, marked significant progress in supervised visualization. However, its lack of an explicit mapping function limits scalability and prevents application to unseen data, posing challenges for large datasets and label-scarce scenarios. To overcome these limitations, we introduce Random Forest Autoencoders (RF-AE), a neural network-based framework for out-of-sample kernel extension that combines the flexibility of autoencoders with the supervised learning strengths of random forests and the geometry captured by RF-PHATE. RF-AE enables efficient out-of-sample supervised visualization and outperforms existing methods, including RF-PHATE's standard kernel extension, in both accuracy and interpretability. Additionally, RF-AE is robust to the choice of hyper-parameters and generalizes to any kernel-based dimensionality reduction method.
Related papers
- Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection.
We design a forgery-style mixture formulation that augments the diversity of forgery source domains.
We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z) - Forward-Forward Learning achieves Highly Selective Latent Representations for Out-of-Distribution Detection in Fully Spiking Neural Networks [6.7236795813629]
Spiking Neural Networks (SNNs), inspired by biological systems, offer a promising avenue for overcoming limitations.
In this work, we explore the potential of the spiking Forward-Forward Algorithm (FFA) to address these challenges.
We propose a novel, gradient-free attribution method to detect features that drive a sample away from class distributions.
arXiv Detail & Related papers (2024-07-19T08:08:17Z) - Enhancing Supervised Visualization through Autoencoder and Random Forest Proximities for Out-of-Sample Extension [10.56452144281148]
The value of supervised dimensionality reduction lies in its ability to uncover meaningful connections between data features and labels.
Common dimensionality reduction methods embed a set of fixed, latent points, but are not capable of generalizing to an unseen test set.
We provide an out-of-sample extension method for the random forest-based supervised dimensionality reduction method, RF-PHATE.
arXiv Detail & Related papers (2024-06-06T18:06:50Z) - Fus-MAE: A cross-attention-based data fusion approach for Masked
Autoencoders in remote sensing [5.990692497580643]
Fus-MAE is a self-supervised learning framework based on masked autoencoders.
Our empirical findings demonstrate that Fus-MAE can effectively compete with contrastive learning strategies tailored for SAR-optical data fusion.
arXiv Detail & Related papers (2024-01-05T11:36:21Z) - S-Adapter: Generalizing Vision Transformer for Face Anti-Spoofing with Statistical Tokens [45.06704981913823]
Face Anti-Spoofing (FAS) aims to detect malicious attempts to invade a face recognition system by presenting spoofed faces.
We propose a novel Statistical Adapter (S-Adapter) that gathers local discriminative and statistical information from localized token histograms.
To further improve the generalization of the statistical tokens, we propose a novel Token Style Regularization (TSR)
Our experimental results demonstrate that our proposed S-Adapter and TSR provide significant benefits in both zero-shot and few-shot cross-domain testing, outperforming state-of-the-art methods on several benchmark tests.
arXiv Detail & Related papers (2023-09-07T22:36:22Z) - Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection(VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z) - Cluster-level pseudo-labelling for source-free cross-domain facial
expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER)
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z) - Disentangled Representation Learning for RF Fingerprint Extraction under
Unknown Channel Statistics [77.13542705329328]
We propose a framework of disentangled representation learning(DRL) that first learns to factor the input signals into a device-relevant component and a device-irrelevant component via adversarial learning.
The implicit data augmentation in the proposed framework imposes a regularization on the RFF extractor to avoid the possible overfitting of device-irrelevant channel statistics.
Experiments validate that the proposed approach, referred to as DR-RFF, outperforms conventional methods in terms of generalizability to unknown complicated propagation environments.
arXiv Detail & Related papers (2022-08-04T15:46:48Z) - Adversarial Feature Augmentation and Normalization for Visual
Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Random Partitioning Forest for Point-Wise and Collective Anomaly
Detection -- Application to Intrusion Detection [9.74672460306765]
DiFF-RF is an ensemble approach composed of random partitioning binary trees to detect anomalies.
Our experiments show that DiFF-RF almost systematically outperforms the isolation forest (IF) algorithm.
Our experience shows that DiFF-RF can work well in the presence of small-scale learning data.
arXiv Detail & Related papers (2020-06-29T10:44:08Z) - Deep Autoencoding Topic Model with Scalable Hybrid Bayesian Inference [55.35176938713946]
We develop deep autoencoding topic model (DATM) that uses a hierarchy of gamma distributions to construct its multi-stochastic-layer generative network.
We propose a Weibull upward-downward variational encoder that deterministically propagates information upward via a deep neural network, followed by a downward generative model.
The efficacy and scalability of our models are demonstrated on both unsupervised and supervised learning tasks on big corpora.
arXiv Detail & Related papers (2020-06-15T22:22:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.