FATE: Feature-Agnostic Transformer-based Encoder for learning
generalized embedding spaces in flow cytometry data
- URL: http://arxiv.org/abs/2311.03314v1
- Date: Mon, 6 Nov 2023 18:06:38 GMT
- Title: FATE: Feature-Agnostic Transformer-based Encoder for learning
generalized embedding spaces in flow cytometry data
- Authors: Lisa Weijler, Florian Kowarsch, Michael Reiter, Pedro Hermosilla,
Margarita Maurer-Granofszky, Michael Dworzak
- Abstract summary: We aim at effectively leveraging data with varying features, without the need to constrain the input space to the intersection of potential feature sets.
We propose a novel architecture that can directly process data without the necessity of aligned feature modalities.
The advantages of the model are demonstrated for automatic cancer cell detection in acute myeloid leukemia in flow data.
- Score: 4.550634499956126
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: While model architectures and training strategies have become more generic
and flexible with respect to different data modalities over the past years, a
persistent limitation lies in the assumption of fixed quantities and
arrangements of input features. This limitation becomes particularly relevant
in scenarios where the attributes captured during data acquisition vary across
different samples. In this work, we aim at effectively leveraging data with
varying features, without the need to constrain the input space to the
intersection of potential feature sets or to expand it to their union. We
propose a novel architecture that can directly process data without the
necessity of aligned feature modalities by learning a general embedding space
that captures the relationship between features across data samples with
varying sets of features. This is achieved via a set-transformer architecture
augmented by feature-encoder layers, thereby enabling the learning of a shared
latent feature space from data originating from heterogeneous feature spaces.
The advantages of the model are demonstrated for automatic cancer cell
detection in acute myeloid leukemia in flow cytometry data, where the features
measured during acquisition often vary between samples. Our proposed
architecture's capacity to operate seamlessly across incongruent feature spaces
is particularly relevant in this context, where data scarcity arises from the
low prevalence of the disease. The code is available for research purposes at
https://github.com/lisaweijler/FATE.
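For intuition, here is a minimal, hypothetical sketch of the feature-as-token idea the abstract describes: each measured marker becomes a (feature-identity, value) token, so samples with different panels can be mapped into one shared embedding space. The module names, layer sizes, and the use of a plain transformer encoder in place of the full set-transformer with feature-encoder layers are illustrative assumptions; the released code at https://github.com/lisaweijler/FATE is the authoritative implementation.

```python
# Hedged sketch of the feature-agnostic embedding idea (not the released FATE model).
import torch
import torch.nn as nn

class FeatureAgnosticEncoder(nn.Module):
    def __init__(self, num_known_features=50, d_model=64):
        super().__init__()
        self.feature_id_emb = nn.Embedding(num_known_features, d_model)  # "which marker"
        self.value_proj = nn.Linear(1, d_model)                          # "measured value"
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, feature_ids, values):
        # feature_ids: (B, F) integer ids of the markers present in this panel
        # values:      (B, F) corresponding measurements; F may differ between panels
        tokens = self.feature_id_emb(feature_ids) + self.value_proj(values.unsqueeze(-1))
        encoded = self.encoder(tokens)          # attention across the (variable) feature set
        return encoded.mean(dim=1)              # one embedding per event/sample

model = FeatureAgnosticEncoder()
# Two panels with different marker subsets still land in the same 64-d space.
emb_a = model(torch.tensor([[0, 3, 7, 9]]), torch.randn(1, 4))
emb_b = model(torch.tensor([[1, 3, 5, 7, 11, 20]]), torch.randn(1, 6))
```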
Related papers
- PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
We propose a Parameter-Efficient Federated Anomaly Detection framework named PeFAD, motivated by increasing privacy concerns.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z)
- An improved tabular data generator with VAE-GMM integration [9.4491536689161]
We propose a novel Variational Autoencoder (VAE)-based model that addresses limitations of current approaches.
Inspired by the TVAE model, our approach incorporates a Bayesian Gaussian Mixture model (BGM) within the VAE architecture.
We thoroughly validate our model on three real-world datasets with mixed data types, including two medically relevant ones.
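The summary above does not spell out how the Bayesian Gaussian Mixture enters the architecture; the following is a rough sketch under an assumed two-stage workflow (train a VAE, then fit a Bayesian GMM on the encoded latents and sample from it). All layer sizes and hyperparameters are illustrative, not the paper's design.

```python
# Hedged sketch: pairing a VAE latent space with a Bayesian Gaussian Mixture (BGM).
import torch
import torch.nn as nn
from sklearn.mixture import BayesianGaussianMixture

class SmallVAE(nn.Module):
    def __init__(self, x_dim=20, z_dim=4):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, 64), nn.ReLU())
        self.mu, self.logvar = nn.Linear(64, z_dim), nn.Linear(64, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.dec(z), mu, logvar

vae = SmallVAE()
x = torch.randn(256, 20)                      # stand-in for an encoded mixed-type table
recon, mu, logvar = vae(x)

# Assumed stage 2: fit a Bayesian GMM on the encoded means so that new rows are
# generated by sampling mixture components instead of a standard normal prior.
bgm = BayesianGaussianMixture(n_components=10, weight_concentration_prior=1e-2)
bgm.fit(mu.detach().numpy())
z_new, _ = bgm.sample(5)                      # draw latents from the learned mixture
x_new = vae.dec(torch.tensor(z_new, dtype=torch.float32))
```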
arXiv Detail & Related papers (2024-04-12T12:31:06Z)
- Targeted Analysis of High-Risk States Using an Oriented Variational Autoencoder [3.494548275937873]
Variational autoencoder (VAE) neural networks can be trained to generate power system states.
The coordinates of the latent space codes of VAEs have been shown to correlate with conceptual features of the data.
In this paper, an oriented variational autoencoder (OVAE) is proposed to constrain the link between the latent space code and the generated data.
arXiv Detail & Related papers (2023-03-20T19:34:21Z)
- Transfer Learning on Heterogeneous Feature Spaces for Treatment Effects Estimation [103.55894890759376]
This paper introduces several building blocks that use representation learning to handle the heterogeneous feature spaces.
We show how these building blocks can be used to recover transfer learning equivalents of the standard CATE learners.
arXiv Detail & Related papers (2022-10-08T16:41:02Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic, domain-independent approach yields state-of-the-art results on vision, natural language processing, and time series tasks.
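How the attribution maps enter training is not specified here; below is a hedged sketch assuming input-gradient saliency is used as an auxiliary penalty on the task loss. The penalty form and its weight are illustrative assumptions, not necessarily the paper's mechanism.

```python
# Hedged sketch: attribution maps (input-gradient saliency) as a training regularizer.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
criterion = nn.CrossEntropyLoss()
x = torch.randn(16, 32, requires_grad=True)   # toy batch
y = torch.randint(0, 10, (16,))

logits = model(x)
task_loss = criterion(logits, y)

# Attribution map: gradient of the predicted-class score w.r.t. the input.
score = logits.gather(1, logits.argmax(dim=1, keepdim=True)).sum()
attribution = torch.autograd.grad(score, x, create_graph=True)[0]

# Regularize the attributions (here: discourage diffuse maps via an L1 penalty).
loss = task_loss + 1e-3 * attribution.abs().mean()
loss.backward()
```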
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- RENs: Relevance Encoding Networks [0.0]
This paper proposes relevance encoding networks (RENs): a novel probabilistic VAE-based framework that uses the automatic relevance determination (ARD) prior in the latent space to learn the data-specific bottleneck dimensionality.
We show that the proposed model learns the relevant latent bottleneck dimensionality without compromising the representation and generation quality of the samples.
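The exact RENs parameterization is not given here; the sketch below only illustrates the generic ARD idea of per-dimension relevance scales that a KL-style penalty can drive toward zero, effectively pruning unused latent dimensions. Layer sizes and the prior form are assumptions.

```python
# Hedged sketch: ARD-style per-dimension relevance in a VAE latent space (generic, not RENs).
import torch
import torch.nn as nn

class ARDLatentVAE(nn.Module):
    def __init__(self, x_dim=30, z_dim=16):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * z_dim)
        self.dec = nn.Linear(z_dim, x_dim)
        # Log of per-dimension prior variances; a tiny variance means a pruned dimension.
        self.log_ard_var = nn.Parameter(torch.zeros(z_dim))

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        # KL( N(mu, sigma^2) || N(0, diag(ard_var)) ), summed over dimensions.
        ard_var = torch.exp(self.log_ard_var)
        kl = 0.5 * ((torch.exp(logvar) + mu ** 2) / ard_var
                    + self.log_ard_var - logvar - 1.0).sum(dim=-1)
        return self.dec(z), kl

model = ARDLatentVAE()
x = torch.randn(64, 30)
recon, kl = model(x)
loss = ((recon - x) ** 2).sum(dim=-1).mean() + kl.mean()
loss.backward()
# Dimensions whose learned prior variance collapses contribute little and can be
# read off from model.log_ard_var after training.
```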
arXiv Detail & Related papers (2022-05-25T21:53:48Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how combining recent results on equivariant representation learning over structured spaces with simple, classical results from causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- Encoding Domain Information with Sparse Priors for Inferring Explainable Latent Variables [2.8935588665357077]
We propose spex-LVM, a factorial latent variable model with sparse priors to encourage the inference of explainable factors.
spex-LVM utilizes existing knowledge of curated biomedical pathways to automatically assign annotated attributes to latent factors.
Evaluations on simulated and real single-cell RNA-seq datasets demonstrate that our model robustly identifies relevant structure in an inherently explainable manner.
arXiv Detail & Related papers (2021-07-08T10:19:32Z)
- Mitigating Generation Shifts for Generalized Zero-Shot Learning [52.98182124310114]
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize the seen and unseen samples, where unseen classes are not observable during training.
We propose a novel Generation Shifts Mitigating Flow (GSMFlow) framework for synthesizing unseen data efficiently and effectively.
Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings.
arXiv Detail & Related papers (2021-07-07T11:43:59Z)
- INSIDE: Steering Spatial Attention with Non-Imaging Information in CNNs [14.095546881696311]
We consider the problem of integrating non-imaging information into segmentation networks to improve performance.
We propose a mechanism to allow for spatial localisation conditioned on non-imaging information.
Our method can be trained end-to-end and does not require additional supervision.
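As a hedged illustration of conditioning spatial attention on non-imaging information (a generic gating module, not the INSIDE mechanism itself), one might write something like the following; the layer shapes and gating form are assumptions.

```python
# Hedged sketch: a spatial gate over CNN features conditioned on a non-imaging vector.
import torch
import torch.nn as nn

class ConditionedSpatialGate(nn.Module):
    def __init__(self, channels=32, meta_dim=8):
        super().__init__()
        self.to_scale = nn.Linear(meta_dim, channels)   # metadata -> channel-wise scale
        self.to_gate = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feat, meta):
        # feat: (B, C, H, W) image features; meta: (B, meta_dim) non-imaging information
        scale = self.to_scale(meta).unsqueeze(-1).unsqueeze(-1)   # (B, C, 1, 1)
        gate = torch.sigmoid(self.to_gate(feat * scale))          # (B, 1, H, W)
        return feat * gate                                        # spatially re-weighted features

feat = torch.randn(2, 32, 16, 16)
meta = torch.randn(2, 8)
out = ConditionedSpatialGate()(feat, meta)   # same shape as feat
```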
arXiv Detail & Related papers (2020-08-21T13:32:05Z)
- Attribute-based Regularization of Latent Spaces for Variational Auto-Encoders [79.68916470119743]
We present a novel method to structure the latent space of a Variational Auto-Encoder (VAE) to encode different continuous-valued attributes explicitly.
This is accomplished by using an attribute regularization loss which enforces a monotonic relationship between the attribute values and the latent code of the dimension along which the attribute is to be encoded.
arXiv Detail & Related papers (2020-04-11T20:53:13Z)
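A minimal sketch of one way to implement such a monotonicity-enforcing attribute regularizer, assuming pairwise sign agreement between latent and attribute differences within a batch; the tanh scaling factor and L1 distance are illustrative choices rather than the paper's exact loss.

```python
# Hedged sketch: encourage one latent dimension to vary monotonically with an attribute.
import torch

def attribute_regularization(z_dim_values, attribute_values, delta=10.0):
    """z_dim_values, attribute_values: shape (batch,)."""
    dz = z_dim_values.unsqueeze(0) - z_dim_values.unsqueeze(1)   # pairwise latent differences
    da = attribute_values.unsqueeze(0) - attribute_values.unsqueeze(1)
    # Push sign(dz) to agree with sign(da), i.e. a monotonic relationship.
    return torch.abs(torch.tanh(delta * dz) - torch.sign(da)).mean()

z = torch.randn(32, 8, requires_grad=True)     # toy latent codes
attr = torch.rand(32)                          # continuous attribute per sample
reg = attribute_regularization(z[:, 0], attr)  # regularize dimension 0 for this attribute
reg.backward()
```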
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.