Structural Equation-VAE: Disentangled Latent Representations for Tabular Data
- URL: http://arxiv.org/abs/2508.06347v2
- Date: Sat, 16 Aug 2025 10:27:15 GMT
- Title: Structural Equation-VAE: Disentangled Latent Representations for Tabular Data
- Authors: Ruiyu Zhang, Ce Zhao, Xin Zhao, Lin Nie, Wai-Fung Lam,
- Abstract summary: We introduce SE-VAE (Structural Equation-Variational Autoencoder), a novel architecture that embeds measurement structure directly into the design of a variational autoencoder.<n>Inspired by structural equation modeling, SE-VAE aligns latent subspaces with known indicator groupings and introduces a global nuisance latent to isolate construct-specific confounding variation.<n> SE-VAE consistently outperforms alternatives in factor recovery, interpretability, and robustness to nuisance variation.
- Score: 4.101599614979332
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning interpretable latent representations from tabular data remains a challenge in deep generative modeling. We introduce SE-VAE (Structural Equation-Variational Autoencoder), a novel architecture that embeds measurement structure directly into the design of a variational autoencoder. Inspired by structural equation modeling, SE-VAE aligns latent subspaces with known indicator groupings and introduces a global nuisance latent to isolate construct-specific confounding variation. This modular architecture enables disentanglement through design rather than through statistical regularizers alone. We evaluate SE-VAE on a suite of simulated tabular datasets and benchmark its performance against a series of leading baselines using standard disentanglement metrics. SE-VAE consistently outperforms alternatives in factor recovery, interpretability, and robustness to nuisance variation. Ablation results reveal that architectural structure, rather than regularization strength, is the key driver of performance. SE-VAE offers a principled framework for white-box generative modeling in scientific and social domains where latent constructs are theory-driven and measurement validity is essential.
Related papers
- Table-BiEval: A Self-Supervised, Dual-Track Framework for Decoupling Structure and Content in LLM Evaluation [11.450834626205676]
Table-BiEval is a novel approach based on a human-free, self-supervised evaluation framework.<n>It calculates Content Semantic Accuracy and Normalized Tree Edit Distance to decouple structure from content.<n>Results reveal substantial variability, highlighting that mid-sized models can surprisingly outperform larger counterparts in structural efficiency.
arXiv Detail & Related papers (2026-01-09T07:38:27Z) - Enhancing Interpretability in Generative Modeling: Statistically Disentangled Latent Spaces Guided by Generative Factors in Scientific Datasets [0.0]
We introduce Aux-VAE, a novel architecture within the classical Variational Autoencoder framework.<n>We validate the efficacy of Aux-VAE through comparative assessments on multiple datasets, including astronomical simulations.
arXiv Detail & Related papers (2025-06-30T22:29:01Z) - AlphaFold Database Debiasing for Robust Inverse Folding [58.792020809180336]
We introduce a Debiasing Structure AutoEncoder (DeSAE) that learns to reconstruct native-like conformations from intentionally corrupted backbone geometries.<n>At inference, applying DeSAE to AFDB structures produces debiased structures that significantly improve inverse folding performance.
arXiv Detail & Related papers (2025-06-10T02:25:31Z) - Learning Identifiable Structures Helps Avoid Bias in DNN-based Supervised Causal Learning [56.22841701016295]
Supervised Causal Learning (SCL) is an emerging paradigm in this field.<n>Existing Deep Neural Network (DNN)-based methods commonly adopt the "Node-Edge approach"
arXiv Detail & Related papers (2025-02-15T19:10:35Z) - Structural Entropy Guided Probabilistic Coding [52.01765333755793]
We propose a novel structural entropy-guided probabilistic coding model, named SEPC.<n>We incorporate the relationship between latent variables into the optimization by proposing a structural entropy regularization loss.<n> Experimental results across 12 natural language understanding tasks, including both classification and regression tasks, demonstrate the superior performance of SEPC.
arXiv Detail & Related papers (2024-12-12T00:37:53Z) - Induced Covariance for Causal Discovery in Linear Sparse Structures [55.2480439325792]
Causal models seek to unravel the cause-effect relationships among variables from observed data.
This paper introduces a novel causal discovery algorithm designed for settings in which variables exhibit linearly sparse relationships.
arXiv Detail & Related papers (2024-10-02T04:01:38Z) - Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-re (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z) - DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained
Diffusion [66.21290235237808]
We introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states.
We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs.
Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
arXiv Detail & Related papers (2023-01-23T15:18:54Z) - Inference-InfoGAN: Inference Independence via Embedding Orthogonal Basis
Expansion [2.198430261120653]
Disentanglement learning aims to construct independent and interpretable latent variables in which generative models are a popular strategy.
We propose a novel GAN-based disentanglement framework via embedding Orthogonal Basis Expansion (OBE) into InfoGAN network.
Our Inference-InfoGAN achieves higher disentanglement score in terms of FactorVAE, Separated ferenceAttribute Predictability (SAP), Mutual Information Gap (MIG) and Variation Predictability (VP) metrics without model fine-tuning.
arXiv Detail & Related papers (2021-10-02T11:54:23Z) - Demystifying Inductive Biases for $\eta$-VAE Based Architectures [19.53632220171481]
We shed light on the inductive bias responsible for the success of VAE-based architectures.
We show that in classical datasets the structure of variance, induced by the generating factors, is conveniently aligned with the latent directions fostered by the VAE objective.
arXiv Detail & Related papers (2021-02-12T23:57:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.