RCC-GAN: Regularized Compound Conditional GAN for Large-Scale Tabular
Data Synthesis
- URL: http://arxiv.org/abs/2205.11693v1
- Date: Tue, 24 May 2022 01:14:59 GMT
- Authors: Mohammad Esmaeilpour, Nourhene Chaalia, Adel Abusitta, Francois-Xavier
Devailly, Wissem Maazoun, Patrick Cardinal
- Abstract summary: This paper introduces a novel generative adversarial network (GAN) for synthesizing large-scale databases.
We propose a new formulation for deriving a vector incorporating both binary and discrete features simultaneously.
We present a regularization scheme that limits sudden variations of the generator's weight vectors during training.
- Score: 7.491711487306447
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces a novel generative adversarial network (GAN) for
synthesizing large-scale tabular databases that contain a mixture of continuous, discrete, and binary features. Technically, our GAN belongs to the
category of class-conditioned generative models with a predefined conditional
vector. However, we propose a new formulation for deriving such a vector
that incorporates both binary and discrete features simultaneously. We refer to this novel formulation as the compound conditional vector and employ it for training
the generator network. The core architecture of this network is a three-layered
deep residual neural network with skip connections. To improve the stability of such a complex architecture, we present a regularization scheme that limits sudden variations of the weight vectors during training. This
regularization approach is quite compatible with the nature of adversarial
training and it is not computationally prohibitive in runtime. Furthermore, we
constantly monitor the variation of the weight vectors for identifying any
potential instabilities or irregularities to measure the strength of our
proposed regularizer. Toward this end, we also develop a new metric for
tracking sudden perturbation on the weight vectors using the singular value
decomposition theory. Finally, we evaluate the performance of our proposed
synthesis approach on six benchmarking tabular databases, namely Adult, Census,
HCDR, Cabs, News, and King. The results show that, in the majority of cases, our proposed RCC-GAN outperforms other conventional and modern generative models in terms of accuracy, stability, and reliability.
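The abstract does not spell out the SVD-based metric, but the idea of tracking sudden perturbations of a layer's weight vectors through its singular values can be sketched as follows. This is a hedged illustration under assumed function names (`spectral_signature`, `perturbation_score`), not the paper's exact formulation:

```python
import numpy as np

def spectral_signature(W):
    """Singular values of a weight matrix, sorted in descending order."""
    return np.linalg.svd(W, compute_uv=False)

def perturbation_score(W_prev, W_curr):
    """Relative change of the singular-value spectrum between two
    training steps; a sudden spike flags a potential instability."""
    s_prev = spectral_signature(W_prev)
    s_curr = spectral_signature(W_curr)
    return np.linalg.norm(s_curr - s_prev) / (np.linalg.norm(s_prev) + 1e-12)
```

In use, one would compute this score per layer after each generator update and alarm on outliers relative to a running baseline.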
Related papers
- Twinning Complex Networked Systems: Data-Driven Calibration of the mABCD Synthetic Graph Generator [2.6776012440607784]
We propose a method for estimating matching configurations and for quantifying the associated error.
Our results demonstrate that this task is non-trivial, as strong interdependencies between configuration parameters weaken independent estimation and instead favour a joint-prediction approach.
arXiv Detail & Related papers (2026-02-02T12:40:19Z)
- Dictionary-Transform Generative Adversarial Networks [11.62669179647184]
We introduce Dictionary-Transform Generative Adversarial Networks (DT-GAN).
DT-GAN is a model-based adversarial framework in which the generator is a sparse synthesis dictionary and the discriminator is an analysis transform acting as an energy model.
We show that the DT-GAN adversarial game is well posed and admits at least one Nash equilibrium.
arXiv Detail & Related papers (2025-12-25T13:41:14Z)
- Sequential-Parallel Duality in Prefix Scannable Models [68.39855814099997]
Recent developments have given rise to various models, such as Gated Linear Attention (GLA) and Mamba.
This raises a natural question: can we characterize the full class of neural sequence models that support near-constant-time parallel evaluation and linear-time, constant-space sequential inference?
arXiv Detail & Related papers (2025-06-12T17:32:02Z)
- Binary Cumulative Encoding meets Time Series Forecasting [0.11704154007740832]
We introduce binary cumulative encoding (BCE), which represents scalar targets as monotonic binary vectors.
BCE implicitly preserves order and magnitude information, allowing the model to learn distance-aware representations while still operating within a classification framework.
We show that our approach outperforms widely used methods in both point and probabilistic forecasting, while requiring fewer parameters and enabling faster training.
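A scalar-to-monotonic-binary-vector code as described resembles a cumulative (thermometer) encoding over quantization bin edges; the following is a minimal sketch under that assumption, not the paper's exact scheme:

```python
import numpy as np

def bce_encode(y, edges):
    """Binary cumulative (thermometer) encoding of a scalar target:
    bit k is 1 iff y >= edges[k], so the code is monotonically
    non-increasing and preserves order and magnitude."""
    return (y >= edges).astype(np.int8)

def bce_decode(code, edges):
    """Invert by counting set bits back to the last exceeded edge."""
    k = int(code.sum())
    return edges[k - 1] if k > 0 else edges[0]
```

Because nearby targets share long bit prefixes, a classifier trained on these bits sees distance-aware labels rather than arbitrary class indices.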
arXiv Detail & Related papers (2025-05-30T13:41:39Z)
- Network reconstruction via the minimum description length principle [0.0]
We propose an alternative nonparametric regularization scheme based on hierarchical Bayesian inference and weight quantization.
Our approach follows the minimum description length (MDL) principle, and uncovers the weight distribution that allows for the most compression of the data.
We demonstrate that our scheme yields systematically increased accuracy in the reconstruction of both artificial and empirical networks.
arXiv Detail & Related papers (2024-05-02T05:35:09Z)
- Outlier-robust neural network training: variation regularization meets trimmed loss to prevent functional breakdown [2.5628953713168685]
We tackle the challenge of outlier-robust predictive modeling using highly expressive neural networks.
Our approach integrates two key components: (1) a transformed trimmed loss (TTL), and (2) higher-order variation regularization (HOVR), which imposes smoothness constraints on the prediction function.
arXiv Detail & Related papers (2023-08-04T12:57:13Z)
- Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
- Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration [122.51142131506639]
We introduce a precise, fast, and differentiable upper bound for the spectral norm of convolutional layers using circulant matrix theory.
We show through a comprehensive set of experiments that our approach outperforms other state-of-the-art methods in terms of precision, computational cost, and scalability.
It proves highly effective for the Lipschitz regularization of convolutional neural networks, with competitive results against concurrent approaches.
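The Gram iteration idea can be sketched for a plain dense matrix (the paper handles convolutional layers via circulant matrix theory; this simplified dense version is only illustrative): repeatedly squaring the Gram matrix drives its Frobenius norm, taken to the matching root, down toward the spectral norm from above.

```python
import numpy as np

def gram_spectral_norm(W, iters=10):
    """Estimate the spectral norm ||W||_2 by repeatedly squaring the
    Gram matrix G = W^T W.  Each squaring raises every singular value
    to a higher power, so the Frobenius norm of the iterate, taken to
    the matching root, converges from above to the largest singular
    value.  Rescaling each step keeps the iterate in float range."""
    G = W.T @ W                       # eigenvalues: sigma_i(W)**2
    log_c = 0.0                       # log of accumulated rescaling
    for _ in range(iters):
        f = np.linalg.norm(G)         # Frobenius norm
        G = G / f
        log_c = 2.0 * (log_c + np.log(f))
        G = G @ G
    # after `iters` squarings, eigenvalues are sigma_i**(2**(iters+1))
    return np.exp((log_c + np.log(np.linalg.norm(G))) / 2 ** (iters + 1))
```

Each iteration doubles the exponent, so the estimate tightens roughly quadratically; a handful of matrix products already gives a bound far sharper than the raw Frobenius norm.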
arXiv Detail & Related papers (2023-05-25T15:32:21Z)
- Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks [35.6604960300194]
This work examines the challenges of training neural networks with vector quantization via straight-through estimation.
We find that a primary cause of training instability is the discrepancy between the model embedding and the code-vector distribution.
We identify the factors that contribute to this issue, including the codebook gradient sparsity and the asymmetric nature of the commitment loss.
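The straight-through trick the paper analyzes can be illustrated with a hand-rolled forward/backward pair; this is a framework-free numpy sketch with hypothetical function names (real implementations express the same idea through an autograd stop-gradient):

```python
import numpy as np

def vq_forward(x, codebook):
    """Forward pass: snap each row of x to its nearest code vector."""
    d = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

def vq_backward_ste(grad_output):
    """Backward pass under the straight-through estimator: the
    non-differentiable nearest-neighbour lookup is treated as the
    identity, so the gradient flows back to x unchanged."""
    return grad_output
```

The mismatch the paper studies arises exactly here: the gradient is computed as if no quantization happened, so it only stays informative while the encoder outputs and the code vectors remain close.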
arXiv Detail & Related papers (2023-05-15T17:56:36Z)
- A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive Coding Networks [65.34977803841007]
Predictive coding networks are neuroscience-inspired models with roots in both Bayesian statistics and neuroscience.
We show that simply changing the temporal scheduling of the update rule for the synaptic weights leads to an algorithm that is much more efficient and stable than the original one.
arXiv Detail & Related papers (2022-11-16T00:11:04Z)
- Lipschitz Continuity Retained Binary Neural Network [52.17734681659175]
We introduce Lipschitz continuity as a rigorous criterion for defining the robustness of binary neural networks (BNNs).
We then propose to retain Lipschitz continuity as a regularization term to improve model robustness.
Our experiments demonstrate that our BNN-specific regularization method can effectively strengthen the robustness of BNNs.
arXiv Detail & Related papers (2022-07-13T22:55:04Z)
- Orthogonal Stochastic Configuration Networks with Adaptive Construction Parameter for Data Analytics [6.940097162264939]
Randomness makes stochastic configuration networks (SCNs) more likely to generate approximately linearly correlated nodes that are redundant and of low quality.
In light of a fundamental principle in machine learning, namely that a model with fewer parameters generalizes better, this paper proposes an orthogonal SCN, termed OSCN, to filter out low-quality hidden nodes and reduce the network structure.
arXiv Detail & Related papers (2022-05-26T07:07:26Z)
- Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks show improved accuracy and significant reduction in memory consumption.
They can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z)
- A Fully Tensorized Recurrent Neural Network [48.50376453324581]
We introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell.
This approach reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs.
arXiv Detail & Related papers (2020-10-08T18:24:12Z)
- Improve Generalization and Robustness of Neural Networks via Weight Scale Shifting Invariant Regularizations [52.493315075385325]
We show that a family of regularizers, including weight decay, is ineffective at penalizing the intrinsic norms of weights for networks with homogeneous activation functions.
We propose an improved regularizer that is invariant to weight scale shifting and thus effectively constrains the intrinsic norm of a neural network.
arXiv Detail & Related papers (2020-08-07T02:55:28Z)
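The scale-shifting issue is easy to demonstrate numerically: rescaling consecutive layers of a ReLU network by c and 1/c leaves the function unchanged but changes plain weight decay, whereas a product-of-norms penalty (one simple scale-invariant choice, not necessarily the paper's regularizer) does not:

```python
import numpy as np

def two_layer_relu(x, W1, W2):
    """A bias-free two-layer ReLU network (positively homogeneous)."""
    return np.maximum(x @ W1, 0.0) @ W2

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(4, 2))
x = rng.normal(size=(5, 3))
c = 3.0

# Rescaling the layers by c and 1/c leaves the function unchanged ...
out_a = two_layer_relu(x, W1, W2)
out_b = two_layer_relu(x, c * W1, W2 / c)

# ... but plain weight decay changes, while a product-of-norms
# penalty is invariant under the rescaling.
decay_a = (W1 ** 2).sum() + (W2 ** 2).sum()
decay_b = ((c * W1) ** 2).sum() + ((W2 / c) ** 2).sum()
prod_a = np.linalg.norm(W1) * np.linalg.norm(W2)
prod_b = np.linalg.norm(c * W1) * np.linalg.norm(W2 / c)
```

This is exactly why weight decay fails to pin down an intrinsic norm for homogeneous networks: the optimizer can shrink the penalty by shifting scale between layers without changing the function at all.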
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.