Regularized Sequential Latent Variable Models with Adversarial Neural Networks
- URL: http://arxiv.org/abs/2108.04496v1
- Date: Tue, 10 Aug 2021 08:05:14 GMT
- Title: Regularized Sequential Latent Variable Models with Adversarial Neural Networks
- Authors: Jin Huang, Ming Xiao
- Abstract summary: We present different ways of using high-level latent random variables in RNNs to model the variability in sequential data.
We explore possible ways of using adversarial methods to train a variational RNN model.
- Score: 33.74611654607262
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recurrent neural networks (RNNs), with richly distributed internal states
and flexible non-linear transition functions, have overtaken dynamic Bayesian
networks such as hidden Markov models (HMMs) in the task of modeling highly
structured sequential data. Such data, for example speech and handwriting, often
contain complex relationships between the underlying factors of variation and the
observed data. The standard RNN model has very limited randomness or variability
in its structure, coming only from the output conditional probability model. This
paper presents different ways of using high-level latent random variables in an
RNN to model the variability in sequential data, together with methods for
training such models under the variational autoencoder (VAE) principle. We also
explore ways of using adversarial methods to train a variational RNN model. In
contrast to competing approaches, our approach has a theoretical optimum in model
training and provides better training stability. It also improves the posterior
approximation in the variational inference network through a separate adversarial
training step. Numerical results on TIMIT speech data show that the reconstruction
loss and the evidence lower bound converge to the same level and the adversarial
training loss converges to 0.
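For context, the evidence lower bound referred to above can be written in the standard variational-RNN form; this is the common factorization from the VRNN literature, assumed here since the abstract does not spell out the paper's exact notation:

```latex
\mathcal{L}(\theta, \phi; x_{1:T}) =
  \mathbb{E}_{q_\phi(z_{1:T} \mid x_{1:T})}
  \sum_{t=1}^{T} \Big[ \log p_\theta(x_t \mid z_{\le t}, x_{<t})
    - \mathrm{KL}\big( q_\phi(z_t \mid x_{\le t}, z_{<t}) \,\|\, p_\theta(z_t \mid x_{<t}, z_{<t}) \big) \Big]
```

The "separate adversarial training step" for the posterior approximation can be pictured with a minimal sketch, assuming an adversarial-autoencoder-style setup in which a discriminator tries to tell posterior samples from prior samples and the inference network is then updated to fool it. The module names, architecture, and objective below are illustrative stand-ins, not the paper's exact method.

```python
import torch
import torch.nn as nn

LATENT_DIM, HIDDEN_DIM, INPUT_DIM = 16, 64, 32  # illustrative sizes

class Discriminator(nn.Module):
    """Scores a latent sample; a high logit means 'looks like a prior sample'."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, HIDDEN_DIM), nn.ReLU(),
            nn.Linear(HIDDEN_DIM, 1))

    def forward(self, z):
        return self.net(z)

def adversarial_posterior_step(encoder, discriminator, x, opt_d, opt_e):
    """One separate adversarial update on a batch x (batch x features)."""
    bce = nn.BCEWithLogitsLoss()
    real = torch.ones(x.size(0), 1)
    fake = torch.zeros(x.size(0), 1)

    # 1) Train the discriminator to separate prior samples from posterior samples.
    z_q = encoder(x).detach()            # samples from the approximate posterior q(z|x)
    z_p = torch.randn_like(z_q)          # samples from a standard normal prior p(z)
    d_loss = bce(discriminator(z_p), real) + bce(discriminator(z_q), fake)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Train the inference network so its samples fool the discriminator,
    #    regularizing q(z|x) toward the prior.
    z_q = encoder(x)
    e_loss = bce(discriminator(z_q), real)
    opt_e.zero_grad(); e_loss.backward(); opt_e.step()
    return d_loss.item(), e_loss.item()

# Toy usage: a linear layer stands in for the variational inference (encoder) RNN.
encoder = nn.Linear(INPUT_DIM, LATENT_DIM)
disc = Discriminator()
opt_e = torch.optim.Adam(encoder.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
x = torch.randn(8, INPUT_DIM)
print(adversarial_posterior_step(encoder, disc, x, opt_d, opt_e))
```

In a full training loop this step would be interleaved with the usual ELBO (reconstruction plus KL) update of the variational RNN; the adversarial loss settling at its equilibrium corresponds to the discriminator no longer being able to distinguish posterior samples from prior samples.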
Related papers
- Transferable Post-training via Inverse Value Learning [83.75002867411263]
We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network)
After training this network on a small base model using demonstrations, this network can be seamlessly integrated with other pre-trained models during inference.
We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
arXiv Detail & Related papers (2024-10-28T13:48:43Z)
- How Inverse Conditional Flows Can Serve as a Substitute for Distributional Regression [2.9873759776815527]
We propose a framework for distributional regression using inverse flow transformations (DRIFT)
DRIFT covers both interpretable statistical models and flexible neural networks opening up new avenues in both statistical modeling and deep learning.
arXiv Detail & Related papers (2024-05-08T21:19:18Z)
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
- Recurrent neural networks and transfer learning for elasto-plasticity in woven composites [0.0]
This article presents Recurrent Neural Network (RNN) models as a surrogate for computationally intensive meso-scale simulation of woven composites.
A mean-field model generates a comprehensive data set representing elasto-plastic behavior.
In the simulations, arbitrary six-dimensional strain histories are used to predict stresses, with random-walk loading as the source task and cyclic loading as the target task.
arXiv Detail & Related papers (2023-11-22T14:47:54Z)
- On Feynman--Kac training of partial Bayesian neural networks [1.6474447977095783]
Partial Bayesian neural networks (pBNNs) were shown to perform competitively with full Bayesian neural networks.
We propose an efficient sampling-based training strategy, wherein the training of a pBNN is formulated as simulating a Feynman--Kac model.
We show that our proposed training scheme outperforms the state of the art in terms of predictive performance.
arXiv Detail & Related papers (2023-10-30T15:03:15Z)
- Diffusion-Model-Assisted Supervised Learning of Generative Models for Density Estimation [10.793646707711442]
We present a framework for training generative models for density estimation.
We use the score-based diffusion model to generate labeled data.
Once the labeled data are generated, we can train a simple fully connected neural network to learn the generative model in the supervised manner.
arXiv Detail & Related papers (2023-10-22T23:56:19Z)
- Latent State Models of Training Dynamics [51.88132043461152]
We train models with different random seeds and compute a variety of metrics throughout training.
We then fit a hidden Markov model (HMM) over the resulting sequences of metrics.
We use the HMM representation to study phase transitions and identify latent "detour" states that slow down convergence.
arXiv Detail & Related papers (2023-08-18T13:20:08Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
- Markovian RNN: An Adaptive Time Series Prediction Network with HMM-based Switching for Nonstationary Environments [11.716677452529114]
We introduce a novel recurrent neural network (RNN) architecture, which adaptively switches between internal regimes in a Markovian way to model the nonstationary nature of the given data.
Our model, Markovian RNN employs a hidden Markov model (HMM) for regime transitions, where each regime controls hidden state transitions of the recurrent cell independently.
We demonstrate significant performance gains compared to vanilla RNNs and conventional methods such as Markov-switching ARIMA.
arXiv Detail & Related papers (2020-06-17T19:38:29Z)
- Network Diffusions via Neural Mean-Field Dynamics [52.091487866968286]
We propose a novel learning framework for inference and estimation problems of diffusion on networks.
Our framework is derived from the Mori-Zwanzig formalism to obtain an exact evolution of the node infection probabilities.
Our approach is versatile and robust to variations of the underlying diffusion network models.
arXiv Detail & Related papers (2020-06-16T18:45:20Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)