Semi-supervised Sequential Generative Models
- URL: http://arxiv.org/abs/2007.00155v1
- Date: Tue, 30 Jun 2020 23:53:12 GMT
- Title: Semi-supervised Sequential Generative Models
- Authors: Michael Teng, Tuan Anh Le, Adam Scibior, Frank Wood
- Abstract summary: We introduce a novel objective for training deep generative time-series models with discrete latent variables for which supervision is only sparsely available.
We first overcome the high-variance gradient estimators that arise from the exponential number of discrete latent configurations by extending the standard semi-supervised generative modeling objective with reweighted wake-sleep.
Finally, we introduce a unified objective inspired by teacher-forcing and show that this approach is robust to variable length supervision.
- Score: 16.23492955875404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel objective for training deep generative time-series
models with discrete latent variables for which supervision is only sparsely
available. This instance of semi-supervised learning is challenging for
existing methods, because the exponential number of possible discrete latent
configurations results in high variance gradient estimators. We first overcome
this problem by extending the standard semi-supervised generative modeling
objective with reweighted wake-sleep. However, we find that this approach still
suffers when the frequency of available labels varies between training
sequences. Finally, we introduce a unified objective inspired by
teacher-forcing and show that this approach is robust to variable length
supervision. We call the resulting method caffeinated wake-sleep (CWS) to
emphasize its additional dependence on real data. We demonstrate its
effectiveness with experiments on MNIST, handwriting, and fruit fly trajectory
data.
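For concreteness, here is a minimal sketch of a semi-supervised reweighted wake-sleep step on a toy model with one discrete latent per observation. It is not the authors' CWS objective: the recurrent/sequential structure is dropped for brevity, and the Gaussian likelihood, uniform prior, and linear inference network are illustrative assumptions. Supervised positions clamp the latent to its label (in the spirit of teacher-forcing); unsupervised positions use K self-normalized importance-weighted samples.

    import math
    import torch
    import torch.nn.functional as F

    C, D, K = 5, 3, 10                                  # latent classes, obs dim, particles
    mu = torch.randn(C, D, requires_grad=True)          # generative model: p(x|z) = N(mu_z, I)
    W = (0.1 * torch.randn(C, D)).requires_grad_()      # inference net: q(z|x) = softmax(W x)

    def log_p(x, z):                                    # log p(x, z), uniform prior over z
        return -0.5 * ((x - mu[z]) ** 2).sum(-1) - math.log(C)

    def log_q(x, z):                                    # log q(z|x)
        return F.log_softmax(x @ W.t(), dim=-1)[z]

    def semi_supervised_loss(xs, labels):
        loss = 0.0
        for x, y in zip(xs, labels):
            if y is not None:                           # supervised: clamp z to its label
                z = torch.tensor(y)
                loss = loss - log_p(x, z) - log_q(x, z)
            else:                                       # unsupervised: reweighted wake-sleep
                z = torch.multinomial(F.softmax(x @ W.t(), -1).detach(), K, replacement=True)
                lp, lq = log_p(x, z), log_q(x, z)
                w = F.softmax((lp - lq).detach(), dim=0)  # self-normalized importance weights
                loss = loss - (w * lp).sum() - (w * lq).sum()
        return loss

    xs = torch.randn(6, D)
    labels = [2, None, None, 0, None, None]             # supervision only sparsely available
    semi_supervised_loss(xs, labels).backward()

Note how the supervised branch trains the inference network directly on labels, so gradients are low-variance wherever supervision is available.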
Related papers
- Temporal Test-Time Adaptation with State-Space Models [4.248760709042802]
Distribution shift at deployment time degrades predictive performance; adapting a model on test samples can help mitigate this drop.
Most test-time adaptation methods have focused on synthetic corruption shifts.
We propose STAD, a probabilistic state-space model that adapts a deployed model to temporal distribution shifts.
arXiv Detail & Related papers (2024-07-17T11:18:49Z)
- MG-TSD: Multi-Granularity Time Series Diffusion Models with Guided Learning Process [26.661721555671626]
We introduce a novel Multi-Granularity Time Series (MG-TSD) model, which achieves state-of-the-art predictive performance.
Our approach does not rely on additional external data, making it versatile and applicable across various domains.
arXiv Detail & Related papers (2024-03-09T01:15:03Z)
- Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Unlike models defined by an instantaneous input-output relationship, diffusion models operate over a sequence of timesteps.
We present Diffusion-TracIn, which incorporates these temporal dynamics, and observe that samples' loss gradient norms are highly dependent on timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
arXiv Detail & Related papers (2024-01-17T07:58:18Z)
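For intuition, a minimal sketch of TracIn-style influence on a toy linear regression, with a norm-normalized variant gesturing at the re-normalization idea. This is not the papers' diffusion-specific implementation, and true TracIn also sums these dot products over several training checkpoints.

    import torch

    torch.manual_seed(0)
    w = torch.randn(4, requires_grad=True)              # toy linear model

    def grad_of(x, y):                                  # gradient of squared loss w.r.t. w
        g, = torch.autograd.grad((x @ w - y) ** 2, w)
        return g

    x_tr, y_tr = torch.randn(8, 4), torch.randn(8)
    x_te, y_te = torch.randn(4), torch.randn(())
    g_te = grad_of(x_te, y_te)

    for i in range(len(x_tr)):
        g_i = grad_of(x_tr[i], y_tr[i])
        raw = torch.dot(g_i, g_te)                      # TracIn-style influence (one checkpoint)
        renorm = raw / (g_i.norm() + 1e-8)              # counteracts gradient-norm bias
        print(i, raw.item(), renorm.item())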
arXiv Detail & Related papers (2024-01-17T07:58:18Z) - Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of stepwise autoregressive training and global distribution matching: motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z) - Tailoring Language Generation Models under Total Variation Distance [55.89964205594829]
The standard paradigm of neural language generation adopts maximum likelihood estimation (MLE) as the optimizing method.
Total variation distance (TVD) offers a more robust alternative criterion, and we develop practical bounds to apply it to language generation.
We introduce the TaiLr objective, which balances the tradeoff in estimating TVD.
arXiv Detail & Related papers (2023-02-26T16:32:52Z)
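As a reference point, the total variation distance between two categorical distributions is half the L1 distance between their probability vectors. A minimal sketch follows; note that TaiLr derives a practical training objective rather than computing TVD directly, which would require the full data distribution.

    import torch

    def tvd(p, q):                                      # p, q: probability vectors summing to 1
        return 0.5 * (p - q).abs().sum(-1)

    p = torch.tensor([0.7, 0.2, 0.1])                   # e.g. model distribution over tokens
    q = torch.tensor([0.5, 0.3, 0.2])                   # e.g. empirical data distribution
    print(tvd(p, q))                                    # tensor(0.2000)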
arXiv Detail & Related papers (2023-02-26T16:32:52Z) - Semi-Supervised Temporal Action Detection with Proposal-Free Masking [134.26292288193298]
We propose a novel Semi-supervised Temporal action detection model based on PropOsal-free Temporal mask (SPOT).
SPOT outperforms state-of-the-art alternatives, often by a large margin.
arXiv Detail & Related papers (2022-07-14T16:58:47Z) - Distilling Model Failures as Directions in Latent Space [87.30726685335098]
We present a scalable method for automatically distilling a model's failure modes.
We harness linear classifiers to identify consistent error patterns, and induce a natural representation of these failure modes as directions within the feature space.
We demonstrate that this framework allows us to discover and automatically caption challenging subpopulations within the training dataset, and intervene to improve the model's performance on these subpopulations.
arXiv Detail & Related papers (2022-06-29T16:35:24Z)
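A hedged sketch of the core idea in this entry, on synthetic data: fit a linear classifier on frozen feature embeddings to predict where the model errs, then treat its weight vector as a "failure direction" along which to rank examples. The data and the error rule here are invented for illustration.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    feats = rng.normal(size=(500, 16))                  # frozen embeddings of training examples
    errs = (feats[:, 3] + 0.1 * rng.normal(size=500) > 0).astype(int)  # 1 = model erred (toy rule)

    clf = LogisticRegression().fit(feats, errs)
    direction = clf.coef_[0]                            # failure mode as a direction in feature space
    scores = feats @ direction                          # rank examples along the failure direction
    print(np.argsort(scores)[-5:])                      # candidate hard subpopulation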
arXiv Detail & Related papers (2022-06-29T16:35:24Z) - Training Discrete Deep Generative Models via Gapped Straight-Through
Estimator [72.71398034617607]
We propose a Gapped Straight-Through (GST) estimator to reduce the variance without incurring resampling overhead.
This estimator is inspired by the essential properties of Straight-Through Gumbel-Softmax.
Experiments demonstrate that the proposed GST estimator enjoys better performance compared to strong baselines on two discrete deep generative modeling tasks.
arXiv Detail & Related papers (2022-06-15T01:46:05Z)
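For context, a minimal sketch of the standard Straight-Through Gumbel-Softmax estimator that GST builds on (this is the usual ST trick, not the gapped variant itself):

    import torch
    import torch.nn.functional as F

    def st_gumbel_softmax(logits, tau=1.0):
        # sample Gumbel(0, 1) noise and form the relaxed (soft) sample
        u = torch.rand_like(logits).clamp(1e-9, 1.0 - 1e-9)
        soft = F.softmax((logits - torch.log(-torch.log(u))) / tau, dim=-1)
        # forward pass: hard one-hot; backward pass: gradients flow through soft
        hard = F.one_hot(soft.argmax(dim=-1), logits.shape[-1]).to(soft.dtype)
        return hard + soft - soft.detach()

    logits = torch.randn(2, 4, requires_grad=True)
    sample = st_gumbel_softmax(logits)                  # one-hot in the forward pass
    sample.sum().backward()                             # gradients reach logits via soft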
- Robust Disentanglement of a Few Factors at a Time [5.156484100374058]
We introduce population-based training (PBT) for improving consistency in training variational autoencoders (VAEs).
We then use Unsupervised Disentanglement Ranking (UDR) as an unsupervised heuristic to score models in our PBT-VAE training, and show how models trained this way tend to consistently disentangle only a subset of the generative factors.
We show striking improvement in state-of-the-art unsupervised disentanglement performance and robustness across multiple datasets and metrics.
arXiv Detail & Related papers (2020-10-26T12:34:23Z)
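For intuition, a minimal generic exploit-and-explore PBT loop; this is not the paper's PBT-VAE pipeline, and the toy scoring rule stands in for partially training a VAE and scoring it with UDR.

    import random

    random.seed(0)
    population = [{"beta": random.uniform(0.5, 8.0)} for _ in range(8)]

    for generation in range(5):
        for m in population:
            # stand-in for: partially train a VAE with this beta, score it with UDR
            m["score"] = -abs(m["beta"] - 4.0) + random.gauss(0.0, 0.1)
        population.sort(key=lambda m: m["score"], reverse=True)
        for loser, winner in zip(population[-2:], population[:2]):
            loser["beta"] = winner["beta"] * random.choice([0.8, 1.25])  # exploit, then explore
        print(generation, round(population[0]["beta"], 2))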
This list is automatically generated from the titles and abstracts of the papers in this site.