Neural SDEs for Conditional Time Series Generation and the
Signature-Wasserstein-1 metric
- URL: http://arxiv.org/abs/2301.01315v1
- Date: Tue, 3 Jan 2023 19:08:01 GMT
- Title: Neural SDEs for Conditional Time Series Generation and the
Signature-Wasserstein-1 metric
- Authors: Pere Díaz Lozano, Toni Lozano Bagén, Josep Vives
- Abstract summary: (Conditional) Generative Adversarial Networks (GANs) have found great success in recent years, due to their ability to approximate (conditional) distributions over extremely high dimensional spaces.
They are highly unstable and computationally expensive to train, especially in the time series setting.
Recently, the use of a key object in rough path theory, the signature of a path, has been proposed; it converts the min-max formulation of the (conditional) GAN framework into a classical minimization problem.
This method is extremely expensive in terms of memory cost, sometimes becoming prohibitive.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: (Conditional) Generative Adversarial Networks (GANs) have found great success
in recent years, due to their ability to approximate (conditional)
distributions over extremely high dimensional spaces. However, they are highly
unstable and computationally expensive to train, especially in the time series
setting. Recently, the use of a key object in rough path theory, the signature
of a path, has been proposed; it converts the min-max formulation of the
(conditional) GAN framework into a classical minimization problem. However,
this method is extremely expensive in terms of memory, sometimes prohibitively
so. To overcome this, we propose
the use of \textit{Conditional Neural Stochastic Differential Equations}, which
have a constant memory cost as a function of depth, being more memory efficient
than traditional deep learning architectures. We empirically show that the
proposed model is more efficient than other classical approaches, both in terms
of memory cost and computational time, and that it usually outperforms them in
terms of performance.
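The two ingredients the abstract combines, a conditional neural SDE generator and a signature-based Wasserstein-1 loss, can be sketched in a few lines of PyTorch. The class and function names, the Euler-Maruyama loop, and the depth-2 signature truncation below are illustrative assumptions, not the authors' implementation (which works with the full Signature-Wasserstein-1 metric and dedicated SDE solvers):

```python
# Sketch: conditional neural SDE generator + truncated-signature W1-style loss.
# Names and the depth-2 truncation are illustrative, not the paper's code.
import torch
import torch.nn as nn


class ConditionalNeuralSDE(nn.Module):
    """dX_t = mu(t, X_t, c) dt + sigma(t, X_t, c) dW_t, conditioned on context c."""

    def __init__(self, state_dim=1, cond_dim=4, hidden=32):
        super().__init__()
        in_dim = 1 + state_dim + cond_dim                  # (t, X_t, c)
        self.drift = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh(),
                                   nn.Linear(hidden, state_dim))
        self.diffusion = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh(),
                                       nn.Linear(hidden, state_dim), nn.Softplus())

    def forward(self, x0, cond, n_steps=64, dt=1.0 / 64):
        # Naive Euler-Maruyama simulation of the conditional SDE.
        x, path = x0, [x0]
        for k in range(n_steps):
            t = torch.full_like(x[:, :1], k * dt)
            inp = torch.cat([t, x, cond], dim=-1)
            dw = torch.randn_like(x) * dt ** 0.5
            x = x + self.drift(inp) * dt + self.diffusion(inp) * dw
            path.append(x)
        return torch.stack(path, dim=1)                     # (batch, n_steps + 1, dim)


def truncated_signature(path):
    """Depth-2 signature of a path: increments plus iterated integrals."""
    dx = path[:, 1:] - path[:, :-1]                         # (batch, T, d)
    level1 = dx.sum(dim=1)
    cum = torch.cumsum(dx, dim=1) - dx                      # increments seen so far
    level2 = torch.einsum("bti,btj->bij", cum, dx).flatten(1)
    return torch.cat([level1, level2], dim=-1)


def sig_w1_loss(real_paths, fake_paths):
    """Distance between expected truncated signatures (a Sig-W1-style surrogate)."""
    return (truncated_signature(real_paths).mean(0)
            - truncated_signature(fake_paths).mean(0)).norm(p=2)


# Usage: generate paths conditionally on a context vector and compare to data.
model = ConditionalNeuralSDE()
cond = torch.randn(128, 4)
fake = model(torch.zeros(128, 1), cond)
real = torch.cumsum(0.1 * torch.randn(128, 65, 1), dim=1)   # placeholder data
loss = sig_w1_loss(real, fake)
loss.backward()
```

The constant-memory property mentioned in the abstract comes from solving the SDE with adjoint or reversible solvers (e.g. torchsde), so that the cost does not grow with the number of solver steps; the naive loop above stores every step and is only meant to show the model structure.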
Related papers
- Cost-Efficient Continual Learning with Sufficient Exemplar Memory [55.77835198580209]
Continual learning (CL) research typically assumes highly constrained exemplar memory resources.
In this work, we investigate CL in a novel setting where exemplar memory is ample.
Our method achieves state-of-the-art performance while reducing the computational cost to a quarter or third of existing methods.
arXiv Detail & Related papers (2025-02-11T05:40:52Z)
- FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training [51.39495282347475]
We introduce FRUGAL (Full-Rank Updates with Gradient spLitting), a new memory-efficient optimization framework.
Our framework can be integrated with various low-rank update selection techniques, including GaLore and BAdam.
arXiv Detail & Related papers (2024-11-12T14:41:07Z)
- Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs [61.40047491337793]
We present Hierarchical cOntext MERging (HOMER), a new training-free scheme designed to overcome the context-length limitations of large language models.
HOMER uses a divide-and-conquer algorithm, dividing long inputs into manageable chunks.
A token reduction technique precedes each merging, ensuring memory usage efficiency.
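As a rough illustration of this divide-and-conquer pattern (not HOMER's actual procedure, which merges transformer hidden states inside the attention stack), the sketch below splits a long sequence of token embeddings into chunks and prunes tokens before each pairwise merge; the norm-based scoring rule and the function names are assumptions:

```python
# Generic hierarchical merging of a long token sequence with a token-reduction
# step before each merge.  Illustrative only, not HOMER's algorithm.
import torch


def reduce_tokens(chunk, keep_ratio=0.5):
    """Keep the highest-norm token embeddings (an assumed scoring rule)."""
    k = max(1, int(chunk.shape[0] * keep_ratio))
    idx = chunk.norm(dim=-1).topk(k).indices.sort().values   # keep original order
    return chunk[idx]


def hierarchical_merge(embeddings, chunk_size=128):
    """Divide into chunks, then repeatedly reduce and concatenate pairs."""
    chunks = [embeddings[i:i + chunk_size]
              for i in range(0, len(embeddings), chunk_size)]
    while len(chunks) > 1:
        merged = []
        for i in range(0, len(chunks), 2):
            pair = [reduce_tokens(c) for c in chunks[i:i + 2]]  # reduce before merging
            merged.append(torch.cat(pair, dim=0))
        chunks = merged
    return chunks[0]


long_input = torch.randn(1024, 64)          # (tokens, embedding dim)
compressed = hierarchical_merge(long_input)
print(compressed.shape)                      # far fewer tokens than 1024
```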
arXiv Detail & Related papers (2024-04-16T06:34:08Z)
- A multiobjective continuation method to compute the regularization path of deep neural networks [1.3654846342364308]
Sparsity is a highly desirable feature in deep neural networks (DNNs), since it ensures numerical efficiency and improves the interpretability and robustness of models.
We present an algorithm that approximates the entire sparsity front for the above-mentioned objectives in a very efficient manner, even for high-dimensional networks with millions of parameters.
We demonstrate that knowledge of the regularization path allows for a well-generalizing network parametrization.
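What a regularization path is can be pictured with a warm-started sweep over the L1 penalty weight, which traces a loss-versus-sparsity curve; this is only a naive stand-in for the multiobjective continuation method of the paper, and all constants below are arbitrary:

```python
# Warm-started sweep over the L1 weight, tracing a (loss, sparsity) trade-off.
# Illustrates a regularization path; not the paper's continuation method.
import torch

torch.manual_seed(0)
X, w_true = torch.randn(200, 50), torch.zeros(50)
w_true[:5] = 1.0                                       # sparse ground truth
y = X @ w_true + 0.01 * torch.randn(200)

w = torch.zeros(50, requires_grad=True)
path = []
for lam in torch.logspace(1, -3, steps=20):            # strong -> weak penalty
    opt = torch.optim.Adam([w], lr=0.05)
    for _ in range(300):                               # warm start from previous lam
        opt.zero_grad()
        loss = ((X @ w - y) ** 2).mean() + lam * w.abs().sum()
        loss.backward()
        opt.step()
    sparsity = (w.abs() < 1e-3).float().mean().item()
    path.append((lam.item(), ((X @ w - y) ** 2).mean().item(), sparsity))

for lam, mse, sp in path:
    print(f"lambda={lam:.4f}  mse={mse:.4f}  zero fraction={sp:.2f}")
```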
arXiv Detail & Related papers (2023-08-23T10:08:52Z)
- Deep Equilibrium Optical Flow Estimation [80.80992684796566]
Recent state-of-the-art (SOTA) optical flow models use finite-step recurrent update operations to emulate traditional algorithms.
These RNNs impose large computation and memory overheads, and are not directly trained to model such stable estimation.
We propose deep equilibrium (DEQ) flow estimators, an approach that directly solves for the flow as the infinite-level fixed point of an implicit layer.
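A minimal sketch of the implicit-layer idea behind DEQ models: the output is the fixed point of a single update, found by iteration without storing intermediate activations, with one differentiable step attached for the backward pass (a crude one-step approximation of the exact implicit gradient used in the paper):

```python
# Deep-equilibrium-style layer: output is the fixed point of one update rule.
import torch
import torch.nn as nn


class DEQLayer(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())

    def update(self, z, x):
        return self.f(torch.cat([z, x], dim=-1))

    def forward(self, x, max_iter=50, tol=1e-4):
        z = torch.zeros_like(x)
        with torch.no_grad():                     # solver stores no activations
            for _ in range(max_iter):
                z_next = self.update(z, x)
                if (z_next - z).norm() < tol:
                    z = z_next
                    break
                z = z_next
        return self.update(z, x)                   # one differentiable step at z*


layer = DEQLayer()
x = torch.randn(8, 64)
out = layer(x)
out.sum().backward()                               # gradients flow through the last step
```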
arXiv Detail & Related papers (2022-04-18T17:53:44Z)
- DCT-Former: Efficient Self-Attention with Discrete Cosine Transform [4.622165486890318]
An intrinsic limitation of Transformer architectures arises from the computation of the dot-product attention, whose cost grows quadratically with sequence length.
Our idea takes inspiration from the world of lossy data compression (such as the JPEG algorithm) to derive an approximation of the attention module.
An extensive section of experiments shows that our method takes up less memory for the same performance, while also drastically reducing inference time.
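The compression idea can be illustrated by transforming the keys and values with a truncated DCT along the sequence axis before standard attention, so the attention matrix scales with the number of retained coefficients rather than the sequence length; the DCT-II construction and the retained-coefficient count below are illustrative choices, not the DCT-Former module itself:

```python
# Compress the key/value sequence with a truncated DCT before attention.
import math
import torch


def dct_matrix(n_kept, seq_len):
    """First `n_kept` rows of a DCT-II transform matrix."""
    n = torch.arange(seq_len).float()
    k = torch.arange(n_kept).float().unsqueeze(1)
    return torch.cos(math.pi / seq_len * (n + 0.5) * k) * math.sqrt(2.0 / seq_len)


def compressed_attention(q, k, v, n_kept=32):
    """Scaled dot-product attention over DCT-compressed keys/values."""
    seq_len, dim = k.shape[-2], q.shape[-1]
    C = dct_matrix(n_kept, seq_len)                         # (n_kept, seq_len)
    k_c, v_c = C @ k, C @ v                                 # shorter "sequence"
    attn = torch.softmax(q @ k_c.transpose(-1, -2) / dim ** 0.5, dim=-1)
    return attn @ v_c                                       # O(seq_len * n_kept)


q, k, v = (torch.randn(512, 64) for _ in range(3))
out = compressed_attention(q, k, v)        # memory scales with n_kept, not 512
print(out.shape)
```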
arXiv Detail & Related papers (2022-03-02T15:25:27Z)
- PDE-Based Optimal Strategy for Unconstrained Online Learning [40.61498562988079]
We present a framework that generates time-varying potential functions by solving a Partial Differential Equation (PDE).
Our framework recovers some classical potentials, and more importantly provides a systematic approach to design new ones.
This is the first parameter-free algorithm with optimal leading constant.
arXiv Detail & Related papers (2022-01-19T22:21:21Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
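A generic way to picture randomization inside a dynamic program is to estimate each marginalization sum from a sampled subset of states, as in the forward-algorithm sketch below; the uniform subsampling and reweighting are an illustrative assumption, not the specific RDP estimators proposed in the paper:

```python
# Forward-algorithm-style DP where each sum over previous latent states is
# estimated from a random subset of states.  Illustrative assumption only.
import torch

n_states, n_steps, n_sampled = 2_000, 20, 128
trans = torch.softmax(torch.randn(n_states, n_states) * 0.01, dim=-1)  # p(z_t | z_{t-1})
emit = torch.rand(n_steps, n_states)                                   # p(x_t | z_t), made up

alpha = emit[0] / n_states                       # uniform prior over states
for t in range(1, n_steps):
    idx = torch.randint(0, n_states, (n_sampled,))        # sample previous states
    # Unbiased estimate of sum_j alpha[j] * trans[j, :] from the sample.
    alpha = (n_states / n_sampled) * alpha[idx] @ trans[idx, :]
    alpha = alpha * emit[t]

likelihood_estimate = alpha.sum()
print(likelihood_estimate)
```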
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Stabilizing Equilibrium Models by Jacobian Regularization [151.78151873928027]
Deep equilibrium networks (DEQs) are a new class of models that eschews traditional depth in favor of finding the fixed point of a single nonlinear layer.
We propose a regularization scheme for DEQ models that explicitly regularizes the Jacobian of the fixed-point update equations to stabilize the learning of equilibrium models.
We show that this regularization adds only minimal computational cost, significantly stabilizes the fixed-point convergence in both forward and backward passes, and scales well to high-dimensional, realistic domains.
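The regularizer can be pictured with a standard Hutchinson-style stochastic estimate of the squared Frobenius norm of the Jacobian of the update, added to the training loss; the single probe vector, the stand-in fixed point, and the penalty weight below are illustrative choices:

```python
# Hutchinson-style estimate of ||d f(z) / d z||_F^2 used as a training penalty.
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 64))

def jacobian_penalty(z):
    """E_eps ||J^T eps||^2 with eps ~ N(0, I) equals ||J||_F^2 in expectation."""
    z = z.requires_grad_(True)
    out = f(z)
    eps = torch.randn_like(out)
    (jtv,) = torch.autograd.grad(out, z, grad_outputs=eps, create_graph=True)
    return (jtv ** 2).sum(dim=-1).mean()

x = torch.randn(16, 64)
z_star = x.clone()                     # stand-in for a solved fixed point
task_loss = f(z_star).pow(2).mean()    # stand-in for the actual training loss
loss = task_loss + 1.0 * jacobian_penalty(z_star)
loss.backward()
```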
arXiv Detail & Related papers (2021-06-28T00:14:11Z)
- Optimal Stopping via Randomized Neural Networks [6.677219861416146]
This paper presents the benefits of using randomized neural networks instead of standard basis functions or deep neural networks.
Our approaches are applicable to high dimensional problems where the existing approaches become increasingly impractical.
In all cases, our algorithms outperform the state-of-the-art and other relevant machine learning approaches in terms of time.
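The core idea of a randomized neural network, a random untrained hidden layer with only the linear readout fitted by least squares, can be sketched in a Longstaff-Schwartz-style backward induction for a Bermudan put; the GBM dynamics, payoff, and feature map below are illustrative assumptions, not the paper's exact algorithms:

```python
# Longstaff-Schwartz-style Bermudan put pricing with random hidden features:
# only the linear readout is fitted (by least squares) at each exercise date.
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps, s0, K, r, sigma, T = 20_000, 50, 100.0, 100.0, 0.05, 0.2, 1.0
dt = T / n_steps

# Simulate geometric Brownian motion paths.
z = rng.standard_normal((n_paths, n_steps))
S = s0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z, axis=1))
S = np.concatenate([np.full((n_paths, 1), s0), S], axis=1)

payoff = lambda s: np.maximum(K - s, 0.0)

# Random hidden layer: weights drawn once and never trained.
W = rng.standard_normal((1, 64))
b = rng.standard_normal(64)
features = lambda s: np.tanh(s[:, None] / s0 @ W + b)      # (paths, 64)

cashflow = payoff(S[:, -1])
for t in range(n_steps - 1, 0, -1):
    cashflow *= np.exp(-r * dt)                             # discount one step
    itm = payoff(S[:, t]) > 0                               # regress on in-the-money paths
    X = features(S[itm, t])
    beta, *_ = np.linalg.lstsq(X, cashflow[itm], rcond=None)
    continuation = X @ beta
    exercise = payoff(S[itm, t]) > continuation
    cashflow[np.where(itm)[0][exercise]] = payoff(S[itm, t])[exercise]

price = np.exp(-r * dt) * cashflow.mean()
print(f"Bermudan put price estimate: {price:.3f}")
```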
arXiv Detail & Related papers (2021-04-28T09:47:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.