Neural SDEs for Conditional Time Series Generation and the
Signature-Wasserstein-1 metric
- URL: http://arxiv.org/abs/2301.01315v1
- Date: Tue, 3 Jan 2023 19:08:01 GMT
- Title: Neural SDEs for Conditional Time Series Generation and the
Signature-Wasserstein-1 metric
- Authors: Pere Díaz Lozano, Toni Lozano Bagén, Josep Vives
- Abstract summary: (Conditional) Generative Adversarial Networks (GANs) have found great success in recent years, due to their ability to approximate (conditional) distributions over extremely high dimensional spaces.
They are highly unstable and computationally expensive to train, especially in the time series setting.
Recently, the use of a key object in rough path theory, the signature of a path, has been proposed; it converts the min-max formulation of the (conditional) GAN framework into a classical minimization problem.
This method is extremely expensive in terms of memory cost, sometimes becoming prohibitive.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: (Conditional) Generative Adversarial Networks (GANs) have found great success
in recent years, due to their ability to approximate (conditional)
distributions over extremely high dimensional spaces. However, they are highly
unstable and computationally expensive to train, especially in the time series
setting. Recently, the use of a key object in rough path theory, the signature
of a path, has been proposed; it converts the min-max formulation of the
(conditional) GAN framework into a classical minimization problem. However,
this method is extremely expensive in terms of memory, sometimes prohibitively
so. To overcome this, we propose
the use of \textit{Conditional Neural Stochastic Differential Equations}, which
have a constant memory cost as a function of depth, being more memory efficient
than traditional deep learning architectures. We empirically show that the
proposed model is more efficient than other classical approaches, both in terms
of memory cost and computational time, and that it usually outperforms them in
terms of performance.
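The two ingredients the abstract combines, a conditional neural SDE generator and a signature-based Wasserstein-1 loss, can be sketched in a few lines of PyTorch. The class and function names, the Euler-Maruyama loop, and the depth-2 signature truncation below are illustrative assumptions, not the authors' implementation (which works with the full Signature-Wasserstein-1 metric and dedicated SDE solvers):

```python
# Sketch: conditional neural SDE generator + truncated-signature W1-style loss.
# Names and the depth-2 truncation are illustrative, not the paper's code.
import torch
import torch.nn as nn


class ConditionalNeuralSDE(nn.Module):
    """dX_t = mu(t, X_t, c) dt + sigma(t, X_t, c) dW_t, conditioned on context c."""

    def __init__(self, state_dim=1, cond_dim=4, hidden=32):
        super().__init__()
        in_dim = 1 + state_dim + cond_dim                  # (t, X_t, c)
        self.drift = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh(),
                                   nn.Linear(hidden, state_dim))
        self.diffusion = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh(),
                                       nn.Linear(hidden, state_dim), nn.Softplus())

    def forward(self, x0, cond, n_steps=64, dt=1.0 / 64):
        # Naive Euler-Maruyama simulation of the conditional SDE.
        x, path = x0, [x0]
        for k in range(n_steps):
            t = torch.full_like(x[:, :1], k * dt)
            inp = torch.cat([t, x, cond], dim=-1)
            dw = torch.randn_like(x) * dt ** 0.5
            x = x + self.drift(inp) * dt + self.diffusion(inp) * dw
            path.append(x)
        return torch.stack(path, dim=1)                     # (batch, n_steps + 1, dim)


def truncated_signature(path):
    """Depth-2 signature of a path: increments plus iterated integrals."""
    dx = path[:, 1:] - path[:, :-1]                         # (batch, T, d)
    level1 = dx.sum(dim=1)
    cum = torch.cumsum(dx, dim=1) - dx                      # increments seen so far
    level2 = torch.einsum("bti,btj->bij", cum, dx).flatten(1)
    return torch.cat([level1, level2], dim=-1)


def sig_w1_loss(real_paths, fake_paths):
    """Distance between expected truncated signatures (a Sig-W1-style surrogate)."""
    return (truncated_signature(real_paths).mean(0)
            - truncated_signature(fake_paths).mean(0)).norm(p=2)


# Usage: generate paths conditionally on a context vector and compare to data.
model = ConditionalNeuralSDE()
cond = torch.randn(128, 4)
fake = model(torch.zeros(128, 1), cond)
real = torch.cumsum(0.1 * torch.randn(128, 65, 1), dim=1)   # placeholder data
loss = sig_w1_loss(real, fake)
loss.backward()
```

The constant-memory property mentioned in the abstract comes from solving the SDE with adjoint or reversible solvers (e.g. torchsde), so that the cost does not grow with the number of solver steps; the naive loop above stores every step and is only meant to show the model structure.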
Related papers
- Cost-Efficient Continual Learning with Sufficient Exemplar Memory [55.77835198580209]
Continual learning (CL) research typically assumes highly constrained exemplar memory resources.
In this work, we investigate CL in a novel setting where exemplar memory is ample.
Our method achieves state-of-the-art performance while reducing the computational cost to a quarter or third of existing methods.
arXiv Detail & Related papers (2025-02-11T05:40:52Z)
- FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training [51.39495282347475]
We introduce FRUGAL (Full-Rank Updates with Gradient spLitting), a new memory-efficient optimization framework.
Our framework can be integrated with various low-rank update selection techniques, including GaLore and BAdam.
arXiv Detail & Related papers (2024-11-12T14:41:07Z)
- Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs [61.40047491337793]
We present Hierarchical cOntext MERging (HOMER), a new training-free scheme designed to overcome the context-length limitations of large language models.
HOMER uses a divide-and-conquer algorithm, dividing long inputs into manageable chunks.
A token reduction technique precedes each merging, ensuring memory usage efficiency.
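As a rough illustration of this divide-and-conquer pattern (not HOMER's actual procedure, which merges transformer hidden states inside the attention stack), the sketch below splits a long sequence of token embeddings into chunks and prunes tokens before each pairwise merge; the norm-based scoring rule and the function names are assumptions:

```python
# Generic hierarchical merging of a long token sequence with a token-reduction
# step before each merge.  Illustrative only, not HOMER's algorithm.
import torch


def reduce_tokens(chunk, keep_ratio=0.5):
    """Keep the highest-norm token embeddings (an assumed scoring rule)."""
    k = max(1, int(chunk.shape[0] * keep_ratio))
    idx = chunk.norm(dim=-1).topk(k).indices.sort().values   # keep original order
    return chunk[idx]


def hierarchical_merge(embeddings, chunk_size=128):
    """Divide into chunks, then repeatedly reduce and concatenate pairs."""
    chunks = [embeddings[i:i + chunk_size]
              for i in range(0, len(embeddings), chunk_size)]
    while len(chunks) > 1:
        merged = []
        for i in range(0, len(chunks), 2):
            pair = [reduce_tokens(c) for c in chunks[i:i + 2]]  # reduce before merging
            merged.append(torch.cat(pair, dim=0))
        chunks = merged
    return chunks[0]


long_input = torch.randn(1024, 64)          # (tokens, embedding dim)
compressed = hierarchical_merge(long_input)
print(compressed.shape)                      # far fewer tokens than 1024
```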
arXiv Detail & Related papers (2024-04-16T06:34:08Z)
- A multiobjective continuation method to compute the regularization path of deep neural networks [1.3654846342364308]
Sparsity is a highly desirable feature in deep neural networks (DNNs), since it ensures numerical efficiency and improves the interpretability and robustness of models.
We present an algorithm that approximates the entire sparsity front for the above-mentioned objectives in a very efficient manner, even for high-dimensional networks with millions of parameters.
We demonstrate that knowledge of the regularization path allows for a well-generalizing network parametrization.
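What a regularization path is can be pictured with a warm-started sweep over the L1 penalty weight, which traces a loss-versus-sparsity curve; this is only a naive stand-in for the multiobjective continuation method of the paper, and all constants below are arbitrary:

```python
# Warm-started sweep over the L1 weight, tracing a (loss, sparsity) trade-off.
# Illustrates a regularization path; not the paper's continuation method.
import torch

torch.manual_seed(0)
X, w_true = torch.randn(200, 50), torch.zeros(50)
w_true[:5] = 1.0                                       # sparse ground truth
y = X @ w_true + 0.01 * torch.randn(200)

w = torch.zeros(50, requires_grad=True)
path = []
for lam in torch.logspace(1, -3, steps=20):            # strong -> weak penalty
    opt = torch.optim.Adam([w], lr=0.05)
    for _ in range(300):                               # warm start from previous lam
        opt.zero_grad()
        loss = ((X @ w - y) ** 2).mean() + lam * w.abs().sum()
        loss.backward()
        opt.step()
    sparsity = (w.abs() < 1e-3).float().mean().item()
    path.append((lam.item(), ((X @ w - y) ** 2).mean().item(), sparsity))

for lam, mse, sp in path:
    print(f"lambda={lam:.4f}  mse={mse:.4f}  zero fraction={sp:.2f}")
```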
arXiv Detail & Related papers (2023-08-23T10:08:52Z)
- Deep Equilibrium Optical Flow Estimation [80.80992684796566]
Recent state-of-the-art (SOTA) optical flow models use finite-step recurrent update operations to emulate traditional algorithms.
These RNNs impose large computation and memory overheads, and are not directly trained to model such stable estimation.
We propose deep equilibrium (DEQ) flow estimators, an approach that directly solves for the flow as the infinite-level fixed point of an implicit layer.
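A minimal sketch of the implicit-layer idea behind DEQ models: the output is the fixed point of a single update, found by iteration without storing intermediate activations, with one differentiable step attached for the backward pass (a crude one-step approximation of the exact implicit gradient used in the paper):

```python
# Deep-equilibrium-style layer: output is the fixed point of one update rule.
import torch
import torch.nn as nn


class DEQLayer(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())

    def update(self, z, x):
        return self.f(torch.cat([z, x], dim=-1))

    def forward(self, x, max_iter=50, tol=1e-4):
        z = torch.zeros_like(x)
        with torch.no_grad():                     # solver stores no activations
            for _ in range(max_iter):
                z_next = self.update(z, x)
                if (z_next - z).norm() < tol:
                    z = z_next
                    break
                z = z_next
        return self.update(z, x)                   # one differentiable step at z*


layer = DEQLayer()
x = torch.randn(8, 64)
out = layer(x)
out.sum().backward()                               # gradients flow through the last step
```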
arXiv Detail & Related papers (2022-04-18T17:53:44Z)
- DCT-Former: Efficient Self-Attention with Discrete Cosine Transform [4.622165486890318]
An intrinsic limitation of Transformer architectures arises from the computation of the dot-product attention, whose cost grows quadratically with sequence length.
Our idea takes inspiration from the world of lossy data compression (such as the JPEG algorithm) to derive an approximation of the attention module.
An extensive section of experiments shows that our method takes up less memory for the same performance, while also drastically reducing inference time.
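The compression idea can be illustrated by transforming the keys and values with a truncated DCT along the sequence axis before standard attention, so the attention matrix scales with the number of retained coefficients rather than the sequence length; the DCT-II construction and the retained-coefficient count below are illustrative choices, not the DCT-Former module itself:

```python
# Compress the key/value sequence with a truncated DCT before attention.
import math
import torch


def dct_matrix(n_kept, seq_len):
    """First `n_kept` rows of a DCT-II transform matrix."""
    n = torch.arange(seq_len).float()
    k = torch.arange(n_kept).float().unsqueeze(1)
    return torch.cos(math.pi / seq_len * (n + 0.5) * k) * math.sqrt(2.0 / seq_len)


def compressed_attention(q, k, v, n_kept=32):
    """Scaled dot-product attention over DCT-compressed keys/values."""
    seq_len, dim = k.shape[-2], q.shape[-1]
    C = dct_matrix(n_kept, seq_len)                         # (n_kept, seq_len)
    k_c, v_c = C @ k, C @ v                                 # shorter "sequence"
    attn = torch.softmax(q @ k_c.transpose(-1, -2) / dim ** 0.5, dim=-1)
    return attn @ v_c                                       # O(seq_len * n_kept)


q, k, v = (torch.randn(512, 64) for _ in range(3))
out = compressed_attention(q, k, v)        # memory scales with n_kept, not 512
print(out.shape)
```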
arXiv Detail & Related papers (2022-03-02T15:25:27Z)
- PDE-Based Optimal Strategy for Unconstrained Online Learning [40.61498562988079]
We present a framework that generates time-varying potential functions by solving a Partial Differential Equation (PDE).
Our framework recovers some classical potentials, and more importantly provides a systematic approach to design new ones.
This is the first parameter-free algorithm with optimal leading constant.
arXiv Detail & Related papers (2022-01-19T22:21:21Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
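A generic way to picture randomization inside a dynamic program is to estimate each marginalization sum from a sampled subset of states, as in the forward-algorithm sketch below; the uniform subsampling and reweighting are an illustrative assumption, not the specific RDP estimators proposed in the paper:

```python
# Forward-algorithm-style DP where each sum over previous latent states is
# estimated from a random subset of states.  Illustrative assumption only.
import torch

n_states, n_steps, n_sampled = 2_000, 20, 128
trans = torch.softmax(torch.randn(n_states, n_states) * 0.01, dim=-1)  # p(z_t | z_{t-1})
emit = torch.rand(n_steps, n_states)                                   # p(x_t | z_t), made up

alpha = emit[0] / n_states                       # uniform prior over states
for t in range(1, n_steps):
    idx = torch.randint(0, n_states, (n_sampled,))        # sample previous states
    # Unbiased estimate of sum_j alpha[j] * trans[j, :] from the sample.
    alpha = (n_states / n_sampled) * alpha[idx] @ trans[idx, :]
    alpha = alpha * emit[t]

likelihood_estimate = alpha.sum()
print(likelihood_estimate)
```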
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Stabilizing Equilibrium Models by Jacobian Regularization [151.78151873928027]
Deep equilibrium networks (DEQs) are a new class of models that eschews traditional depth in favor of finding the fixed point of a single nonlinear layer.
We propose a regularization scheme for DEQ models that explicitly regularizes the Jacobian of the fixed-point update equations to stabilize the learning of equilibrium models.
We show that this regularization adds only minimal computational cost, significantly stabilizes the fixed-point convergence in both forward and backward passes, and scales well to high-dimensional, realistic domains.
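The regularizer can be pictured with a standard Hutchinson-style stochastic estimate of the squared Frobenius norm of the Jacobian of the update, added to the training loss; the single probe vector, the stand-in fixed point, and the penalty weight below are illustrative choices:

```python
# Hutchinson-style estimate of ||d f(z) / d z||_F^2 used as a training penalty.
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 64))

def jacobian_penalty(z):
    """E_eps ||J^T eps||^2 with eps ~ N(0, I) equals ||J||_F^2 in expectation."""
    z = z.requires_grad_(True)
    out = f(z)
    eps = torch.randn_like(out)
    (jtv,) = torch.autograd.grad(out, z, grad_outputs=eps, create_graph=True)
    return (jtv ** 2).sum(dim=-1).mean()

x = torch.randn(16, 64)
z_star = x.clone()                     # stand-in for a solved fixed point
task_loss = f(z_star).pow(2).mean()    # stand-in for the actual training loss
loss = task_loss + 1.0 * jacobian_penalty(z_star)
loss.backward()
```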
arXiv Detail & Related papers (2021-06-28T00:14:11Z)
- Optimal Stopping via Randomized Neural Networks [6.677219861416146]
This paper presents the benefits of using randomized neural networks instead of standard basis functions or deep neural networks.
Our approaches are applicable to high dimensional problems where the existing approaches become increasingly impractical.
In all cases, our algorithms outperform the state-of-the-art and other relevant machine learning approaches in terms of time.
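The core idea of a randomized neural network, a random untrained hidden layer with only the linear readout fitted by least squares, can be sketched in a Longstaff-Schwartz-style backward induction for a Bermudan put; the GBM dynamics, payoff, and feature map below are illustrative assumptions, not the paper's exact algorithms:

```python
# Longstaff-Schwartz-style Bermudan put pricing with random hidden features:
# only the linear readout is fitted (by least squares) at each exercise date.
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps, s0, K, r, sigma, T = 20_000, 50, 100.0, 100.0, 0.05, 0.2, 1.0
dt = T / n_steps

# Simulate geometric Brownian motion paths.
z = rng.standard_normal((n_paths, n_steps))
S = s0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z, axis=1))
S = np.concatenate([np.full((n_paths, 1), s0), S], axis=1)

payoff = lambda s: np.maximum(K - s, 0.0)

# Random hidden layer: weights drawn once and never trained.
W = rng.standard_normal((1, 64))
b = rng.standard_normal(64)
features = lambda s: np.tanh(s[:, None] / s0 @ W + b)      # (paths, 64)

cashflow = payoff(S[:, -1])
for t in range(n_steps - 1, 0, -1):
    cashflow *= np.exp(-r * dt)                             # discount one step
    itm = payoff(S[:, t]) > 0                               # regress on in-the-money paths
    X = features(S[itm, t])
    beta, *_ = np.linalg.lstsq(X, cashflow[itm], rcond=None)
    continuation = X @ beta
    exercise = payoff(S[itm, t]) > continuation
    cashflow[np.where(itm)[0][exercise]] = payoff(S[itm, t])[exercise]

price = np.exp(-r * dt) * cashflow.mean()
print(f"Bermudan put price estimate: {price:.3f}")
```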
arXiv Detail & Related papers (2021-04-28T09:47:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.