Related papers: Theoretical Convergence Guarantees for Variational Autoencoders

Related papers

Understanding Model Merging: A Unified Generalization Framework for Heterogeneous Experts [36.26786113564521]
Model merging efficiently aggregates capabilities from multiple fine-tuned models into a single one.<n>Despite empirical successes, a unified theory for its effectiveness under heterogeneous finetuning hyper parameters remains missing.<n>We use $L$-Stability theory to analyze the generalization of the merged model $boldsymbolx_avg$.
arXiv Detail & Related papers (2026-01-29T13:22:06Z)
Reinforcement Learning Using known Invariances [54.91261509214309]
This paper develops a theoretical framework for incorporating known group symmetries into kernel-based reinforcement learning.<n>We show that symmetry-aware RL achieves significantly better performance than their standard kernel counterparts.
arXiv Detail & Related papers (2025-11-05T13:56:14Z)
Structured Kernel Regression VAE: A Computationally Efficient Surrogate for GP-VAEs in ICA [0.0]
As a generative model, Variational Autoencoders (VAEs) combine with variational Bayesian inference algorithms.<n>This research demonstrates that, while maintaining ICA performance, SKR-VAE achieves greater computational efficiency and significantly reduced computational burden compared to GP-VAE.
arXiv Detail & Related papers (2025-08-13T11:24:24Z)
Distributional encoding for Gaussian process regression with qualitative inputs [0.7342677574855652]
We show that a generalization based on distributional encoding (DE) makes use of all samples of the target variable for a category.<n>Our approach is validated empirically, and we demonstrate state-of-the-art predictive performance on a variety of synthetic and real-world datasets.
arXiv Detail & Related papers (2025-06-05T09:35:02Z)
Neighbour-Driven Gaussian Process Variational Autoencoders for Scalable Structured Latent Modelling [14.358070928996069]
Gaussian Process (GP) Variational Autoencoders (VAEs) extend standard VAEs by replacing the fully factorised Gaussian prior with a GP prior.<n> performing exact GP inference in large-scale GPVAEs is computationally prohibitive, often forcing existing approaches to rely on restrictive kernel assumptions.<n>We propose a neighbour-driven approximation strategy that exploits local adjacencies in the latent space to achieve scalable GPVAE inference.
arXiv Detail & Related papers (2025-05-22T10:07:33Z)
Layer-wise Quantization for Quantized Optimistic Dual Averaging [75.4148236967503]
We develop a general layer-wise quantization framework with tight variance and code-length bounds, adapting to the heterogeneities over the course of training.<n>We propose a novel Quantized Optimistic Dual Averaging (QODA) algorithm with adaptive learning rates, which achieves competitive convergence rates for monotone VIs.
arXiv Detail & Related papers (2025-05-20T13:53:58Z)
Adaptive sparse variational approximations for Gaussian process regression [6.169364905804677]
We construct a variational approximation to a hierarchical Bayes procedure, and derive upper bounds for the contraction rate of the variational posterior. Our theoretical results are accompanied by numerical analysis both on synthetic and real world data sets.
arXiv Detail & Related papers (2025-04-04T09:57:00Z)
Preconditioned Inexact Stochastic ADMM for Deep Model [35.37705488695026]
This paper develops an algorithm, PISA, which enables scalable parallel computing and supports various second-moment schemes. Grounded in rigorous theoretical guarantees, the algorithm converges under the sole assumption of Lipschitz of the gradient. Comprehensive experimental evaluations for or fine-tuning diverse FMs, including vision models, large language models, reinforcement learning models, generative adversarial networks, and recurrent neural networks, demonstrate its superior numerical performance compared to various state-of-the-art Directions.
arXiv Detail & Related papers (2025-02-15T12:28:51Z)
On the Convergence Analysis of Over-Parameterized Variational Autoencoders: A Neural Tangent Kernel Perspective [7.580900499231056]
Variational Auto-Encoders (VAEs) have emerged as powerful probabilistic models for generative tasks. This paper provides a mathematical proof of VAE under mild assumptions. We also establish a novel connection between the optimization problem faced by over-Eized SNNs and the Kernel Ridge (KRR) problem.
arXiv Detail & Related papers (2024-09-09T06:10:31Z)
Variance-Reducing Couplings for Random Features [57.73648780299374]
Random features (RFs) are a popular technique to scale up kernel methods in machine learning. We find couplings to improve RFs defined on both Euclidean and discrete input spaces. We reach surprising conclusions about the benefits and limitations of variance reduction as a paradigm.
arXiv Detail & Related papers (2024-05-26T12:25:09Z)
Matching aggregate posteriors in the variational autoencoder [0.5759862457142761]
The variational autoencoder (VAE) is a well-studied, deep, latent-variable model (DLVM) This paper addresses shortcomings in VAEs by reformulating the objective function associated with VAEs in order to match the aggregate/marginal posterior distribution to the prior. The proposed method is named the emphaggregate variational autoencoder (AVAE) and is built on the theoretical framework of the VAE.
arXiv Detail & Related papers (2023-11-13T19:22:37Z)
A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms. We develop a general framework from the perspective of Bregman minimization divergence. We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z)
Permutation Compressors for Provably Faster Distributed Nonconvex Optimization [68.8204255655161]
We show that the MARINA method of Gorbunov et al (2021) can be considered as a state-of-the-art method in terms of theoretical communication complexity. Theory of MARINA to support the theory of potentially em correlated compressors, extends to the method beyond the classical independent compressors setting.
arXiv Detail & Related papers (2021-10-07T09:38:15Z)
Forecasting High-Dimensional Covariance Matrices of Asset Returns with Hybrid GARCH-LSTMs [0.0]
This paper investigates the ability of hybrid models, mixing GARCH processes and neural networks, to forecast covariance matrices of asset returns. The new model proposed is very promising as it not only outperforms the equally weighted portfolio, but also by a significant margin its econometric counterpart.
arXiv Detail & Related papers (2021-08-25T23:41:43Z)
Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAE) are a powerful and widely-used class of generative models. We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs. Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z)
Beyond the Mean-Field: Structured Deep Gaussian Processes Improve the Predictive Uncertainties [12.068153197381575]
We propose a novel variational family that allows for retaining covariances between latent processes while achieving fast convergence. We provide an efficient implementation of our new approach and apply it to several benchmark datasets. It yields excellent results and strikes a better balance between accuracy and calibrated uncertainty estimates than its state-of-the-art alternatives.
arXiv Detail & Related papers (2020-05-22T11:10:59Z)
Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and Self-Control Gradient Estimator [62.26981903551382]
Variational auto-encoders (VAEs) with binary latent variables provide state-of-the-art performance in terms of precision for document retrieval. We propose a pairwise loss function with discrete latent VAE to reward within-class similarity and between-class dissimilarity for supervised hashing. This new semantic hashing framework achieves superior performance compared to the state-of-the-arts.
arXiv Detail & Related papers (2020-05-21T06:11:33Z)
A Unified Theory of Decentralized SGD with Changing Topology and Local Updates [70.9701218475002]
We introduce a unified convergence analysis of decentralized communication methods. We derive universal convergence rates for several applications. Our proofs rely on weak assumptions.
arXiv Detail & Related papers (2020-03-23T17:49:15Z)
A Precise High-Dimensional Asymptotic Theory for Boosting and Minimum-$\ell_1$-Norm Interpolated Classifiers [3.167685495996986]
This paper establishes a precise high-dimensional theory for boosting on separable data. Under a class of statistical models, we provide an exact analysis of the universality error of boosting. We also explicitly pin down the relation between the boosting test error and the optimal Bayes error.
arXiv Detail & Related papers (2020-02-05T00:24:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.