Pythae: Unifying Generative Autoencoders in Python -- A Benchmarking Use
Case
- URL: http://arxiv.org/abs/2206.08309v2
- Date: Thu, 20 Jul 2023 05:32:00 GMT
- Authors: Clément Chadebec, Louis J. Vincent, and Stéphanie Allassonnière
- Abstract summary: We present Pythae, a versatile open-source Python library providing straightforward, reproducible and reliable use of generative autoencoder models.
We present and compare 19 generative autoencoder models representative of some of the main improvements on downstream tasks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, deep generative models have attracted increasing interest
due to their capacity to model complex distributions. Among those models,
variational autoencoders have gained popularity as they have proven both to be
computationally efficient and to yield impressive results in multiple fields.
Following this breakthrough, extensive research has sought to improve on the
original formulation, resulting in a variety of VAE models tailored to
different tasks. In this paper we present Pythae, a
versatile open-source Python library providing both a unified implementation
and a dedicated framework allowing straightforward, reproducible and reliable
use of generative autoencoder models. We then propose to use this library to
perform a case study benchmark where we present and compare 19 generative
autoencoder models representative of some of the main improvements on
downstream tasks such as image reconstruction, generation, classification,
clustering and interpolation. The open-source library can be found at
https://github.com/clementchadebec/benchmark_VAE.
Related papers
- Transformer Architecture for NetsDB [0.0]
We create an end-to-end implementation of a transformer for deep learning model serving in NetsDB.
We load the weights from our model for distributed processing, deployment, and efficient inference.
arXiv Detail & Related papers (2024-05-08T04:38:36Z)
- CyNetDiff -- A Python Library for Accelerated Implementation of Network Diffusion Models [0.9831489366502302]
CyNetDiff is a Python library with components written in Cython to provide improved performance for these computationally intensive diffusion tasks.
In many research tasks, these simulations are the most computationally intensive task, so it would be desirable to have a library for these with an interface to a high-level language.
arXiv Detail & Related papers (2024-04-25T21:59:55Z)
- Mixture-Models: a one-stop Python Library for Model-based Clustering using various Mixture Models [4.60168321737677]
Mixture-Models is an open-source Python library for fitting Gaussian Mixture Models (GMM) and their variants.
It streamlines the implementation and analysis of these models using various first/second order optimization routines.
The library provides user-friendly model evaluation tools, such as BIC, AIC, and log-likelihood estimation.
arXiv Detail & Related papers (2024-02-08T19:34:24Z)
- eipy: An Open-Source Python Package for Multi-modal Data Integration using Heterogeneous Ensembles [3.465746303617158]
eipy is an open-source Python package for developing effective, multi-modal heterogeneous ensembles for classification.
eipy provides a rigorous and user-friendly framework for comparing and selecting the best-performing data integration and predictive modeling methods.
arXiv Detail & Related papers (2024-01-17T20:07:47Z)
- Multi-Candidate Speculative Decoding [82.05519287513444]
Large language models have shown impressive capabilities across a variety of NLP tasks, yet generating text autoregressively is time-consuming.
One way to speed them up is speculative decoding, which generates candidate segments from a fast draft model that are then verified in parallel by the target model.
This paper proposes sampling multiple candidates from a draft model and then organising them in batches for verification.
We design algorithms for efficient multi-candidate verification while maintaining the distribution of the target model.
arXiv Detail & Related papers (2024-01-12T17:15:23Z)
- SPRINT: A Unified Toolkit for Evaluating and Demystifying Zero-shot Neural Sparse Retrieval [92.27387459751309]
We provide SPRINT, a unified Python toolkit for evaluating neural sparse retrieval.
We establish strong and reproducible zero-shot sparse retrieval baselines across the well-acknowledged benchmark, BEIR.
We show that SPLADEv2 produces sparse representations with a majority of tokens outside of the original query and document.
arXiv Detail & Related papers (2023-07-19T22:48:02Z)
- Fully Bayesian Autoencoders with Latent Sparse Gaussian Processes [23.682509357305406]
Autoencoders and their variants are among the most widely used models in representation learning and generative modeling.
We propose a novel Sparse Gaussian Process Bayesian Autoencoder model in which we impose fully sparse Gaussian Process priors on the latent space of a Bayesian Autoencoder.
arXiv Detail & Related papers (2023-02-09T09:57:51Z)
- DORE: Document Ordered Relation Extraction based on Generative Framework [56.537386636819626]
This paper investigates the root cause of the underwhelming performance of the existing generative DocRE models.
We propose to generate a symbolic and ordered sequence from the relation matrix, which is deterministic and easier for the model to learn.
Experimental results on four datasets show that our proposed method can improve the performance of the generative DocRE models.
arXiv Detail & Related papers (2022-10-28T11:18:10Z)
- Twist Decoding: Diverse Generators Guide Each Other [116.20780037268801]
We introduce Twist decoding, a simple and general inference algorithm that generates text while benefiting from diverse models.
Our method does not assume the vocabulary, tokenization or even generation order is shared.
arXiv Detail & Related papers (2022-05-19T01:27:53Z)
- Merlion: A Machine Learning Library for Time Series [73.46386700728577]
Merlion is an open-source machine learning library for time series.
It features a unified interface for models and datasets for anomaly detection and forecasting.
Merlion also provides a unique evaluation framework that simulates the live deployment and re-training of a model in production.
arXiv Detail & Related papers (2021-09-20T02:03:43Z)
- Multi-layer Optimizations for End-to-End Data Analytics [71.05611866288196]
We introduce Iterative Functional Aggregate Queries (IFAQ), a framework that realizes an alternative approach.
IFAQ treats the feature extraction query and the learning task as one program given in the IFAQ's domain-specific language.
We show that a Scala implementation of IFAQ can outperform mlpack, Scikit, and specialization by several orders of magnitude for linear regression and regression tree models over several relational datasets.
arXiv Detail & Related papers (2020-01-10T16:14:44Z)