Mixture-Models: a one-stop Python Library for Model-based Clustering using various Mixture Models
- URL: http://arxiv.org/abs/2402.10229v1
- Date: Thu, 8 Feb 2024 19:34:24 GMT
- Title: Mixture-Models: a one-stop Python Library for Model-based Clustering using various Mixture Models
- Authors: Siva Rajesh Kasa, Hu Yijie, Santhosh Kumar Kasa, Vaibhav Rajan
- Abstract summary: Mixture-Models is an open-source Python library for fitting Gaussian Mixture Models (GMM) and their variants.
It streamlines the implementation and analysis of these models using various first- and second-order optimization routines.
The library provides user-friendly model evaluation tools, such as BIC, AIC, and log-likelihood estimation.
- Score: 4.60168321737677
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: \texttt{Mixture-Models} is an open-source Python library for fitting Gaussian
Mixture Models (GMM) and their variants, such as Parsimonious GMMs, Mixture of
Factor Analyzers, MClust models, Mixture of Student's t distributions, etc. It
streamlines the implementation and analysis of these models using various
first/second order optimization routines such as Gradient Descent and Newton-CG
through automatic differentiation (AD) tools. This helps in extending these
models to high-dimensional data, a capability that is the first of its kind among Python
libraries. The library provides user-friendly model evaluation tools, such as
BIC, AIC, and log-likelihood estimation. The source-code is licensed under MIT
license and can be accessed at \url{https://github.com/kasakh/Mixture-Models}.
The package is highly extensible, allowing users to incorporate new
distributions and optimization techniques with ease. We conduct a large-scale
simulation to compare the performance of various gradient-based approaches
against Expectation Maximization on a wide range of settings and identify the
best-suited approach for each.
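The fitting recipe the abstract describes (maximize the GMM log-likelihood with a gradient-based optimizer whose gradients come from automatic differentiation, then compare fits via BIC/AIC) can be illustrated compactly. The sketch below is an assumption-laden illustration rather than the library's actual API: the helpers `fit_gmm` and `log_likelihood` and the diagonal-covariance parameterization are hypothetical choices for exposition, and the `autograd` package stands in for the paper's AD tooling.

```python
# A minimal sketch (assumptions, not the Mixture-Models API): fit a
# diagonal-covariance GMM by gradient descent on the log-likelihood,
# with gradients supplied by automatic differentiation (autograd),
# then score the fit with BIC/AIC as in the abstract's workflow.
import autograd.numpy as anp
import numpy as np
from autograd import grad


def logsumexp(a, axis):
    # Numerically stable log-sum-exp, written with autograd-traceable ops.
    m = anp.max(a, axis=axis, keepdims=True)
    return anp.squeeze(m, axis=axis) + anp.log(anp.sum(anp.exp(a - m), axis=axis))


def unpack(theta, K, D):
    # Unconstrained parameterization: mixture logits, means, log std-devs.
    logits = theta[:K]
    means = theta[K:K + K * D].reshape(K, D)
    log_sd = theta[K + K * D:].reshape(K, D)
    return logits, means, log_sd


def log_likelihood(theta, X, K):
    n, D = X.shape
    logits, means, log_sd = unpack(theta, K, D)
    log_w = logits - logsumexp(logits, axis=0)           # log mixture weights
    z = (X[:, None, :] - means[None, :, :]) / anp.exp(log_sd)[None, :, :]
    log_comp = (-0.5 * anp.sum(z ** 2, axis=2)           # log N(x | mu_k, diag)
                - anp.sum(log_sd, axis=1)[None, :]
                - 0.5 * D * anp.log(2 * anp.pi))
    return anp.sum(logsumexp(log_w[None, :] + log_comp, axis=1))


def fit_gmm(X, K, steps=2000, lr=1e-3, seed=0):
    # Hypothetical helper; the library's own interface may differ.
    n, D = X.shape
    theta = 0.1 * np.random.default_rng(seed).standard_normal(K + 2 * K * D)
    nll_grad = grad(lambda t: -log_likelihood(t, X, K))  # AD: no hand-derived gradients
    for _ in range(steps):                               # plain first-order descent
        theta = theta - lr * nll_grad(theta)
    ll = log_likelihood(theta, X, K)
    k = theta.size                                       # free-parameter count
    bic = k * np.log(n) - 2 * ll
    aic = 2 * k - 2 * ll
    return theta, ll, bic, aic
```

Fitting over a grid of component counts K and keeping the lowest-BIC (or lowest-AIC) model mirrors the model-selection workflow the abstract describes; the paper also studies second-order routines such as Newton-CG, which this first-order sketch omits.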
Related papers
- Be More Diverse than the Most Diverse: Online Selection of Diverse Mixtures of Generative Models [33.04472814852163]
In this work, we explore the selection of a mixture of multiple generative models.
We propose an online learning approach called Mixture Upper Confidence Bound (Mixture-UCB).
arXiv Detail & Related papers (2024-12-23T14:48:17Z)
- Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild [84.57103623507082]
This paper introduces Model-GLUE, a holistic scaling guideline for Large Language Models.
We benchmark existing scaling techniques, especially selective merging, and variants of mixture.
We then formulate an optimal strategy for the selection and aggregation of a heterogeneous model zoo.
Our methodology involves the clustering of mergeable models and optimal merging strategy selection, and the integration of clusters.
arXiv Detail & Related papers (2024-10-07T15:55:55Z)
- Fast Semisupervised Unmixing Using Nonconvex Optimization [80.11512905623417]
We introduce a novel model for semi-/library-based unmixing.
We demonstrate the efficacy of alternating optimization methods for sparse unsupervised unmixing.
arXiv Detail & Related papers (2024-01-23T10:07:41Z)
- eipy: An Open-Source Python Package for Multi-modal Data Integration using Heterogeneous Ensembles [2.957103424179249]
eipy is an open-source Python package for developing effective, multi-modal heterogeneous ensembles for classification.
eipy provides a rigorous and user-friendly framework for comparing and selecting the best-performing data integration and predictive modeling methods.
arXiv Detail & Related papers (2024-01-17T20:07:47Z)
- Finite Mixtures of Multivariate Poisson-Log Normal Factor Analyzers for Clustering Count Data [0.8499685241219366]
A class of eight parsimonious mixture models based on the mixture of factor analyzers model is introduced.
The proposed models are explored in the context of clustering discrete data arising from RNA sequencing studies.
arXiv Detail & Related papers (2023-11-13T21:23:15Z)
- Learning with MISELBO: The Mixture Cookbook [62.75516608080322]
We present the first-ever mixture of variational approximations for a normalizing-flow-based hierarchical variational autoencoder (VAE) with VampPrior and a PixelCNN decoder network.
We explain this cooperative behavior by drawing a novel connection between VI and adaptive importance sampling.
We obtain state-of-the-art results among VAE architectures in terms of negative log-likelihood on the MNIST and FashionMNIST datasets.
arXiv Detail & Related papers (2022-09-30T15:01:35Z)
- Pythae: Unifying Generative Autoencoders in Python -- A Benchmarking Use Case [0.0]
We present Pythae, a versatile open-source Python library providing straightforward, reproducible and reliable use of generative autoencoder models.
We present and compare 19 generative autoencoder models representative of some of the main improvements on downstream tasks.
arXiv Detail & Related papers (2022-06-16T17:11:41Z)
- Merlion: A Machine Learning Library for Time Series [73.46386700728577]
Merlion is an open-source machine learning library for time series.
It features a unified interface for models and datasets for anomaly detection and forecasting.
Merlion also provides a unique evaluation framework that simulates the live deployment and re-training of a model in production.
arXiv Detail & Related papers (2021-09-20T02:03:43Z)
- Multi-layer Optimizations for End-to-End Data Analytics [71.05611866288196]
We introduce Iterative Functional Aggregate Queries (IFAQ), a framework that realizes an alternative approach.
IFAQ treats the feature extraction query and the learning task as one program given in IFAQ's domain-specific language.
We show that a Scala implementation of IFAQ can outperform mlpack, Scikit, and specialization by several orders of magnitude for linear regression and regression tree models over several relational datasets.
arXiv Detail & Related papers (2020-01-10T16:14:44Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.