DADApy: Distance-based Analysis of DAta-manifolds in Python
- URL: http://arxiv.org/abs/2205.03373v1
- Date: Wed, 4 May 2022 08:41:59 GMT
- Title: DADApy: Distance-based Analysis of DAta-manifolds in Python
- Authors: Aldo Glielmo, Iuri Macocco, Diego Doimo, Matteo Carli, Claudio Zeni,
Romina Wild, Maria d'Errico, Alex Rodriguez, Alessandro Laio
- Abstract summary: DADApy is a Python software package for analysing and characterising high-dimensional data.
It provides methods for estimating the intrinsic dimension and the probability density, for performing density-based clustering and for comparing different distance metrics.
- Score: 51.37841707191944
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: DADApy is a Python software package for analysing and characterising
high-dimensional data manifolds. It provides methods for estimating the
intrinsic dimension and the probability density, for performing density-based
clustering and for comparing different distance metrics. We review the main
functionalities of the package and exemplify its usage in toy cases and in a
real-world application. The package is freely available under the open-source
Apache 2.0 license and can be downloaded from the GitHub page
https://github.com/sissa-data-science/DADApy.
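As an illustration of what distance-based intrinsic dimension estimation can look like, the following is a minimal NumPy sketch of the two-nearest-neighbour (TwoNN) ratio estimator. The abstract does not say which estimators the package provides, so the choice of TwoNN is an assumption, and the function name `twonn_id` is hypothetical; this is not DADApy's API.

```python
import numpy as np


def twonn_id(X):
    """Estimate the intrinsic dimension of the rows of X (hypothetical helper).

    Assumes a locally uniform density, under which the ratio mu = r2 / r1 of
    the distances to the second and first nearest neighbours follows a Pareto
    law with exponent d. The maximum-likelihood estimate is N / sum(log(mu)).
    """
    X = np.asarray(X, dtype=float)
    # Full pairwise Euclidean distance matrix (fine for a few thousand points).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Exclude each point's zero distance to itself.
    np.fill_diagonal(dist, np.inf)
    # After partitioning at index 1, columns 0 and 1 hold r1 and r2.
    two_nearest = np.partition(dist, 1, axis=1)[:, :2]
    r1, r2 = two_nearest[:, 0], two_nearest[:, 1]
    # Drop exact duplicates, for which the ratio is undefined.
    keep = r1 > 0
    mu = r2[keep] / r1[keep]
    return keep.sum() / np.log(mu).sum()


# Toy data: a flat two-dimensional sheet embedded in five dimensions.
rng = np.random.default_rng(0)
sheet = rng.uniform(size=(500, 2))
X = np.hstack([sheet, np.zeros((500, 3))])
estimate = twonn_id(X)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

On this toy sheet the estimate should land near 2 rather than near the embedding dimension of 5, because only distances between neighbouring points enter the calculation.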
Related papers
- pyvene: A Library for Understanding and Improving PyTorch Models via Interventions [79.72930339711478]
pyvene is an open-source library that supports customizable interventions on a range of different PyTorch modules.
We show how pyvene provides a unified framework for performing interventions on neural models and sharing the intervened-upon models with others.
(2024-03-12T16:46:54Z)
- eipy: An Open-Source Python Package for Multi-modal Data Integration using Heterogeneous Ensembles [3.465746303617158]
eipy is an open-source Python package for developing effective, multi-modal heterogeneous ensembles for classification.
eipy provides a rigorous and user-friendly framework for comparing and selecting the best-performing data integration and predictive modeling methods.
(2024-01-17T20:07:47Z)
- PyPOTS: A Python Toolbox for Data Mining on Partially-Observed Time Series [0.0]
PyPOTS is an open-source Python library dedicated to data mining and analysis on partially-observed time series.
It provides easy access to diverse algorithms categorized into four tasks: imputation, classification, clustering, and forecasting.
(2023-05-30T07:57:05Z)
- PyGOD: A Python Library for Graph Outlier Detection [56.33769221859135]
PyGOD is an open-source library for detecting outliers in graph data.
It supports a wide array of leading graph-based methods for outlier detection.
PyGOD is released under a BSD 2-Clause license at https://pygod.org and at the Python Package Index (PyPI).
(2022-04-26T06:15:21Z)
- PyHHMM: A Python Library for Heterogeneous Hidden Markov Models [63.01207205641885]
PyHHMM is an object-oriented Python implementation of Heterogeneous-Hidden Markov Models (HHMMs).
PyHHMM emphasizes features not supported in similar available frameworks: a heterogeneous observation model, missing data inference, different model order selection criteria, and semi-supervised training.
PyHHMM relies on the numpy, scipy, scikit-learn, and seaborn Python packages, and is distributed under the Apache-2.0 License.
(2022-01-12T07:32:36Z)
- Scikit-dimension: a Python package for intrinsic dimension estimation [58.8599521537]
This technical note introduces scikit-dimension, an open-source Python package for intrinsic dimension estimation.
The scikit-dimension package provides a uniform implementation of most of the known ID estimators based on the scikit-learn application programming interface.
We briefly describe the package and demonstrate its use in a large-scale (more than 500 datasets) benchmarking of methods for ID estimation in real-life and synthetic data.
(2021-09-06T16:46:38Z)
- FDApy: a Python package for functional data [0.0]
FDApy is an open-source Python package for the analysis of functional data.
FDApy provides tools for the representation of functional data defined on different dimensional domains and for functional data that is irregularly sampled.
The documentation includes installation and usage instructions, examples on simulated and real datasets and a complete description of the API.
(2021-01-26T10:07:33Z)
- MOGPTK: The Multi-Output Gaussian Process Toolkit [71.08576457371433]
We present MOGPTK, a Python package for multi-channel data modelling using Gaussian processes (GP).
The aim of this toolkit is to make multi-output GP (MOGP) models accessible to researchers, data scientists, and practitioners alike.
(2020-02-09T23:34:49Z)