QuaPy: A Python-Based Framework for Quantification
- URL: http://arxiv.org/abs/2106.11057v1
- Date: Fri, 18 Jun 2021 13:57:11 GMT
- Title: QuaPy: A Python-Based Framework for Quantification
- Authors: Alejandro Moreo, Andrea Esuli, Fabrizio Sebastiani
- Abstract summary: QuaPy is an open-source framework for performing quantification (a.k.a. supervised prevalence estimation).
It is written in Python and can be installed via pip.
- Score: 76.22817970624875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: QuaPy is an open-source framework for performing quantification (a.k.a.
supervised prevalence estimation), written in Python. Quantification is the
task of training quantifiers via supervised learning, where a quantifier is a
predictor that estimates the relative frequencies (a.k.a. prevalence values) of
the classes of interest in a sample of unlabelled data. While quantification
can be trivially performed by applying a standard classifier to each unlabelled
data item and counting how many data items have been assigned to each class, it
has been shown that this "classify and count" method is outperformed by methods
specifically designed for quantification. QuaPy provides implementations of a
number of baseline methods and advanced quantification methods, of routines for
quantification-oriented model selection, of several broadly accepted evaluation
measures, and of robust evaluation protocols routinely used in the field. QuaPy
also makes available datasets commonly used for testing quantifiers, and offers
visualization tools for facilitating the analysis and interpretation of the
results. The software is open-source and publicly available under a BSD-3
licence via https://github.com/HLT-ISTI/QuaPy, and can be installed via pip
(https://pypi.org/project/QuaPy/).
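The "classify and count" baseline described above, and the adjusted variant that quantification methods improve upon it with, can be sketched in a few lines. This is a minimal illustration of the underlying idea for the binary case, not QuaPy's actual API; the function names and the way the true/false positive rates are obtained are assumptions for the example.

```python
def classify_and_count(predictions):
    """CC: estimated prevalence = fraction of items the classifier labels positive."""
    return sum(predictions) / len(predictions)

def adjusted_classify_and_count(predictions, tpr, fpr):
    """ACC: correct the CC estimate using the classifier's true positive rate
    and false positive rate (estimated, e.g., via cross-validation on the
    training data):

        p_ACC = (p_CC - fpr) / (tpr - fpr), clipped to [0, 1]
    """
    p_cc = classify_and_count(predictions)
    p_acc = (p_cc - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, p_acc))

# Example: a classifier with tpr=0.8 and fpr=0.2 labels 44% of an
# unlabelled sample as positive.
preds = [1] * 44 + [0] * 56
p_cc = classify_and_count(preds)                               # 0.44
p_acc = adjusted_classify_and_count(preds, tpr=0.8, fpr=0.2)   # ~0.40
```

The correction matters because a classifier's errors bias the raw count: here CC overestimates the true prevalence, and ACC compensates using the classifier's known error rates, which is the kind of quantification-specific reasoning the abstract refers to.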
Related papers
- Query Performance Prediction using Relevance Judgments Generated by Large Language Models [53.97064615557883]
We propose a QPP framework using automatically generated relevance judgments (QPP-GenRE)
QPP-GenRE decomposes QPP into independent subtasks of predicting relevance of each item in a ranked list to a given query.
This allows us to predict any IR evaluation measure using the generated relevance judgments as pseudo-labels.
arXiv Detail & Related papers (2024-04-01T09:33:05Z)
- PyPOTS: A Python Toolbox for Data Mining on Partially-Observed Time Series [0.0]
PyPOTS is an open-source Python library dedicated to data mining and analysis on partially-observed time series.
It provides easy access to diverse algorithms categorized into four tasks: imputation, classification, clustering, and forecasting.
arXiv Detail & Related papers (2023-05-30T07:57:05Z)
- Revisiting Long-tailed Image Classification: Survey and Benchmarks with New Evaluation Metrics [88.39382177059747]
A corpus of metrics is designed for measuring the accuracy, robustness, and bounds of algorithms for learning with long-tailed distribution.
Based on our benchmarks, we re-evaluate the performance of existing methods on CIFAR10 and CIFAR100 datasets.
arXiv Detail & Related papers (2023-02-03T02:40:54Z)
- DeeProb-kit: a Python Library for Deep Probabilistic Modelling [0.0]
DeeProb-kit is a unified library written in Python consisting of a collection of deep probabilistic models (DPMs)
It includes efficiently implemented learning techniques, inference routines, statistical algorithms, and provides high-quality fully-documented APIs.
arXiv Detail & Related papers (2022-12-08T17:02:16Z)
- Latte: Cross-framework Python Package for Evaluation of Latent-Based Generative Models [65.51757376525798]
Latte is a Python library for evaluation of latent-based generative models.
Latte is compatible with both PyTorch and Keras, and provides both functional and modular APIs.
arXiv Detail & Related papers (2021-12-20T16:00:28Z)
- Scikit-dimension: a Python package for intrinsic dimension estimation [58.8599521537]
This technical note introduces scikit-dimension, an open-source Python package for intrinsic dimension estimation.
The scikit-dimension package provides a uniform implementation of most known intrinsic dimension (ID) estimators, based on the scikit-learn application programming interface.
We briefly describe the package and demonstrate its use in a large-scale (more than 500 datasets) benchmarking of methods for ID estimation in real-life and synthetic data.
arXiv Detail & Related papers (2021-09-06T16:46:38Z)
- The Word is Mightier than the Label: Learning without Pointillistic Labels using Data Programming [11.536162323162099]
Most advanced supervised Machine Learning (ML) models rely on vast amounts of point-by-point labelled training examples.
Hand-labelling vast amounts of data may be tedious, expensive, and error-prone.
arXiv Detail & Related papers (2021-08-24T19:11:28Z)
- An Empirical Comparison of Instance Attribution Methods for NLP [62.63504976810927]
We evaluate the degree to which different instance attribution methods agree on the importance of training samples.
We find that simple retrieval methods yield training instances that differ from those identified via gradient-based methods.
arXiv Detail & Related papers (2021-04-09T01:03:17Z)
- Quantifying With Only Positive Training Data [0.5735035463793008]
Quantification is the research field that studies methods for counting the number of data points that belong to each class in an unlabeled sample.
This article closes the gap between Positive and Unlabeled Learning (PUL) and One-class Quantification (OCQ).
We compare our method, Passive Aggressive Threshold (PAT), against PUL methods and show that PAT generally is the fastest and most accurate algorithm.
arXiv Detail & Related papers (2020-04-22T01:18:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.