Re-Assessing the "Classify and Count" Quantification Method
- URL: http://arxiv.org/abs/2011.02552v2
- Date: Fri, 22 Jan 2021 17:32:23 GMT
- Title: Re-Assessing the "Classify and Count" Quantification Method
- Authors: Alejandro Moreo and Fabrizio Sebastiani
- Abstract summary: "Classify and Count" (CC) is often a biased estimator.
Previous works have failed to use properly optimised versions of CC.
We argue that, while still inferior to some cutting-edge methods, properly optimised versions of CC deliver near-state-of-the-art accuracy.
- Score: 88.60021378715636
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Learning to quantify (a.k.a. quantification) is a task concerned with
training unbiased estimators of class prevalence via supervised learning. This
task originated with the observation that "Classify and Count" (CC), the
trivial method of obtaining class prevalence estimates, is often a biased
estimator, and thus delivers suboptimal quantification accuracy; following this
observation, several methods for learning to quantify have been proposed that
have been shown to outperform CC. In this work we contend that previous works
have failed to use properly optimised versions of CC. We thus reassess the real
merits of CC (and its variants), and argue that, while still inferior to some
cutting-edge methods, they deliver near-state-of-the-art accuracy once (a)
hyperparameter optimisation is performed, and (b) this optimisation is
performed by using a true quantification loss instead of a standard
classification-based loss. Experiments on three publicly available binary
sentiment classification datasets support these conclusions.
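The CC method at the heart of the paper is simple enough to state in a few lines. The sketch below is a minimal illustration; function and variable names are ours, not from the authors' codebase.

```python
import numpy as np

# "Classify and Count" (CC): estimate class prevalence by classifying
# every unlabelled item with a trained classifier and counting the
# per-class predictions.
def classify_and_count(predicted_labels, n_classes):
    counts = np.bincount(predicted_labels, minlength=n_classes)
    return counts / len(predicted_labels)  # fraction of items per class

# Binary sentiment example: 3 of 5 items predicted positive (class 1)
prevalence = classify_and_count(np.array([1, 0, 1, 1, 0]), n_classes=2)
# prevalence is [0.4, 0.6]
```

The paper's point is that such counts are biased whenever the classifier's error rates differ across classes, which is why it advocates tuning the classifier's hyperparameters with a true quantification loss rather than a classification loss.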
Related papers
- Prediction Error-based Classification for Class-Incremental Learning [39.91805363069707]
We introduce Prediction Error-based Classification (PEC).
PEC computes a class score by measuring the prediction error of a model trained to replicate the outputs of a frozen random neural network on data from that class.
PEC offers several practical advantages, including sample efficiency, ease of tuning, and effectiveness even when data are presented one class at a time.
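The scoring rule described above can be sketched as follows. The tiny networks and the pre-"trained" replica are placeholders for illustration, not the paper's architecture.

```python
import numpy as np

# Hedged sketch of the PEC idea: a frozen random network acts as a
# teacher; one small student per class is trained to replicate it on
# that class's data. At test time, the class whose student replicates
# the teacher best (lowest prediction error) gets the highest score.
rng = np.random.default_rng(0)
W_teacher = rng.normal(size=(4, 2))  # frozen random teacher network

def teacher(x):
    return np.tanh(x @ W_teacher)

# Placeholder "students": class 0's student perfectly replicates the
# teacher (as if trained to convergence on class-0 data); class 1's
# is untrained, so its replication error stays large.
students = {0: W_teacher, 1: rng.normal(size=(4, 2))}

def pec_score(x, c):
    # Lower replication error => higher class score
    return -np.linalg.norm(np.tanh(x @ students[c]) - teacher(x))

x = rng.normal(size=4)
pred = max(students, key=lambda c: pec_score(x, c))  # class 0 wins here
```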
arXiv Detail & Related papers (2023-05-30T07:43:35Z) - Decoupled Training for Long-Tailed Classification With Stochastic Representations [15.990318581975435]
Decoupling representation learning from classifier learning has been shown to be effective for classification with long-tailed data.
We first apply Stochastic Weight Averaging (SWA), an optimization technique for improving the generalization of deep neural networks, to obtain better-generalizing feature extractors for long-tailed classification.
We then propose a novel classifier re-training algorithm based on perturbed representations obtained from SWA-Gaussian, a Gaussian variant of SWA, and a self-distillation strategy.
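At its core, the SWA component mentioned above is a running average of weight checkpoints collected along the training trajectory. A minimal sketch (checkpoint values are made up):

```python
import numpy as np

# Stochastic Weight Averaging in miniature: average the weights visited
# by SGD late in training instead of keeping only the final iterate.
def swa_average(checkpoints):
    avg = None
    for n, w in enumerate(checkpoints):
        avg = w.copy() if avg is None else (avg * n + w) / (n + 1)
    return avg

checkpoints = [np.array([1.0, 3.0]),
               np.array([3.0, 5.0]),
               np.array([2.0, 4.0])]
swa_weights = swa_average(checkpoints)  # element-wise mean: [2.0, 4.0]
```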
arXiv Detail & Related papers (2023-04-19T05:35:09Z) - Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
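Entropy regularisation of the kind mentioned above is commonly a term that rewards high entropy in the batch-mean prediction, discouraging the model from collapsing all samples onto a few categories. A hedged sketch of that common form (not necessarily the paper's exact loss term):

```python
import numpy as np

# Entropy of the mean prediction over a batch: maximizing this term
# (e.g., subtracting it from the loss) encourages balanced class usage.
def mean_prediction_entropy(proba):
    p = np.clip(proba.mean(axis=0), 1e-12, 1.0)
    return -(p * np.log(p)).sum()

balanced = np.array([[0.9, 0.1], [0.1, 0.9]])   # mean = [0.5, 0.5]
collapsed = np.array([[0.9, 0.1], [0.8, 0.2]])  # mean skewed to class 0
# the balanced batch has the higher mean-prediction entropy
```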
arXiv Detail & Related papers (2022-11-21T18:47:11Z) - Adaptive Dimension Reduction and Variational Inference for Transductive Few-Shot Classification [2.922007656878633]
We propose a new clustering method based on Variational Bayesian inference, further improved by Adaptive Dimension Reduction.
Our proposed method significantly improves accuracy in the realistic unbalanced transductive setting on various Few-Shot benchmarks.
arXiv Detail & Related papers (2022-09-18T10:29:02Z) - ProBoost: a Boosting Method for Probabilistic Classifiers [55.970609838687864]
ProBoost is a new boosting algorithm for probabilistic classifiers.
It uses the uncertainty of each training sample to determine the most challenging/uncertain ones.
It produces a sequence that progressively focuses on the samples found to have the highest uncertainty.
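The uncertainty-driven reweighting described above can be sketched generically. The entropy-based uncertainty measure and the toy probabilities below are assumptions for illustration, not ProBoost's exact formulation.

```python
import numpy as np

# Reweight training samples by the current ensemble's predictive
# uncertainty (here: entropy), so that the next weak learner focuses
# on the most uncertain examples.
def uncertainty_weights(proba):
    p = np.clip(proba, 1e-12, 1.0)
    entropy = -(p * np.log(p)).sum(axis=1)
    return entropy / entropy.sum()

proba = np.array([[0.50, 0.50],   # maximally uncertain sample
                  [0.95, 0.05]])  # confidently classified sample
weights = uncertainty_weights(proba)
# weights[0] > weights[1]: the uncertain sample dominates the next round
```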
arXiv Detail & Related papers (2022-09-04T12:49:20Z) - Cascaded Classifier for Pareto-Optimal Accuracy-Cost Trade-Off Using Off-the-Shelf ANNs [0.0]
We derive a methodology to maximize accuracy and efficiency of cascaded classifiers.
The multi-stage realization can be employed to optimize any state-of-the-art classifier.
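A cascade of this kind can be sketched as a confidence-gated pipeline; the stage classifiers and the threshold below are hypothetical stand-ins for the off-the-shelf ANNs the paper optimizes.

```python
# Two-stage cascade: a cheap first-stage model answers when confident;
# only low-confidence inputs are forwarded to the expensive stage.
def cascade_predict(x, cheap_stage, expensive_stage, threshold=0.9):
    proba = cheap_stage(x)
    if max(proba) >= threshold:
        return proba.index(max(proba)), "cheap"
    return expensive_stage(x), "expensive"

# Hypothetical stand-in models for illustration
cheap_stage = lambda x: [0.95, 0.05] if x > 0 else [0.55, 0.45]
expensive_stage = lambda x: 0 if x > -1 else 1

label, stage = cascade_predict(1.0, cheap_stage, expensive_stage)
# stage == "cheap": the easy input never touches the expensive model
```

The accuracy-cost trade-off comes from the threshold: raising it sends more inputs to the expensive stage, buying accuracy at higher average cost.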
arXiv Detail & Related papers (2021-10-27T08:16:11Z) - Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to non-linear settings via deep learning with bias constraints.
A second motivation for bias-constrained estimation (BCE) is in applications where multiple estimates of the same unknown are averaged for improved performance.
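For reference, the classical result invoked above (Gauss-Markov, in its weighted/Aitken form) can be stated compactly; this is the textbook statement, not taken from the paper:

```latex
y = X\beta + \varepsilon,\quad
\mathbb{E}[\varepsilon] = 0,\quad
\operatorname{Cov}(\varepsilon) = \Sigma
\;\Longrightarrow\;
\hat{\beta}_{\mathrm{WLS}}
  = \bigl(X^{\top}\Sigma^{-1}X\bigr)^{-1}X^{\top}\Sigma^{-1}y
\text{ is the minimum-variance linear unbiased estimator of } \beta .
```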
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - When in Doubt: Improving Classification Performance with Alternating Normalization [57.39356691967766]
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification.
CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution.
We empirically demonstrate its effectiveness across a diverse set of classification tasks.
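A hedged sketch of what such alternating normalization can look like, in the spirit of CAN but simplified (the actual algorithm adjusts only the uncertain example's row, using confidently classified examples as anchors):

```python
import numpy as np

# Alternately normalize columns (toward assumed class priors) and rows
# (so each row remains a probability distribution), Sinkhorn-style.
def alternating_normalization(P, priors, iters=10):
    P = P.copy()
    for _ in range(iters):
        P = P * (priors / P.sum(axis=0))      # column step: match priors
        P = P / P.sum(axis=1, keepdims=True)  # row step: re-normalize
    return P

P = np.array([[0.90, 0.10],
              [0.60, 0.40],
              [0.50, 0.50]])     # last row: the uncertain example
priors = np.array([0.5, 0.5])    # assumed balanced class priors
adjusted = alternating_normalization(P, priors)
# every row of `adjusted` still sums to 1
```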
arXiv Detail & Related papers (2021-09-28T02:55:42Z) - Selective Classification via One-Sided Prediction [54.05407231648068]
The one-sided prediction (OSP) based relaxation yields a selective classification (SC) scheme that attains near-optimal coverage in the practically relevant high target accuracy regime.
We theoretically derive generalization bounds for SC and OSP, and empirically show that our scheme strongly outperforms state-of-the-art methods in coverage at small error levels.
arXiv Detail & Related papers (2020-10-15T16:14:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.