Regularization-Based Methods for Ordinal Quantification
- URL: http://arxiv.org/abs/2310.09210v1
- Date: Fri, 13 Oct 2023 16:04:06 GMT
- Title: Regularization-Based Methods for Ordinal Quantification
- Authors: Mirko Bunse, Alejandro Moreo, Fabrizio Sebastiani, Martin Senz
- Abstract summary: We study the ordinal case, i.e., the case in which a total order is defined on the set of n>2 classes.
We propose a novel class of regularized OQ algorithms, which outperforms existing algorithms in our experiments.
- Score: 49.606912965922504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quantification, i.e., the task of training predictors of the class prevalence
values in sets of unlabeled data items, has received increased attention in
recent years. However, most quantification research has concentrated on
developing algorithms for binary and multiclass problems in which the classes
are not ordered. Here, we study the ordinal case, i.e., the case in which a
total order is defined on the set of n>2 classes. We give three main
contributions to this field. First, we create and make available two datasets
for ordinal quantification (OQ) research that overcome the inadequacies of the
previously available ones. Second, we experimentally compare the most important
OQ algorithms proposed in the literature so far. To this end, we bring together
algorithms proposed by authors from very different research fields, such as
data mining and astrophysics, who were unaware of each other's developments.
Third, we propose a novel class of regularized OQ algorithms, which outperforms
existing algorithms in our experiments. The key to this gain in performance is
that our regularization prevents ordinally implausible estimates, assuming that
ordinal distributions tend to be smooth in practice. We informally verify this
assumption for several real-world applications.
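The regularization idea can be illustrated with a small sketch. The snippet below is our own illustrative reconstruction, not the authors' implementation: it pulls a raw prevalence estimate toward an ordinally smooth one by penalizing second differences between adjacent classes, which discourages the "implausible dips" the abstract mentions. The function name, the penalty form, and the simplex projection by clipping and renormalizing are all assumptions.

```python
import numpy as np

def smooth_prevalences(p_hat, lam=1.0):
    """Illustrative ordinal regularization (assumed form, not the paper's):
    minimize ||p - p_hat||^2 + lam * ||D2 p||^2, where D2 takes second
    differences over the ordered classes, then project back onto the
    probability simplex (approximated by clipping and renormalizing)."""
    n = len(p_hat)
    # Second-difference operator: each row is [1, -2, 1] over adjacent classes
    D2 = np.zeros((n - 2, n))
    for i in range(n - 2):
        D2[i, i:i + 3] = [1.0, -2.0, 1.0]
    # Closed-form ridge solution of (I + lam * D2^T D2) p = p_hat
    p = np.linalg.solve(np.eye(n) + lam * (D2.T @ D2), p_hat)
    p = np.clip(p, 0.0, None)
    return p / p.sum()

# A raw estimate with an ordinally implausible dip at the second class
raw = np.array([0.30, 0.05, 0.35, 0.20, 0.10])
smoothed = smooth_prevalences(raw, lam=2.0)
```

The smoothed vector still sums to one but has a much smaller second-difference norm than the raw estimate, which is exactly the "smooth ordinal distribution" assumption the paper verifies informally.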
Related papers
- Absolute Ranking: An Essential Normalization for Benchmarking Optimization Algorithms [0.0]
Evaluating performance across optimization algorithms on many problems presents a complex challenge due to the diversity of numerical scales involved.
This paper extensively explores the problem, making a compelling case to underscore the issue and conducting a thorough analysis of its root causes.
Building on this research, this paper introduces a new mathematical model called "absolute ranking" and a sampling-based computational method.
arXiv Detail & Related papers (2024-09-06T00:55:03Z) - A Gold Standard Dataset for the Reviewer Assignment Problem [117.59690218507565]
"Similarity score" is a numerical estimate of the expertise of a reviewer in reviewing a paper.
Our dataset consists of 477 self-reported expertise scores provided by 58 researchers.
For the task of ordering two papers in terms of their relevance for a reviewer, the error rates range from 12%-30% in easy cases to 36%-43% in hard cases.
arXiv Detail & Related papers (2023-03-23T16:15:03Z) - Towards Better Out-of-Distribution Generalization of Neural Algorithmic Reasoning Tasks [51.8723187709964]
We study the OOD generalization of neural algorithmic reasoning tasks.
The goal is to learn an algorithm from input-output pairs using deep neural networks.
arXiv Detail & Related papers (2022-11-01T18:33:20Z) - HARRIS: Hybrid Ranking and Regression Forests for Algorithm Selection [75.84584400866254]
We propose a new algorithm selector leveraging special forests, combining the strengths of both approaches while alleviating their weaknesses.
HARRIS' decisions are based on a forest model whose trees are built by optimizing a hybrid ranking and regression loss function.
arXiv Detail & Related papers (2022-10-31T14:06:11Z) - Stochastic Differentially Private and Fair Learning [7.971065005161566]
We provide the first differentially private algorithm for fair learning that is guaranteed to converge.
Our framework is flexible enough to permit different fairness notions, including demographic parity and equalized odds.
Our algorithm can be applied to non-binary classification tasks with multiple (non-binary) sensitive attributes.
arXiv Detail & Related papers (2022-10-17T06:54:57Z) - Rank-based Non-dominated Sorting [0.0]
We introduce Rank Sort, a non-dominated sorting approach exploiting sorting stability and ordinal information to avoid expensive dominance comparisons.
Two algorithmic variants are proposed: the first one, RankOrdinal (RO), uses ordinal rank comparisons in order to determine dominance and requires O(N) space; the second one, RankIntersect (RS), uses set intersections and bit-level parallelism and requires O(N^2) space.
We demonstrate the efficiency of the proposed methods in comparison with other state-of-the-art algorithms in empirical simulations using the NSGA2 algorithm as well as synthetic benchmarks.
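For context, the expensive operation that rank-based sorting avoids is the naive pairwise Pareto-dominance comparison. The sketch below shows that baseline only; the function names are ours and nothing here reflects the RankOrdinal or RankIntersect internals.

```python
# Illustrative baseline, not the paper's method: the naive O(N^2)
# dominance comparisons that rank-based non-dominated sorting avoids.
# Minimization of all objectives is assumed.

def dominates(a, b):
    """a dominates b if a is no worse in every objective
    and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def naive_front(points):
    """Extract the first non-dominated front by comparing all pairs."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

pts = [(1, 4), (2, 2), (3, 1), (3, 3)]
front = naive_front(pts)  # (3, 3) is dominated by (2, 2)
```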
arXiv Detail & Related papers (2022-03-25T13:59:42Z) - A Comparative Evaluation of Quantification Methods [3.1499058381005227]
Quantification represents the problem of predicting class distributions in a dataset.
A large variety of different algorithms has been proposed in recent years.
We compare 24 different methods on overall more than 40 data sets.
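Two of the simplest baselines typically included in such comparisons are Classify and Count (CC) and its bias-corrected variant, Adjusted Classify and Count (ACC). The sketch below uses simulated classifier outputs; the specific numbers are assumptions for illustration, not results from either paper.

```python
import numpy as np

def classify_and_count(predictions, n_classes):
    """CC: estimate class prevalences as the fraction of items
    the classifier assigns to each class."""
    counts = np.bincount(predictions, minlength=n_classes)
    return counts / counts.sum()

def adjusted_count_binary(p_cc, tpr, fpr):
    """ACC (binary case): correct the CC estimate for classifier bias
    using true/false positive rates measured on held-out data:
        p = (p_cc - fpr) / (tpr - fpr), clipped to [0, 1]."""
    return float(np.clip((p_cc - fpr) / (tpr - fpr), 0.0, 1.0))

preds = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])   # simulated hard labels
p_cc = classify_and_count(preds, 2)[1]             # fraction predicted positive
p_acc = adjusted_count_binary(p_cc, tpr=0.9, fpr=0.2)
```

ACC illustrates why quantification is more than classification: even a biased classifier can yield an unbiased prevalence estimate once its error rates are accounted for.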
arXiv Detail & Related papers (2021-03-04T18:51:06Z) - Run2Survive: A Decision-theoretic Approach to Algorithm Selection based on Survival Analysis [75.64261155172856]
Survival analysis (SA) naturally supports censored data and offers appropriate ways to use such data for learning distributional models of algorithm runtime.
We leverage such models as a basis of a sophisticated decision-theoretic approach to algorithm selection, which we dub Run2Survive.
In an extensive experimental study with the standard benchmark ASlib, our approach is shown to be highly competitive and in many cases even superior to state-of-the-art AS approaches.
arXiv Detail & Related papers (2020-07-06T15:20:17Z) - Learning to Accelerate Heuristic Searching for Large-Scale Maximum Weighted b-Matching Problems in Online Advertising [51.97494906131859]
Bipartite b-matching is fundamental in algorithm design, and has been widely applied into economic markets, labor markets, etc.
Existing exact and approximate algorithms usually fail in such settings, due to either intolerable running time or excessive computational resources.
We propose NeuSearcher, which leverages the knowledge learned from previous instances to solve new problem instances.
arXiv Detail & Related papers (2020-05-09T02:48:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.