Comprehensive Algorithm Portfolio Evaluation using Item Response Theory
- URL: http://arxiv.org/abs/2307.15850v1
- Date: Sat, 29 Jul 2023 00:48:29 GMT
- Title: Comprehensive Algorithm Portfolio Evaluation using Item Response Theory
- Authors: Sevvandi Kandanaarachchi, Kate Smith-Miles
- Abstract summary: IRT has been applied to evaluate machine learning algorithm performance on a single classification dataset.
We present a modified IRT-based framework for evaluating a portfolio of algorithms across a repository of datasets.
- Score: 0.19116784879310023
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Item Response Theory (IRT) has been proposed within the field of Educational
Psychometrics to assess student ability as well as test question difficulty and
discrimination power. More recently, IRT has been applied to evaluate machine
learning algorithm performance on a single classification dataset, where the
student is now an algorithm, and the test question is an observation to be
classified by the algorithm. In this paper we present a modified IRT-based
framework for evaluating a portfolio of algorithms across a repository of
datasets, while simultaneously eliciting a richer suite of characteristics -
such as algorithm consistency and anomalousness - that describe important
aspects of algorithm performance. These characteristics arise from a novel
inversion and reinterpretation of the traditional IRT model without requiring
additional dataset feature computations. We test this framework on algorithm
portfolios for a wide range of applications, demonstrating the broad
applicability of this method as an insightful algorithm evaluation tool.
Furthermore, the explainable nature of IRT parameters yields an increased
understanding of algorithm portfolios.
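As a concrete illustration of the model the abstract builds on, the sketch below fits a standard two-parameter logistic (2PL) IRT model to a toy algorithm-by-dataset performance matrix, with datasets playing the role of respondents and algorithms the role of items, as the inverted reading in the abstract suggests. The function names, synthetic data, learning rate, and plain gradient-ascent fit are illustrative assumptions rather than the authors' implementation, and the reading of the discrimination parameters follows the abstract's description of consistency and anomalousness, not the paper's exact formulation.

```python
# Minimal sketch (assumptions noted above), not the authors' code.
import numpy as np

rng = np.random.default_rng(0)

def irt_2pl(theta, a, b):
    """2PL item response function: P(success) = sigmoid(a * (theta - b))."""
    return 1.0 / (1.0 + np.exp(-a[None, :] * (theta[:, None] - b[None, :])))

# Toy setup in the inverted reading: rows ("respondents") are datasets,
# columns ("items") are algorithms; 1 = the algorithm performed well.
n_datasets, n_algorithms = 200, 5
true_theta = rng.normal(size=n_datasets)            # dataset "ability"
true_a = np.array([1.5, 0.8, 2.0, -1.0, 1.2])       # algorithm discrimination
true_b = np.array([-0.5, 0.0, 0.5, 0.0, 1.0])       # algorithm difficulty
X = (rng.random((n_datasets, n_algorithms))
     < irt_2pl(true_theta, true_a, true_b)).astype(float)

# Joint maximum-likelihood fit by gradient ascent on the Bernoulli log-likelihood.
theta = np.zeros(n_datasets)
a = np.ones(n_algorithms)
b = np.zeros(n_algorithms)
lr = 0.1
for _ in range(2000):
    p = irt_2pl(theta, a, b)
    err = X - p                                      # d log-likelihood / d logit
    theta += lr * (err * a[None, :]).mean(axis=1)
    a += lr * (err * (theta[:, None] - b[None, :])).mean(axis=0)
    b += lr * (-err * a[None, :]).mean(axis=0)

# Per the abstract, characteristics such as anomalousness and consistency are
# derived from the fitted discrimination parameters (e.g., a strongly negative
# discrimination would suggest anomalous behaviour).
print("estimated discrimination:", np.round(a, 2))
print("estimated difficulty:    ", np.round(b, 2))
```

In practice an established IRT estimation routine (e.g., marginal maximum likelihood) would replace the hand-rolled fit; the point here is only that an inverted IRT fit exposes per-algorithm difficulty and discrimination parameters without any additional dataset feature computations.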
Related papers
- An Item Response Theory-based R Module for Algorithm Portfolio Analysis [2.8642825441965645]
This paper introduces an Item Response Theory based analysis tool for algorithm portfolio evaluation called AIRT-Module.
Adapting IRT to algorithm evaluation, the AIRT-Module contains a Shiny web application and the R package airt.
The strengths and weaknesses of algorithms are visualised using the difficulty spectrum of the test instances.
arXiv Detail & Related papers (2024-08-26T05:31:46Z) - Unlock the Power of Algorithm Features: A Generalization Analysis for Algorithm Selection [25.29451529910051]
We propose the first provable guarantee for algorithm selection based on algorithm features.
We analyze the benefits and costs associated with algorithm features and investigate how the generalization error is affected by different factors.
arXiv Detail & Related papers (2024-05-18T17:38:25Z) - Neural Active Learning Beyond Bandits [69.99592173038903]
We study both stream-based and pool-based active learning with neural network approximations.
We propose two algorithms based on the newly designed exploitation and exploration neural networks for stream-based and pool-based active learning.
arXiv Detail & Related papers (2024-04-18T21:52:14Z) - Geometry-Aware Approaches for Balancing Performance and Theoretical
Guarantees in Linear Bandits [6.907555940790131]
Thompson sampling and Greedy demonstrate promising empirical performance, yet this contrasts with their pessimistic theoretical regret bounds.
We propose a new data-driven technique that tracks the geometric properties of the uncertainty ellipsoid.
We identify and "course-correct" problem instances in which the base algorithms perform poorly.
arXiv Detail & Related papers (2023-06-26T17:38:45Z) - Representation Learning with Multi-Step Inverse Kinematics: An Efficient
and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z) - An evaluation framework for dimensionality reduction through sectional
curvature [59.40521061783166]
In this work, we aim to introduce the first highly non-supervised dimensionality reduction performance metric.
To test its feasibility, this metric has been used to evaluate the performance of the most commonly used dimension reduction algorithms.
A new parameterized problem instance generator has been constructed in the form of a function generator.
arXiv Detail & Related papers (2023-03-17T11:59:33Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, based on minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - Adaptive Resonance Theory-based Topological Clustering with a Divisive
Hierarchical Structure Capable of Continual Learning [8.581682204722894]
This paper proposes an ART-based topological clustering algorithm with a mechanism that automatically estimates a similarity threshold from a distribution of data points.
To improve information extraction performance, a divisive hierarchical clustering algorithm capable of continual learning is also proposed.
arXiv Detail & Related papers (2022-01-26T02:34:52Z) - Fractal Structure and Generalization Properties of Stochastic
Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded by the "complexity" of the fractal structure that underlies its generalization measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z) - Online Deterministic Annealing for Classification and Clustering [0.0]
We introduce an online prototype-based learning algorithm for clustering and classification.
We show that the proposed algorithm constitutes a competitive-learning neural network, the learning rule of which is formulated as an online approximation algorithm.
arXiv Detail & Related papers (2021-02-11T04:04:21Z) - Active Model Estimation in Markov Decision Processes [108.46146218973189]
We study the problem of efficient exploration in order to learn an accurate model of an environment, modeled as a Markov decision process (MDP).
We show that our Markov-based algorithm outperforms both our original algorithm and the maximum entropy algorithm in the small sample regime.
arXiv Detail & Related papers (2020-03-06T16:17:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.