What is it for a Machine Learning Model to Have a Capability?
- URL: http://arxiv.org/abs/2405.08989v1
- Date: Tue, 14 May 2024 23:03:52 GMT
- Title: What is it for a Machine Learning Model to Have a Capability?
- Authors: Jacqueline Harding, Nathaniel Sharadin
- Abstract summary: We develop an account of machine learning models' capabilities which can be usefully applied to the nascent science of model evaluation.
Our core proposal is a conditional analysis of model abilities (CAMA): crudely, a machine learning model has a capability to X just when it would reliably succeed at doing X if it 'tried'.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: What can contemporary machine learning (ML) models do? Given the proliferation of ML models in society, answering this question matters to a variety of stakeholders, both public and private. The evaluation of models' capabilities is rapidly emerging as a key subfield of modern ML, buoyed by regulatory attention and government grants. Despite this, the notion of an ML model possessing a capability has not been interrogated: what are we saying when we say that a model is able to do something? And what sorts of evidence bear upon this question? In this paper, we aim to answer these questions, using the capabilities of large language models (LLMs) as a running example. Drawing on the large philosophical literature on abilities, we develop an account of ML models' capabilities which can be usefully applied to the nascent science of model evaluation. Our core proposal is a conditional analysis of model abilities (CAMA): crudely, a machine learning model has a capability to X just when it would reliably succeed at doing X if it 'tried'. The main contribution of the paper is making this proposal precise in the context of ML, resulting in an operationalisation of CAMA applicable to LLMs. We then put CAMA to work, showing that it can help make sense of various features of ML model evaluation practice, as well as suggest procedures for performing fair inter-model comparisons.
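To make the proposal concrete, here is a minimal, hypothetical sketch of how CAMA might be operationalised in an evaluation harness: the model 'tries' under each of several elicitation strategies (prompt formats, few-shot exemplars, and so on), and counts as having the capability when its best-elicited success rate clears a reliability threshold. The function names, the elicitation interface, and the 0.9 threshold are all illustrative assumptions, not the paper's formal construction.

```python
import statistics
from typing import Callable, Iterable, Sequence

def has_capability(
    model: Callable[[str], str],                   # hypothetical: prompt -> output
    tasks: Sequence[tuple[str, str]],              # (task input, reference answer)
    elicitations: Iterable[Callable[[str], str]],  # different ways of 'trying'
    is_success: Callable[[str, str], bool],        # (output, reference) -> success?
    threshold: float = 0.9,                        # illustrative reliability bar
) -> tuple[bool, float]:
    """Sketch of CAMA: the model has the capability iff, under its best
    elicitation strategy, it would reliably succeed at the task."""
    best_rate = 0.0
    for elicit in elicitations:
        successes = [is_success(model(elicit(x)), y) for x, y in tasks]
        best_rate = max(best_rate, statistics.mean(successes))
    return best_rate >= threshold, best_rate
```

On this picture, a failure of elicitation (say, a poorly chosen prompt) does not count against the capability itself, which is the distinction CAMA is meant to capture.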
Related papers
- LLAVADI: What Matters For Multimodal Large Language Models Distillation [77.73964744238519]
In this work, we do not propose a new efficient model structure or train small-scale MLLMs from scratch.
Our studies involve training strategies, model choices, and distillation algorithms in the knowledge distillation process.
With a proper strategy, evaluated across different benchmarks, even a 2.7B small-scale model can perform on par with larger models of 7B or 13B parameters.
arXiv Detail & Related papers (2024-07-28T06:10:47Z)
- Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation [0.0]
We compare the capabilities of the most powerful MLLMs to date (ShareGPT4V, ChatGPT, LLaVA-Next) on the specialized task of age and gender estimation against our state-of-the-art specialized model, MiVOLO.
This comparison has yielded some interesting results and insights about the strengths and weaknesses of the participating models.
arXiv Detail & Related papers (2024-03-04T18:32:12Z)
- MetaVL: Transferring In-Context Learning Ability From Language Models to Vision-Language Models [74.89629463600978]
In the vision-language domain, most large-scale pre-trained vision-language models do not possess the ability to perform in-context learning.
In this paper, we study an interesting hypothesis: can we transfer the in-context learning ability from the language domain to the vision domain?
arXiv Detail & Related papers (2023-06-02T07:21:03Z)
- Specializing Smaller Language Models towards Multi-Step Reasoning [56.78474185485288]
We show that abilities can be distilled down from GPT-3.5 ($\ge$ 175B) to T5 variants ($\le$ 11B).
We propose model specialization, which focuses the model's ability on a target task.
arXiv Detail & Related papers (2023-01-30T08:51:19Z)
- Rethinking and Recomputing the Value of ML Models [28.80821411530123]
We argue that the way we have been training and evaluating ML models has largely forgotten that they are applied in an organizational or societal context.
We show that with this perspective we fundamentally change how we evaluate, select and deploy ML models.
arXiv Detail & Related papers (2022-09-30T01:02:31Z)
- The games we play: critical complexity improves machine learning [0.0]
We argue that best practice in Machine Learning should be more consistent with critical complexity perspectives than with rationalist, grand narratives.
We identify thirteen 'games' played in the ML community that lend false legitimacy to models, contribute to over-promise and hype about the capabilities of artificial intelligence, and lead to models that exacerbate inequality and cause discrimination.
arXiv Detail & Related papers (2022-05-18T13:37:22Z)
- The Need for Interpretable Features: Motivation and Taxonomy [69.07189753428553]
We claim that the term "interpretable feature" is neither specific nor detailed enough to capture the full extent to which features impact the usefulness of machine learning explanations.
In this paper, we motivate and discuss three key lessons: 1) more attention should be given to what we refer to as the interpretable feature space, or the state of features that are useful to domain experts taking real-world actions.
arXiv Detail & Related papers (2022-02-23T19:19:14Z)
- What do we expect from Multiple-choice QA Systems? [70.86513724662302]
We consider a top performing model on several Multiple Choice Question Answering (MCQA) datasets.
We evaluate it against a set of expectations one might have of such a model, using a series of zero-information perturbations of the model's inputs (one such perturbation is sketched below).
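A minimal sketch of the kind of check such perturbations enable, assuming a hypothetical dict-based MCQA example format with "question", "options", and "answer" keys; the paper's actual perturbation suite is not reproduced here.

```python
def blank_question(example: dict) -> dict:
    """One illustrative zero-information perturbation: remove the question
    text entirely while keeping the answer options unchanged."""
    perturbed = dict(example)
    perturbed["question"] = ""
    return perturbed

def artefact_gap(model, dataset: list[dict]) -> float:
    """Accuracy on question-blanked inputs minus chance accuracy.
    A clearly positive gap suggests the model is exploiting option-only
    artefacts rather than answering from the question."""
    correct = sum(model(blank_question(ex)) == ex["answer"] for ex in dataset)
    chance = 1 / len(dataset[0]["options"])
    return correct / len(dataset) - chance
```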
arXiv Detail & Related papers (2020-11-20T21:27:10Z)
- Insights into Performance Fitness and Error Metrics for Machine Learning [1.827510863075184]
Machine learning (ML) is the field of training machines to achieve a high level of cognition and perform human-like analysis.
This paper examines a number of the most commonly used performance fitness and error metrics for regression and classification algorithms (a few textbook examples are sketched below).
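The following are textbook definitions of a few such metrics, written as plain-Python sketches; these are standard formulas, not the paper's specific selection or notation.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: penalises large residuals more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def accuracy(y_true, y_pred):
    """Classification accuracy: fraction of exactly matching labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0
```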
arXiv Detail & Related papers (2020-05-17T22:59:04Z)
- An Information-Theoretic Approach to Personalized Explainable Machine Learning [92.53970625312665]
We propose a simple probabilistic model for the predictions and user knowledge.
We quantify the effect of an explanation by the conditional mutual information between the explanation and the prediction (the generic definition is given below).
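For reference, this kind of measure instantiates the standard definition of conditional mutual information; with $u$ the user's knowledge, $e$ the explanation, and $\hat{y}$ the prediction (notation here is generic, not necessarily the paper's):

$$ I(e; \hat{y} \mid u) = \mathbb{E}\!\left[\log \frac{p(e, \hat{y} \mid u)}{p(e \mid u)\, p(\hat{y} \mid u)}\right] $$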
arXiv Detail & Related papers (2020-03-01T13:06:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.