The Efficiency Misnomer
- URL: http://arxiv.org/abs/2110.12894v1
- Date: Mon, 25 Oct 2021 12:48:07 GMT
- Title: The Efficiency Misnomer
- Authors: Mostafa Dehghani, Anurag Arnab, Lucas Beyer, Ashish Vaswani, and Yi Tay
- Abstract summary: We discuss common cost indicators, their advantages and disadvantages, and how they can contradict each other.
We demonstrate how incomplete reporting of cost indicators can lead to partial conclusions and a blurred or incomplete picture of the practical considerations of different models.
- Score: 50.69516433266469
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model efficiency is a critical aspect of developing and deploying machine
learning models. Inference time and latency directly affect the user
experience, and some applications have hard requirements. In addition to
inference costs, model training also has direct financial and environmental
impacts. Although there are numerous well-established metrics (cost indicators)
for measuring model efficiency, researchers and practitioners often assume that
these metrics are correlated with each other and report only a few of them. In
this paper, we thoroughly discuss common cost indicators, their advantages and
disadvantages, and how they can contradict each other. We demonstrate how
incomplete reporting of cost indicators can lead to partial conclusions and a
blurred or incomplete picture of the practical considerations of different
models. We further present suggestions to improve reporting of efficiency
metrics.
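To make the contrast between cost indicators concrete, here is a minimal, hypothetical sketch (not taken from the paper; the toy wide/deep models and all helper names are assumptions chosen for illustration) that computes three commonly reported indicators in PyTorch: trainable-parameter count, an analytic FLOP estimate, and measured wall-clock latency.

```python
import time
import torch
import torch.nn as nn

# Two toy models, chosen (as an assumption for illustration) so that their
# parameter counts are nearly identical while their shapes differ:
# a wide, shallow MLP vs. a narrow, deep MLP.
wide = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024))
deep = nn.Sequential(*[m for _ in range(8)
                       for m in (nn.Linear(1024, 1024), nn.ReLU())])

def param_count(model: nn.Module) -> int:
    # Cost indicator 1: number of trainable parameters.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

def linear_flops(model: nn.Module, batch_size: int) -> int:
    # Cost indicator 2: analytic FLOP estimate (multiply-adds) for Linear
    # layers only; real models would need a profiler such as fvcore or ptflops.
    return sum(2 * m.in_features * m.out_features * batch_size
               for m in model.modules() if isinstance(m, nn.Linear))

@torch.no_grad()
def latency_ms(model: nn.Module, batch: torch.Tensor, iters: int = 50) -> float:
    # Cost indicator 3: measured wall-clock time per forward pass. Unlike the
    # two counts above, this depends on hardware, batch size, and software stack.
    model.eval()
    for _ in range(5):  # warm-up
        model(batch)
    start = time.perf_counter()
    for _ in range(iters):
        model(batch)
    return (time.perf_counter() - start) / iters * 1e3

x = torch.randn(32, 1024)
for name, m in (("wide", wide), ("deep", deep)):
    print(f"{name}: params={param_count(m):,} "
          f"flops/batch~{linear_flops(m, x.shape[0]):,} "
          f"latency={latency_ms(m, x):.2f} ms")
```

In this sketch the two toy models report nearly identical parameter counts and FLOP estimates, yet their measured latency can differ because one stacks more sequential layers; which model looks "more efficient" therefore depends on which indicator is reported, which is the kind of disagreement the paper argues should be made explicit.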
Related papers
- MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate [24.92465108034783]
Large Language Models (LLMs) have shown exceptional results on current benchmarks when working individually.
The advancement in their capabilities, along with a reduction in parameter size and inference times, has facilitated the use of these models as agents.
We evaluate the behavior of a network of models collaborating through debate under the influence of an adversary.
arXiv Detail & Related papers (2024-06-20T20:09:37Z)
- PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning [49.60634126342945]
Counterfactually Augmented Data (CAD) involves creating new data samples by applying minimal yet sufficient modifications to flip the label of existing data samples to other classes.
Recent research reveals that training with CAD may lead models to overly focus on modified features while ignoring other important contextual information.
We employ contrastive learning to promote global feature alignment in addition to learning counterfactual clues.
arXiv Detail & Related papers (2024-06-09T07:29:55Z)
- Causal Fair Metric: Bridging Causality, Individual Fairness, and Adversarial Robustness [7.246701762489971]
Adversarial perturbation, used to identify vulnerabilities in models, and individual fairness, aiming for equitable treatment of similar individuals, both depend on metrics to generate comparable input data instances.
Previous attempts to define such joint metrics often lack general assumptions about data or structural causal models and fail to reflect counterfactual proximity.
This paper introduces a causal fair metric formulated based on causal structures encompassing sensitive attributes and protected causal perturbation.
arXiv Detail & Related papers (2023-10-30T09:53:42Z)
- On the Trade-offs between Adversarial Robustness and Actionable Explanations [32.05150063480917]
We make one of the first attempts at studying the impact of adversarially robust models on actionable explanations.
We derive theoretical bounds on the differences between the cost and the validity of recourses generated by state-of-the-art algorithms.
Our results show that adversarially robust models significantly increase the cost and reduce the validity of the resulting recourses.
arXiv Detail & Related papers (2023-09-28T13:59:50Z)
- Striving for data-model efficiency: Identifying data externalities on group performance [75.17591306911015]
Building trustworthy, effective, and responsible machine learning systems hinges on understanding how differences in training data and modeling decisions interact to impact predictive performance.
We focus on a particular type of data-model inefficiency, in which adding training data from some sources can actually lower performance evaluated on key sub-groups of the population.
Our results indicate that data-efficiency is a key component of both accurate and trustworthy machine learning.
arXiv Detail & Related papers (2022-11-11T16:48:27Z)
- Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors [105.12462629663757]
In this work, we aggregate factuality error annotations from nine existing datasets and stratify them according to the underlying summarization model.
We compare performance of state-of-the-art factuality metrics, including recent ChatGPT-based metrics, on this stratified benchmark and show that their performance varies significantly across different types of summarization models.
arXiv Detail & Related papers (2022-05-25T15:26:48Z)
- Individual Explanations in Machine Learning Models: A Survey for Practitioners [69.02688684221265]
The use of sophisticated statistical models that influence decisions in domains of high societal relevance is on the rise.
Many governments, institutions, and companies are reluctant to adopt them, as their output is often difficult to explain in human-interpretable ways.
Recently, the academic literature has proposed a substantial number of methods for providing interpretable explanations of machine learning models.
arXiv Detail & Related papers (2021-04-09T01:46:34Z)
- MixKD: Towards Efficient Distillation of Large-scale Language Models [129.73786264834894]
We propose MixKD, a data-agnostic distillation framework, to endow the resulting model with stronger generalization ability.
We prove from a theoretical perspective that under reasonable conditions MixKD gives rise to a smaller gap between the generalization error and the empirical error.
Experiments under a limited-data setting and ablation studies further demonstrate the advantages of the proposed approach.
arXiv Detail & Related papers (2020-11-01T18:47:51Z)
- Metrics for Benchmarking and Uncertainty Quantification: Quality, Applicability, and a Path to Best Practices for Machine Learning in Chemistry [0.0]
This review aims to draw attention to two issues of concern when we set out to make machine learning benchmarking work in the chemical and materials domain.
Both are often overlooked or underappreciated topics, as chemists typically have only limited training in statistics.
These metrics are also key to comparing the performance of different models and thus for developing guidelines and best practices for the successful application of machine learning in chemistry.
arXiv Detail & Related papers (2020-09-30T21:19:17Z)