Understanding the Usability Challenges of Machine Learning In
High-Stakes Decision Making
- URL: http://arxiv.org/abs/2103.02071v1
- Date: Tue, 2 Mar 2021 22:50:45 GMT
- Title: Understanding the Usability Challenges of Machine Learning In
High-Stakes Decision Making
- Authors: Alexandra Zytek, Dongyu Liu, Rhema Vaithianathan, and Kalyan
Veeramachaneni
- Abstract summary: Machine learning (ML) is being applied to a diverse and ever-growing set of domains.
In many cases, domain experts -- who often have no expertise in ML or data science -- are asked to use ML predictions to make high-stakes decisions.
We investigate the ML usability challenges present in the domain of child welfare screening through a series of collaborations with child welfare screeners.
- Score: 67.72855777115772
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning (ML) is being applied to a diverse and ever-growing set of
domains. In many cases, domain experts -- who often have no expertise in ML or
data science -- are asked to use ML predictions to make high-stakes decisions.
Multiple ML usability challenges can appear as a result, such as lack of user
trust in the model, inability to reconcile human-ML disagreement, and ethical
concerns about oversimplification of complex problems to a single algorithm
output. In this paper, we investigate the ML usability challenges present in
the domain of child welfare screening through a series of collaborations with
child welfare screeners, which included field observations, interviews, and a
formal user study. Through our collaborations, we identified four key ML
challenges, and homed in on one promising ML augmentation tool to address them
(local factor contributions). We also composed a list of design considerations
to be taken into account when developing future augmentation tools for child
welfare screeners and similar domain experts.
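To make the "local factor contributions" idea concrete, here is a minimal sketch of the kind of per-case display such a tool might show a screener. This is not the tool studied in the paper: the model, the synthetic data, and the feature names (prior_referrals, caregiver_age, days_since_last_contact) are all hypothetical, and contributions are computed for a linear model as coefficient times the feature's deviation from the training mean.

```python
# Minimal sketch of local factor contributions (not the paper's tool).
# Model, data, and feature names are hypothetical/synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
feature_names = ["prior_referrals", "caregiver_age", "days_since_last_contact"]
X = rng.normal(size=(500, 3))
y = (X @ np.array([1.2, -0.4, 0.7]) + rng.normal(size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)

def local_contributions(model, X_train, x):
    """Contribution of each feature to one case's score, relative to the
    average case: coefficient * (feature value - training mean)."""
    return model.coef_[0] * (x - X_train.mean(axis=0))

case = X[0]
for name, c in zip(feature_names, local_contributions(model, X, case)):
    print(f"{name:>24s}: {c:+.3f}")
```

A signed list of this kind, one line per factor for the case at hand, is the style of augmentation the abstract identifies as promising for reconciling human-ML disagreement.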
Related papers
- ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection [60.297079601066784]
We introduce ErrorRadar, the first benchmark designed to assess MLLMs' capabilities in error detection.
ErrorRadar evaluates two sub-tasks: error step identification and error categorization.
It consists of 2,500 high-quality multimodal K-12 mathematical problems, collected from real-world student interactions.
Results indicate that significant challenges remain: even the best-performing model, GPT-4o, still trails human evaluation by around 10%.
arXiv Detail & Related papers (2024-10-06T14:59:09Z)
- Maintainability Challenges in ML: A Systematic Literature Review [5.669063174637433]
This study aims to identify and synthesise the maintainability challenges in different stages of the Machine Learning workflow.
We screened more than 13,000 papers, then selected and qualitatively analysed 56 of them.
arXiv Detail & Related papers (2024-08-17T13:24:15Z)
- MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains [54.117238759317004]
The Massive Multitask Agent Understanding (MMAU) benchmark features comprehensive offline tasks that eliminate the need for complex environment setups.
It evaluates models across five domains: Tool-use, Directed Acyclic Graph (DAG) QA, Data Science and Machine Learning coding, Contest-level programming, and Mathematics.
With a total of 20 meticulously designed tasks encompassing over 3K distinct prompts, MMAU provides a comprehensive framework for evaluating the strengths and limitations of LLM agents.
arXiv Detail & Related papers (2024-07-18T00:58:41Z)
- MacGyver: Are Large Language Models Creative Problem Solvers? [87.70522322728581]
We explore the creative problem-solving capabilities of modern LLMs in a novel constrained setting.
We create MACGYVER, an automatically generated dataset consisting of over 1,600 real-world problems.
We present our collection to both LLMs and humans to compare and contrast their problem-solving abilities.
arXiv Detail & Related papers (2023-11-16T08:52:27Z)
- MLCopilot: Unleashing the Power of Large Language Models in Solving Machine Learning Tasks [31.733088105662876]
We aim to bridge the gap between machine intelligence and human knowledge by introducing a novel framework.
We showcase the possibility of extending the capability of LLMs to comprehend structured inputs and perform thorough reasoning for solving novel ML tasks.
arXiv Detail & Related papers (2023-04-28T17:03:57Z)
- Interpretability and accessibility of machine learning in selected food processing, agriculture and health applications [0.0]
Lack of interpretability of ML-based systems is a major hindrance to widespread adoption of these powerful algorithms.
New techniques are emerging to improve ML accessibility through automated model design.
This paper provides a review of the work done to improve interpretability and accessibility of machine learning in the context of global problems.
arXiv Detail & Related papers (2022-11-30T02:44:13Z)
- MLPro: A System for Hosting Crowdsourced Machine Learning Challenges for Open-Ended Research Problems [1.3254304182988286]
We develop a system which combines the notion of open-ended ML coding problems with the concept of an automatic online code judging platform.
We find that for sufficiently unconstrained and complex problems, many experts submit similar solutions, but some experts provide unique solutions which outperform the "typical" solution class.
arXiv Detail & Related papers (2022-04-04T02:56:12Z)
- Panoramic Learning with A Standardized Machine Learning Formalism [116.34627789412102]
This paper presents a standardized equation of the learning objective that offers a unifying understanding of diverse ML algorithms (a schematic form is sketched after this list).
It also provides guidance for the mechanical design of new ML solutions, and serves as a promising vehicle towards panoramic learning with all experiences.
arXiv Detail & Related papers (2021-08-17T17:44:38Z)
- Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert (a simplified proxy for this idea is sketched after this list).
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z)
- Machine Learning Towards Intelligent Systems: Applications, Challenges, and Opportunities [8.68311678910946]
Machine learning (ML) provides a mechanism for humans to process large amounts of data.
This review focuses on some of the fields and applications such as education, healthcare, network security, banking and finance, and social media.
arXiv Detail & Related papers (2021-01-11T01:32:15Z)
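For the Panoramic Learning entry above, the "standardized equation" can be sketched as a variational objective balancing an experience term, an entropy term, and a divergence to the model. The form below is a paraphrase for illustration, with symbols of our choosing, not necessarily the paper's exact notation:

```latex
% Schematic standardized learning objective (paraphrase, not the paper's
% exact notation). q: auxiliary distribution over target t; p_\theta: model;
% f: "experience" function (data, rules, rewards); \alpha, \beta: weights.
\min_{q,\theta}\;
  -\,\alpha\,\mathbb{H}(q)
  \;+\;\beta\,\mathbb{D}\!\left(q(t)\,\|\,p_\theta(t)\right)
  \;-\;\mathbb{E}_{q(t)}\!\left[f(t)\right]
```

Different choices of the experience function f and the weights recover different families of algorithms, which is what makes the single equation a unifying lens.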
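For the expert-consistency entry above, a much simpler proxy than the paper's influence-function estimator can illustrate the underlying idea: when each case is assessed by a single expert, inter-rater agreement is unobservable directly, so one can instead measure how predictable expert decisions are from case features. Everything below (data, features, model choice) is synthetic and hypothetical:

```python
# Simplified proxy for "expert consistency" (NOT the paper's influence-function
# estimator). If each case is screened by a single expert, we cannot observe
# inter-rater agreement, so we measure how predictable expert decisions are
# from case features via cross-validation. All data here is synthetic.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))            # hypothetical case features
signal = X[:, 0] - 0.5 * X[:, 2]          # what experts (noisily) respond to
noise = rng.normal(scale=1.0, size=1000)  # idiosyncratic expert variation
decisions = (signal + noise > 0).astype(int)

# Cross-validated accuracy of predicting decisions: higher values suggest
# experts respond consistently to the observed features.
scores = cross_val_score(GradientBoostingClassifier(), X, decisions, cv=5)
print(f"decision predictability (consistency proxy): {scores.mean():.2f}")
```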
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.