Are We Closing the Loop Yet? Gaps in the Generalizability of VIS4ML
Research
- URL: http://arxiv.org/abs/2308.06290v1
- Date: Thu, 10 Aug 2023 21:44:48 GMT
- Title: Are We Closing the Loop Yet? Gaps in the Generalizability of VIS4ML
Research
- Authors: Hariharan Subramonyam, Jessica Hullman
- Abstract summary: We survey recent VIS4ML papers to assess the generalizability of research contributions and claims in enabling human-in-the-loop ML.
Our results show potential gaps between the current scope of VIS4ML research and aspirations for its use in practice.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visualization for machine learning (VIS4ML) research aims to help experts
apply their prior knowledge to develop, understand, and improve the performance
of machine learning models. In conceiving VIS4ML systems, researchers
characterize the nature of human knowledge to support human-in-the-loop tasks,
design interactive visualizations to make ML components interpretable and
elicit knowledge, and evaluate the effectiveness of human-model interchange. We
survey recent VIS4ML papers to assess the generalizability of research
contributions and claims in enabling human-in-the-loop ML. Our results show
potential gaps between the current scope of VIS4ML research and aspirations for
its use in practice. We find that while papers motivate that VIS4ML systems are
applicable beyond the specific conditions studied, conclusions are often
overfitted to non-representative scenarios, are based on interactions with a
small set of ML experts and well-understood datasets, fail to acknowledge
crucial dependencies, and hinge on decisions that lack justification. We
discuss approaches to close the gap between aspirations and research claims and
suggest documentation practices to report generality constraints that better
acknowledge the exploratory nature of VIS4ML research.
Related papers
- Evaluating Interventional Reasoning Capabilities of Large Language Models [58.52919374786108]
Large language models (LLMs) can estimate causal effects under interventions on different parts of a system.
We conduct empirical analyses to evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention.
We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types, and enable a study of intervention-based reasoning.
arXiv Detail & Related papers (2024-04-08T14:15:56Z)
- Effectiveness Assessment of Recent Large Vision-Language Models [78.69439393646554]
This paper endeavors to evaluate the competency of popular large vision-language models (LVLMs) in specialized and general tasks.
We employ six challenging tasks in three different application scenarios: natural, healthcare, and industrial.
We examine the performance of three recent open-source LVLMs, including MiniGPT-v2, LLaVA-1.5, and Shikra, on both visual recognition and localization in these tasks.
arXiv Detail & Related papers (2024-03-07T08:25:27Z)
- Quantitative knowledge retrieval from large language models [4.155711233354597]
Large language models (LLMs) have been extensively studied for their abilities to generate convincing natural language sequences.
This paper explores the feasibility of LLMs as a mechanism for quantitative knowledge retrieval to aid data analysis tasks.
arXiv Detail & Related papers (2024-02-12T16:32:37Z)
- Exploring the Cognitive Knowledge Structure of Large Language Models: An Educational Diagnostic Assessment Approach [50.125704610228254]
Large Language Models (LLMs) have not only exhibited exceptional performance across various tasks, but also demonstrated sparks of intelligence.
Recent studies have focused on assessing their capabilities on human exams and revealed their impressive competence in different domains.
We conduct an evaluation using MoocRadar, a meticulously annotated human test dataset based on Bloom's taxonomy.
arXiv Detail & Related papers (2023-10-12T09:55:45Z)
- ExpeL: LLM Agents Are Experiential Learners [60.54312035818746]
We introduce the Experiential Learning (ExpeL) agent to allow learning from agent experiences without requiring parametric updates.
Our agent autonomously gathers experiences and extracts knowledge using natural language from a collection of training tasks.
At inference, the agent recalls its extracted insights and past experiences to make informed decisions.
arXiv Detail & Related papers (2023-08-20T03:03:34Z)
- Metacognitive Prompting Improves Understanding in Large Language Models [12.112914393948415]
We introduce Metacognitive Prompting (MP), a strategy inspired by human introspective reasoning processes.
We conduct experiments on four prevalent Large Language Models (LLMs) across ten natural language understanding (NLU) datasets.
MP consistently outperforms existing prompting methods in both general and domain-specific NLU tasks.
arXiv Detail & Related papers (2023-08-10T05:10:17Z)
- Visual Analytics For Machine Learning: A Data Perspective Survey [17.19676876329529]
This survey focuses on summarizing VIS4ML works from the data perspective.
First, we categorize the common data handled by ML models into five types, explain the unique features of each type, and highlight the corresponding ML models that are good at learning from them.
Second, from the large number of VIS4ML works, we tease out six tasks that operate on these types of data at different stages of the ML pipeline to understand, diagnose, and refine ML models.
arXiv Detail & Related papers (2023-07-15T05:13:06Z)
- LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities [68.86209486449924]
This paper evaluates Large Language Models (LLMs) for Knowledge Graph (KG) construction and reasoning.
We propose AutoKG, a multi-agent-based approach employing LLMs and external sources for KG construction and reasoning.
arXiv Detail & Related papers (2023-05-22T15:56:44Z)
- The State of the Art in Enhancing Trust in Machine Learning Models with the Use of Visualizations [0.0]
Machine learning (ML) models are nowadays used in complex applications in various domains, such as medicine, bioinformatics, and other sciences.
Due to their black box nature, however, it may sometimes be hard to understand and trust the results they provide.
This has increased the demand for reliable visualization tools related to enhancing trust in ML models.
We present a State-of-the-Art Report (STAR) on enhancing trust in ML models with the use of interactive visualization.
arXiv Detail & Related papers (2022-12-22T14:29:43Z)
- Interactive Machine Learning: A State of the Art Review [0.0]
We provide a comprehensive analysis of the state of the art of interactive machine learning (iML).
Research on adversarial black-box attacks and corresponding iML-based defense systems, exploratory machine learning, resource-constrained learning, and iML performance evaluation is analyzed.
arXiv Detail & Related papers (2022-07-13T13:43:16Z)
- Understanding the Usability Challenges of Machine Learning in High-Stakes Decision Making [67.72855777115772]
Machine learning (ML) is being applied to a diverse and ever-growing set of domains.
In many cases, domain experts -- who often have no expertise in ML or data science -- are asked to use ML predictions to make high-stakes decisions.
We investigate the ML usability challenges present in the domain of child welfare screening through a series of collaborations with child welfare screeners.
arXiv Detail & Related papers (2021-03-02T22:50:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.