Understanding and Supporting Debugging Workflows in Multiverse Analysis
- URL: http://arxiv.org/abs/2210.03804v3
- Date: Sun, 4 Jun 2023 07:23:56 GMT
- Title: Understanding and Supporting Debugging Workflows in Multiverse Analysis
- Authors: Ken Gu, Eunice Jun, and Tim Althoff
- Abstract summary: Multiverse analysis is a paradigm for statistical analysis that considers all combinations of reasonable analysis choices in parallel.
Recent tools help analysts specify multiverse analyses, but they remain difficult to use in practice.
We develop a command-line interface tool, Multiverse Debugger, which helps diagnose bugs in the multiverse and propagate fixes.
- Score: 12.23386451120784
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multiverse analysis, a paradigm for statistical analysis that considers all
combinations of reasonable analysis choices in parallel, promises to improve
transparency and reproducibility. Although recent tools help analysts specify
multiverse analyses, they remain difficult to use in practice. In this work, we
identify debugging as a key barrier due to the latency from running analyses to
detecting bugs and the scale of metadata processing needed to diagnose a bug.
To address these challenges, we prototype a command-line interface tool,
Multiverse Debugger, which helps diagnose bugs in the multiverse and propagate
fixes. In a qualitative lab study (n=13), we use Multiverse Debugger as a probe
to develop a model of debugging workflows and identify specific challenges,
including difficulty in understanding the multiverse's composition. We conclude
with design implications for future multiverse analysis authoring systems.
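The core idea of a multiverse analysis, running every combination of reasonable analysis choices rather than a single chosen pipeline, can be sketched as a cross product over decision options. The decision names and the `run_analysis` stub below are illustrative assumptions, not the paper's Multiverse Debugger API:

```python
# Toy multiverse: enumerate every combination of analysis decisions.
# Decision names and options are hypothetical examples.
import itertools

decisions = {
    "outlier_rule": ["none", "iqr", "zscore"],
    "transform": ["raw", "log"],
    "model": ["ols", "robust"],
}

def run_analysis(spec):
    # Stand-in for fitting one model under one specification.
    return {"spec": spec, "note": "stand-in result"}

# The "multiverse" is the cross product of all decision options.
universes = [
    dict(zip(decisions, combo))
    for combo in itertools.product(*decisions.values())
]
results = [run_analysis(u) for u in universes]

print(len(universes))  # 3 * 2 * 2 = 12 universes
```

Even this toy example hints at the debugging challenges the paper identifies: a bug in one shared step surfaces in many universes at once, and diagnosing it requires reasoning over the metadata of the whole composition rather than a single run.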
Related papers
- MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration [63.31211701741323]
We extend multi-agent multi-model reasoning to generation, specifically to improving faithfulness through refinement.
We design intrinsic evaluations for each subtask, with our findings indicating that both multi-agent (multiple instances) and multi-model (diverse LLM types) approaches benefit error detection and critiquing.
We consolidate these insights into a final "recipe" called Multi-Agent Multi-Model Refinement (MAMM-Refine), where multi-agent and multi-model collaboration significantly boosts performance.
arXiv Detail & Related papers (2025-03-19T14:46:53Z) - GRITHopper: Decomposition-Free Multi-Hop Dense Retrieval [52.47514434103737]
We introduce GRITHopper-7B, a novel multi-hop dense retrieval model that achieves state-of-the-art performance.
GRITHopper combines generative and representational instruction tuning by integrating causal language modeling with dense retrieval training.
We find that incorporating additional context after the retrieval process, referred to as post-retrieval language modeling, enhances dense retrieval performance.
arXiv Detail & Related papers (2025-03-10T16:42:48Z) - A Multi-Task Learning Approach to Linear Multivariate Forecasting [4.369550829556578]
Recent state-of-the-art works ignore the inter-relations between divisions, applying their model to each division independently.
We propose to view multivariate forecasting as a multi-task learning problem, facilitating the analysis of forecasting.
We evaluate our approach on challenging benchmarks in comparison to strong baselines, and we show it obtains on-par or better results.
arXiv Detail & Related papers (2025-02-05T19:34:23Z) - ForgerySleuth: Empowering Multimodal Large Language Models for Image Manipulation Detection [107.86009509291581]
We propose ForgerySleuth to perform comprehensive clue fusion and generate segmentation outputs indicating regions that are tampered with.
Our experiments demonstrate the effectiveness of ForgeryAnalysis and show that ForgerySleuth significantly outperforms existing methods in robustness, generalization, and explainability.
arXiv Detail & Related papers (2024-11-29T04:35:18Z) - Leveraging Slither and Interval Analysis to build a Static Analysis Tool [0.0]
This paper presents our progress toward finding defects that are missed, or only partially detected, by state-of-the-art analysis tools.
We developed a working solution built on top of Slither that uses interval analysis to evaluate the contract state during the execution of each instruction.
arXiv Detail & Related papers (2024-10-31T09:28:09Z) - Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges.
We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow.
We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z) - UCB-driven Utility Function Search for Multi-objective Reinforcement Learning [75.11267478778295]
In Multi-objective Reinforcement Learning (MORL), agents are tasked with optimising decision-making behaviours.
We focus on the case of linear utility functions parameterised by weight vectors w.
We introduce a method based on Upper Confidence Bound to efficiently search for the most promising weight vectors during different stages of the learning process.
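The idea of using an Upper Confidence Bound to search over weight vectors can be illustrated by treating each candidate vector as a bandit arm. This is an assumed simplification (plain UCB1 with a deterministic stand-in utility), not the paper's exact algorithm:

```python
# UCB1 over candidate weight vectors, each treated as a bandit "arm".
# The utility function is a deterministic stand-in for the scalarised
# return obtained when training with a given weight vector.
import math

weight_vectors = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
counts = [0] * len(weight_vectors)   # pulls per arm
values = [0.0] * len(weight_vectors) # running mean utility per arm

def utility(w):
    # Hypothetical scalarised return; the second objective pays more.
    return 0.3 * w[0] + 0.7 * w[1]

for t in range(1, 2001):
    # UCB1 score: mean utility plus an exploration bonus that shrinks
    # as an arm is tried more often; untried arms are picked first.
    ucb = [
        values[i] + math.sqrt(2 * math.log(t) / counts[i])
        if counts[i] else float("inf")
        for i in range(len(weight_vectors))
    ]
    i = ucb.index(max(ucb))
    r = utility(weight_vectors[i])
    counts[i] += 1
    values[i] += (r - values[i]) / counts[i]  # incremental mean update

print(counts)  # the highest-utility weight vector is pulled most often
```

The exploration bonus concentrates pulls on the most promising weight vector while still occasionally revisiting the others, which is the behaviour the abstract describes for different stages of learning.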
arXiv Detail & Related papers (2024-05-01T09:34:42Z) - A Unified Debugging Approach via LLM-Based Multi-Agent Synergy [39.11825182386288]
FixAgent is an end-to-end framework for unified debugging through multi-agent synergy.
It significantly outperforms state-of-the-art repair methods, fixing 1.25x to 2.56x more bugs on the repo-level benchmark Defects4J.
arXiv Detail & Related papers (2024-04-26T04:55:35Z) - MultiDimEr: a multi-dimensional bug analyzEr [5.318531077716712]
We categorize and visualize dimensions of bug reports to identify accruing technical debt.
This evidence can serve practitioners and decision makers not only as an argumentative basis for steering improvement efforts, but also as a starting point for root cause analysis.
arXiv Detail & Related papers (2024-02-16T16:00:42Z) - Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency [127.97467912117652]
Large language models (LLMs) have exhibited remarkable ability in code generation.
However, generating the correct solution in a single attempt still remains a challenge.
We propose the Multi-Perspective Self-Consistency (MPSC) framework incorporating both inter- and intra-consistency.
arXiv Detail & Related papers (2023-09-29T14:23:26Z) - The Hitchhiker's Guide to Program Analysis: A Journey with Large Language Models [18.026567399243]
Large Language Models (LLMs) offer a promising alternative to static analysis.
In this paper, we take a deep dive into the open space of LLM-assisted static analysis.
We develop LLift, a fully automated framework that interfaces with both a static analysis tool and an LLM.
arXiv Detail & Related papers (2023-08-01T02:57:43Z) - Improved Compositional Generalization by Generating Demonstrations for Meta-Learning [53.818234285773165]
We show substantially improved performance on a previously unsolved compositional behaviour split without a loss of performance on other splits.
In this case, searching for relevant demonstrations even with an oracle function is not sufficient to attain good performance when using meta-learning.
arXiv Detail & Related papers (2023-05-22T14:58:54Z) - Transformer-based Multi-Aspect Modeling for Multi-Aspect Multi-Sentiment Analysis [56.893393134328996]
We propose a novel Transformer-based Multi-aspect Modeling scheme (TMM), which can capture potential relations between multiple aspects and simultaneously detect the sentiment of all aspects in a sentence.
Our method achieves noticeable improvements compared with strong baselines such as BERT and RoBERTa.
arXiv Detail & Related papers (2020-11-01T11:06:31Z) - Total Deep Variation: A Stable Regularizer for Inverse Problems [71.90933869570914]
We introduce the data-driven general-purpose total deep variation regularizer.
In its core, a convolutional neural network extracts local features on multiple scales and in successive blocks.
We achieve state-of-the-art results for numerous imaging tasks.
arXiv Detail & Related papers (2020-06-15T21:54:15Z) - Multi-view Alignment and Generation in CCA via Consistent Latent Encoding [34.57297855115903]
Multi-view alignment is critical in many real-world multi-view applications.
This paper studies multi-view alignment from the Bayesian perspective.
We present Adversarial CCA (ACCA) which achieves consistent latent encodings.
arXiv Detail & Related papers (2020-05-24T10:50:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed and is not responsible for any consequences of its use.