Related papers: An Empirical Study of Bugs in Data Visualization Libraries

An Empirical Study of Bugs in Data Visualization Libraries

URL: http://arxiv.org/abs/2506.15084v1
Date: Wed, 18 Jun 2025 02:49:09 GMT
Title: An Empirical Study of Bugs in Data Visualization Libraries
Authors: Weiqi Lu, Yongqiang Tian, Xiaohan Zhong, Haoyang Ma, Zhenyang Xu, Shing-Chi Cheung, Chengnian Sun,
Abstract summary: This study presents the first comprehensive analysis of bugs in DataViz libraries, examining 564 bugs collected from five widely-used libraries.<n>We found that incorrect/inaccurate plots are pervasive in DataViz libraries and incorrect graphic computation is the major root cause.<n>We identified eight key steps to trigger such bugs and two test oracles specific to DataViz libraries, which may inspire future research in designing effective automated testing techniques.
Score: 7.75848222829394
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Data visualization (DataViz) libraries play a crucial role in presentation, data analysis, and application development, underscoring the importance of their accuracy in transforming data into visual representations. Incorrect visualizations can adversely impact user experience, distort information conveyance, and influence user perception and decision-making processes. Visual bugs in these libraries can be particularly insidious as they may not cause obvious errors like crashes, but instead mislead users of the underlying data graphically, resulting in wrong decision making. Consequently, a good understanding of the unique characteristics of bugs in DataViz libraries is essential for researchers and developers to detect and fix bugs in DataViz libraries. This study presents the first comprehensive analysis of bugs in DataViz libraries, examining 564 bugs collected from five widely-used libraries. Our study systematically analyzes their symptoms and root causes, and provides a detailed taxonomy. We found that incorrect/inaccurate plots are pervasive in DataViz libraries and incorrect graphic computation is the major root cause, which necessitates further automated testing methods for DataViz libraries. Moreover, we identified eight key steps to trigger such bugs and two test oracles specific to DataViz libraries, which may inspire future research in designing effective automated testing techniques. Furthermore, with the recent advancements in Vision Language Models (VLMs), we explored the feasibility of applying these models to detect incorrect/inaccurate plots. The results show that the effectiveness of VLMs in bug detection varies from 29% to 57%, depending on the prompts, and adding more information in prompts does not necessarily increase the effectiveness. More findings can be found in our manuscript.

Related papers

Subgraph-Oriented Testing for Deep Learning Libraries [9.78188667672054]
We propose SORT (Subgraph-Oriented Realistic Testing) to test Deep Learning (DL) libraries on different hardware platforms.<n>SORT takes popular API interaction patterns, represented as frequent subgraphs of model graphs, as test subjects.<n>SORT achieves a 100% valid input generation rate, detects more precision bugs than existing methods, and reveals interaction-related bugs missed by single-API testing.
arXiv Detail & Related papers (2024-12-09T12:10:48Z)
An Empirical Study of API Misuses of Data-Centric Libraries [9.667988837321943]
This paper contributes an empirical study of API misuses of five data-centric libraries that cover areas such as data processing, numerical computation, machine learning, and visualization. We identify misuses of these libraries by analyzing data from both Stack Overflow and GitHub.
arXiv Detail & Related papers (2024-08-28T15:15:52Z)
VDebugger: Harnessing Execution Feedback for Debugging Visual Programs [103.61860743476933]
We introduce V Debugger, a critic-refiner framework trained to localize and debug visual programs by tracking execution step by step. V Debugger identifies and corrects program errors leveraging detailed execution feedback, improving interpretability and accuracy. Evaluations on six datasets demonstrate V Debugger's effectiveness, showing performance improvements of up to 3.2% in downstream task accuracy.
arXiv Detail & Related papers (2024-06-19T11:09:16Z)
GraphGuard: Detecting and Counteracting Training Data Misuse in Graph Neural Networks [69.97213941893351]
The emergence of Graph Neural Networks (GNNs) in graph data analysis has raised critical concerns about data misuse during model training. Existing methodologies address either data misuse detection or mitigation, and are primarily designed for local GNN models. This paper introduces a pioneering approach called GraphGuard, to tackle these challenges.
arXiv Detail & Related papers (2023-12-13T02:59:37Z)
Software issues report for bug fixing process: An empirical study of machine-learning libraries [0.0]
We investigated the effectiveness of issue resolution for bug-fixing processes in six machine-learning libraries. The most common categories of issues that arise in machine-learning libraries are bugs, documentation, optimization, crashes, enhancement, new feature requests, build/CI, support, and performance. This study concludes that efficient issue-tracking processes, effective communication, and collaboration are vital for effective resolution of issues and bug fixing processes in machine-learning libraries.
arXiv Detail & Related papers (2023-12-10T21:33:19Z)
On Using GUI Interaction Data to Improve Text Retrieval-based Bug Localization [10.717184444794505]
We investigate the hypothesis that, for end user-facing applications, connecting information in a bug report with information from the GUI, can improve upon existing techniques for bug localization. We source the current largest dataset of fully-localized and reproducible real bugs for Android apps, with corresponding bug reports.
arXiv Detail & Related papers (2023-10-12T07:14:22Z)
An Empirical Study on Bugs Inside PyTorch: A Replication Study [10.848682558737494]
We characterize bugs in the PyTorch library, a very popular deep learning framework. Our results highlight that PyTorch bugs are more like traditional software projects bugs, than related to deep learning characteristics.
arXiv Detail & Related papers (2023-07-25T19:23:55Z)
Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future [63.99570204416711]
We reimplement 18 methods for detecting potential annotation errors and evaluate them on 9 English datasets. We define a uniform evaluation setup including a new formalization of the annotation error detection task. We release our datasets and implementations in an easy-to-use and open source software package.
arXiv Detail & Related papers (2022-06-05T22:31:45Z)
Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors [105.12462629663757]
In this work, we aggregate factuality error annotations from nine existing datasets and stratify them according to the underlying summarization model. We compare performance of state-of-the-art factuality metrics, including recent ChatGPT-based metrics, on this stratified benchmark and show that their performance varies significantly across different types of summarization models.
arXiv Detail & Related papers (2022-05-25T15:26:48Z)
DapStep: Deep Assignee Prediction for Stack Trace Error rePresentation [61.99379022383108]
We propose new deep learning models to solve the bug triage problem. The models are based on a bidirectional recurrent neural network with attention and on a convolutional neural network. To improve the quality of ranking, we propose using additional information from version control system annotations.
arXiv Detail & Related papers (2022-01-14T00:16:57Z)
Hidden Biases in Unreliable News Detection Datasets [60.71991809782698]
We show that selection bias during data collection leads to undesired artifacts in the datasets. We observed a significant drop (>10%) in accuracy for all models tested in a clean split with no train/test source overlap. We suggest future dataset creation include a simple model as a difficulty/bias probe and future model development use a clean non-overlapping site and date split.
arXiv Detail & Related papers (2021-04-20T17:16:41Z)
Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious. We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.