Issues and Their Causes in WebAssembly Applications: An Empirical Study
- URL: http://arxiv.org/abs/2311.00646v2
- Date: Tue, 9 Apr 2024 16:56:26 GMT
- Title: Issues and Their Causes in WebAssembly Applications: An Empirical Study
- Authors: Muhammad Waseem, Teerath Das, Aakash Ahmad, Peng Liang, Tommi Mikkonen
- Abstract summary: WebAssembly (Wasm) is a binary instruction format designed for secure and efficient execution within sandboxed environments.
In recent years, Wasm has gained significant attention from the academic research community and industrial development projects.
Despite the offered benefits, developers encounter a multitude of issues rooted in Wasm.
- Score: 5.518217604591736
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: WebAssembly (Wasm) is a binary instruction format designed for secure and efficient execution within sandboxed environments -- predominantly web apps and browsers -- to facilitate performance, security, and flexibility of web programming languages. In recent years, Wasm has gained significant attention from the academic research community and industrial development projects to engineer high-performance web applications. Despite the offered benefits, developers encounter a multitude of issues rooted in Wasm (e.g., faults, errors, failures) and are often unaware of their root causes that impact the development of web applications. To this end, we conducted an empirical study that mines and documents practitioners' knowledge expressed as 385 issues from 12 open-source Wasm projects deployed on GitHub and 354 question-answer posts via Stack Overflow. Overall, we identified 120 types of issues, which were categorized into 19 subcategories and 9 categories to create a taxonomical classification of issues encountered in Wasm-based applications. Furthermore, root cause analysis of the issues helped us identify 278 types of causes, which have been categorized into 29 subcategories and 10 categories as a taxonomy of causes. Our study led to first-of-its-kind taxonomies of the issues faced by developers and their underlying causes in Wasm-based applications. The issue-cause taxonomies -- identified from GitHub and SO, offering empirically derived guidelines -- can guide researchers and practitioners to design, develop, and refactor Wasm-based applications.
Related papers
- MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models [50.43793764203352]
We introduce MDK12-Bench, a multi-disciplinary benchmark assessing the reasoning capabilities of MLLMs via real-world K-12 examinations.
Our benchmark comprises 140K reasoning instances across diverse difficulty levels from primary school to 12th grade.
It features 6,827 instance-level knowledge point annotations based on a well-organized knowledge structure, detailed answer explanations, difficulty labels and cross-year partitions.
arXiv Detail & Related papers (2025-04-08T08:06:53Z)
- The Promise and Pitfalls of WebAssembly: Perspectives from the Industry [26.4248246220256]
WebAssembly (Wasm) was proposed in 2017 and is regarded as a complement to JavaScript.
No prior work has conducted a large-scale measurement study of Wasm binaries adopted in the wild.
We collect the largest-ever dataset to characterize their status quo from industry perspectives.
arXiv Detail & Related papers (2025-03-27T08:01:22Z)
- An Empirical Investigation on the Challenges in Scientific Workflow Systems Development [2.704899832646869]
This study examines interactions between developers and researchers on Stack Overflow (SO) and GitHub.
By analyzing issues, we identified 13 topics (e.g., Errors and Bug Fixing, Documentation, Dependencies) and discovered that data structures and operations is the most difficult topic.
We also found common topics between SO and GitHub, such as data structures and operations, task management, and workflow scheduling.
arXiv Detail & Related papers (2024-11-16T21:14:11Z)
- Practitioners' Discussions on Building LLM-based Applications for Production [6.544757635738911]
We collected 189 videos from 2022 to 2024 from practitioners actively developing large language models (LLMs).
We analyzed the transcripts using BERTopic, then manually sorted and merged the generated topics into themes, leading to a total of 20 topics in 8 themes.
The most prevalent topics fall within the theme Design & Architecture, with a strong focus on retrieval-augmented generation (RAG) systems.
arXiv Detail & Related papers (2024-11-13T12:44:41Z)
- What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated than canonical solutions.
We develop a taxonomy of bugs in incorrect code that includes three categories and 12 sub-categories, and analyze the root causes of common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z)
- How Do OSS Developers Utilize Architectural Solutions from Q&A Sites: An Empirical Study [5.568316292260523]
Developers utilize programming-related knowledge (e.g., code snippets) on Q&A sites (e.g., Stack Overflow).
Architectural solutions (e.g., architecture tactics) and their utilization are rarely explored.
For the mining study, we mined 984 commits and issues (i.e., 821 commits and 163 issues) from 893 Open-Source Software (OSS) projects on GitHub.
For the survey study, we surveyed 227 practitioners to further understand how they utilize architectural solutions from Q&A sites in their OSS development.
arXiv Detail & Related papers (2024-04-07T18:53:30Z)
- A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond [84.95530356322621]
This survey presents a systematic review of the advancements in code intelligence.
It covers over 50 representative models and their variants, more than 20 categories of tasks, and over 680 related works.
Building on our examination of the developmental trajectories, we further investigate the emerging synergies between code intelligence and broader machine intelligence.
arXiv Detail & Related papers (2024-03-21T08:54:56Z)
- PyRCA: A Library for Metric-based Root Cause Analysis [66.72542200701807]
PyRCA is an open-source machine learning library of Root Cause Analysis (RCA) for Artificial Intelligence for IT Operations (AIOps).
It provides a holistic framework to uncover the complicated metric causal dependencies and automatically locate root causes of incidents.
arXiv Detail & Related papers (2023-06-20T09:55:10Z)
- Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey [100.24095818099522]
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP).
They provide a highly useful, task-agnostic foundation for a wide range of applications.
However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles.
arXiv Detail & Related papers (2023-05-30T03:00:30Z)
- Understanding the Issues, Their Causes and Solutions in Microservices Systems: An Empirical Study [11.536360998310576]
Technical Debt, Continuous Integration, Exception Handling, Service Execution and Communication are the most dominant issues in microservices systems.
We found 177 types of solutions that can be applied to fix the identified issues.
arXiv Detail & Related papers (2023-02-03T18:08:03Z)
- Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow [13.260758930014154]
Deep Learning (DL) frameworks are now widely used, simplifying the creation of complex models as well as their integration into various applications, even for non-DL experts.
This paper deals with the subcategory of bugs named silent bugs: they lead to wrong behavior but do not cause system crashes or hangs, nor show an error message to the user.
This paper presents the first empirical study of silent bugs in Keras and TensorFlow, and their impact on users' programs.
arXiv Detail & Related papers (2021-12-26T04:18:57Z)
- On the Social and Technical Challenges of Web Search Autosuggestion Moderation [118.47867428272878]
Autosuggestions are typically generated by machine learning (ML) systems trained on a corpus of search logs and document representations.
While current search engines have become increasingly proficient at suppressing problematic suggestions, persistent issues remain.
We discuss several dimensions of problematic suggestions, difficult issues along the pipeline, and why our discussion applies to the increasing number of applications beyond web search.
arXiv Detail & Related papers (2020-07-09T19:22:00Z)
- A Study of Knowledge Sharing related to Covid-19 Pandemic in Stack Overflow [69.5231754305538]
This study analyzes 464 Stack Overflow questions posted mainly in February and March 2020, leveraging text mining.
Findings reveal that this global crisis sparked intense and increasing activity on Stack Overflow, with most post topics reflecting a strong interest in the analysis of Covid-19 data.
arXiv Detail & Related papers (2020-04-18T08:19:46Z)