The Promise and Pitfalls of WebAssembly: Perspectives from the Industry
- URL: http://arxiv.org/abs/2503.21240v1
- Date: Thu, 27 Mar 2025 08:01:22 GMT
- Title: The Promise and Pitfalls of WebAssembly: Perspectives from the Industry
- Authors: Ningyu He, Shangtong Cao, Haoyu Wang, Yao Guo, Xiapu Luo,
- Abstract summary: WebAssembly (Wasm) was proposed in 2017 and is regarded as the complementation for JavaScript.<n>There is no work that conducts a large-scale measurement study on in-the-wild adopted Wasm binaries.<n>We collect the largest-ever dataset to characterize the status quo of them from industry perspectives.
- Score: 26.4248246220256
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As JavaScript has been criticized for performance and security issues in web applications, WebAssembly (Wasm) was proposed in 2017 and is regarded as the complementation for JavaScript. Due to its advantages like compact-size, native-like speed, and portability, Wasm binaries are gradually used as the compilation target for industrial projects in other high-level programming languages and are responsible for computation-intensive tasks in browsers, e.g., 3D graphic rendering and video decoding. Intuitively, characterizing in-the-wild adopted Wasm binaries from different perspectives, like their metadata, relation with source programming language, existence of security threats, and practical purpose, is the prerequisite before delving deeper into the Wasm ecosystem and beneficial to its roadmap selection. However, currently, there is no work that conducts a large-scale measurement study on in-the-wild adopted Wasm binaries. To fill this gap, we collect the largest-ever dataset to the best of our knowledge, and characterize the status quo of them from industry perspectives. According to the different roles of people engaging in the community, i.e., web developers, Wasm maintainers, and researchers, we reorganized our findings to suggestions and best practices for them accordingly. We believe this work can shed light on the future direction of the web and Wasm.
Related papers
- MageBench: Bridging Large Multimodal Models to Agents [90.59091431806793]
LMMs have shown impressive visual understanding capabilities, with the potential to be applied in agents.
Existing benchmarks mostly assess their reasoning abilities in language part.
MageBench is a reasoning capability oriented multimodal agent benchmark.
arXiv Detail & Related papers (2024-12-05T17:08:19Z) - StackSight: Unveiling WebAssembly through Large Language Models and Neurosymbolic Chain-of-Thought Decompilation [2.1094456929188676]
StackSight visualizes and tracks virtual stack alterations via a static analysis algorithm and then applies chain-of-thought prompting.
Evaluation results show that StackSight significantly improves WebAssembly decompilation.
Our user study also demonstrates that code snippets generated by StackSight have significantly higher win rates and enable a better grasp of code semantics.
arXiv Detail & Related papers (2024-06-07T01:08:17Z) - A Survey on Large Language Models for Code Generation [9.555952109820392]
Large Language Models (LLMs) have garnered remarkable advancements across diverse code-related tasks.
This survey aims to bridge the gap between academia and practical development by providing a comprehensive and up-to-date literature review.
arXiv Detail & Related papers (2024-06-01T17:48:15Z) - Research on WebAssembly Runtimes: A Survey [22.031129110987017]
WebAssembly (abbreviated as Wasm) was initially introduced for the Web but quickly extended its reach into various domains beyond the Web.
This paper provides a comprehensive survey of research on WebAssembly runtimes.
arXiv Detail & Related papers (2024-04-19T04:36:38Z) - How Far Have We Gone in Binary Code Understanding Using Large Language Models [51.527805834378974]
We propose a benchmark to evaluate the effectiveness of Large Language Models (LLMs) in binary code understanding.
Our evaluations reveal that existing LLMs can understand binary code to a certain extent, thereby improving the efficiency of binary code analysis.
arXiv Detail & Related papers (2024-04-15T14:44:08Z) - WebCode2M: A Real-World Dataset for Code Generation from Webpage Designs [49.91550773480978]
This paper introduces WebCode2M, a new dataset comprising 2.56 million instances, each containing a design image along with the corresponding webpage code and layout details.
To validate the effectiveness of WebCode2M, we introduce a baseline model based on the Vision Transformer (ViT), named WebCoder, and establish a benchmark for fair comparison.
The benchmarking results demonstrate that our dataset significantly improves the ability of MLLMs to generate code from webpage designs.
arXiv Detail & Related papers (2024-04-09T15:05:48Z) - WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? [83.19032025950986]
We study the use of large language model-based agents for interacting with software via web browsers.
WorkArena is a benchmark of 33 tasks based on the widely-used ServiceNow platform.
BrowserGym is an environment for the design and evaluation of such agents.
arXiv Detail & Related papers (2024-03-12T14:58:45Z) - SoK: Analysis techniques for WebAssembly [0.0]
WebAssembly is a low-level bytecode language that allows languages like C, C++, and Rust to be executed in the browser at near-native performance.
Vulnerabilities in memory-unsafe languages, like C and C++, can translate into vulnerabilities in WebAssembly binaries.
WebAssembly has been used for malicious purposes like cryptojacking.
arXiv Detail & Related papers (2024-01-11T14:28:13Z) - Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation.
We present an extensive overview by categorizing these works in terms of various IE subtasks and techniques.
We empirically analyze the most advanced methods and discover the emerging trend of IE tasks with LLMs.
arXiv Detail & Related papers (2023-12-29T14:25:22Z) - MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration [98.18244218156492]
Large Language Models (LLMs) have significantly advanced natural language processing.<n>As their applications expand into multi-agent environments, there arises a need for a comprehensive evaluation framework.<n>This work introduces a novel competition-based benchmark framework to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z) - Issues and Their Causes in WebAssembly Applications: An Empirical Study [5.518217604591736]
WebAssembly (Wasm) is a binary instruction format designed for secure and efficient execution within sandboxed environments.
In recent years, Wasm has gained significant attention from the academic research community and industrial development projects.
Despite the offered benefits, developers encounter a multitude of issues rooted in Wasm.
arXiv Detail & Related papers (2023-11-01T16:46:53Z) - CONCORD: Clone-aware Contrastive Learning for Source Code [64.51161487524436]
Self-supervised pre-training has gained traction for learning generic code representations valuable for many downstream SE tasks.
We argue that it is also essential to factor in how developers code day-to-day for general-purpose representation learning.
In particular, we propose CONCORD, a self-supervised, contrastive learning strategy to place benign clones closer in the representation space while moving deviants further apart.
arXiv Detail & Related papers (2023-06-05T20:39:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.