Related papers: eWAPA: An eBPF-based WASI Performance Analysis Framework for WebAssembly Runtimes

URL: http://arxiv.org/abs/2409.10252v1
Date: Mon, 16 Sep 2024 13:03:09 GMT
Title: eWAPA: An eBPF-based WASI Performance Analysis Framework for WebAssembly Runtimes
Authors: Chenxi Mao, Yuxin Su, Shiwen Shan, Dan Li
Abstract summary: WebAssembly (Wasm) is a low-level bytecode format that can run in modern browsers. We propose an eBPF-based WASI performance analysis framework. It collects key performance metrics of the runtime under different I/O load conditions, such as total execution time, startup time, WASI execution time, and syscall time.
Score: 3.804314901623159
License: http://creativecommons.org/licenses/by/4.0/
Abstract: WebAssembly (Wasm) is a low-level bytecode format that can run in modern browsers. With the development of standalone runtimes and the improvement of the WebAssembly System Interface (WASI), Wasm has further provided a more complete sandboxed runtime experience for server-side applications, effectively expanding its application scenarios. However, the implementation of WASI varies across different runtimes, and suboptimal interface implementations can lead to performance degradation during interactions between the runtime and the operating system. Existing research mainly focuses on overall performance evaluation of runtimes, while studies on WASI implementations are relatively scarce. To tackle this problem, we propose an eBPF-based WASI performance analysis framework. It collects key performance metrics of the runtime under different I/O load conditions, such as total execution time, startup time, WASI execution time, and syscall time. We can comprehensively analyze the performance of the runtime's I/O interactions with the operating system. Additionally, we provide a detailed analysis of the causes behind two specific WASI performance anomalies. These analytical results will guide the optimization of standalone runtimes and WASI implementations, enhancing their efficiency.

Related papers

When Should I Run My Application Benchmark?: Studying Cloud Performance Variability for the Case of Stream Processing Applications [1.3398445165628463]
This paper empirically quantifies the impact of cloud performance variability on benchmarking results. With approximately 591 hours of experiments, deploying 789 clusters on AWS and executing 2366 benchmarks, this is likely the largest study of its kind.
arXiv Detail & Related papers (2025-04-16T07:22:44Z)
Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection [71.92083784393418]
Inference-time methods such as Best-of-N (BON) sampling offer a simple yet effective alternative to improve performance. We propose Iterative Agent Decoding (IAD) which combines iterative refinement with dynamic candidate evaluation and selection guided by a verifier.
arXiv Detail & Related papers (2025-04-02T17:40:47Z)
APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs [81.5049387116454]
We introduce APB, an efficient long-context inference framework. APB uses multi-host approximate attention to enhance prefill speed. APB achieves speeds of up to 9.2x, 4.2x, and 1.6x compared with FlashAttn, RingAttn, and StarAttn, respectively.
arXiv Detail & Related papers (2025-02-17T17:59:56Z)
The BrowserGym Ecosystem for Web Agent Research [151.90034093362343]
The BrowserGym ecosystem addresses the growing need for efficient evaluation and benchmarking of web agents. We propose an extended BrowserGym-based ecosystem for web agent research, which unifies existing benchmarks from the literature. We conduct the first large-scale, multi-benchmark web agent experiment and compare the performance of 6 state-of-the-art LLMs across 6 popular web agent benchmarks.
arXiv Detail & Related papers (2024-12-06T23:43:59Z)
Tracing Optimization for Performance Modeling and Regression Detection [15.99435412859094]
A performance model analytically describes the relationship between the performance of a system and its runtime activities. We propose statistical approaches to reduce tracing overhead by identifying and excluding performance-insensitive code regions. Our approach is fully automated, making it ready to be used in production environments with minimal human effort.
arXiv Detail & Related papers (2024-11-26T16:11:55Z)
SeBS-Flow: Benchmarking Serverless Cloud Function Workflows [51.4200085836966]
We propose the first serverless workflow benchmarking suite SeBS-Flow. SeBS-Flow includes six real-world application benchmarks and four microbenchmarks representing different computational patterns. We conduct comprehensive evaluations on three major cloud platforms, assessing performance, cost, scalability, and runtime deviations.
arXiv Detail & Related papers (2024-10-04T14:52:18Z)
Exploring Dynamic Transformer for Efficient Object Tracking [58.120191254379854]
We propose DyTrack, a dynamic transformer framework for efficient tracking. DyTrack automatically learns to configure proper reasoning routes for various inputs, gaining better utilization of the available computational budget. Experiments on multiple benchmarks demonstrate that DyTrack achieves promising speed-precision trade-offs with only a single model.
arXiv Detail & Related papers (2024-03-26T12:31:58Z)
Green AI: A Preliminary Empirical Study on Energy Consumption in DL Models Across Different Runtime Infrastructures [56.200335252600354]
It is common practice to deploy pre-trained models on environments distinct from their native development settings. This led to the introduction of interchange formats such as ONNX, which includes its own infrastructure and works as a standard format.
arXiv Detail & Related papers (2024-02-21T09:18:44Z)
A Comprehensive Trusted Runtime for WebAssembly with Intel SGX [2.6732136954707792]
We present Twine, a trusted runtime for running WebAssembly-compiled applications within TEEs. It extends the standard WebAssembly system interface (WASI), providing controlled OS services, focusing on I/O. We evaluate its performance using general-purpose benchmarks and real-world applications, showing it compares on par with state-of-the-art solutions.
arXiv Detail & Related papers (2023-12-14T16:19:00Z)
Fast-Slow Test-Time Adaptation for Online Vision-and-Language Navigation [67.18144414660681]
We propose a Fast-Slow Test-Time Adaptation (FSTTA) approach for online Vision-and-Language Navigation (VLN). Our method obtains impressive performance gains on four popular benchmarks.
arXiv Detail & Related papers (2023-11-22T07:47:39Z)
Performance Tuning for GPU-Embedded Systems: Machine-Learning-based and Analytical Model-driven Tuning Methodologies [0.0]
The study introduces an analytical model-driven tuning methodology and a Machine Learning (ML)-based tuning methodology. We evaluate the performance of the two tuning methodologies for different parallel prefix implementations of the BPLG library in an NVIDIA Jetson system.
arXiv Detail & Related papers (2023-10-24T22:09:03Z)
SiamMask: A Framework for Fast Online Object Tracking and Segmentation [96.61632757952292]
SiamMask is a framework to perform both visual object tracking and video object segmentation, in real-time, with the same simple method. We show that it is possible to extend the framework to handle multiple object tracking and segmentation by simply re-using the multi-task model. It yields real-time state-of-the-art results on visual-object tracking benchmarks, while at the same time demonstrating competitive performance at a high speed for video object segmentation benchmarks.
arXiv Detail & Related papers (2022-07-05T14:47:17Z)
DeLag: Using Multi-Objective Optimization to Enhance the Detection of Latency Degradation Patterns in Service-based Systems [0.76146285961466]
We present DeLag, a novel automated search-based approach for diagnosing performance issues in service-based systems. DeLag simultaneously searches for multiple latency patterns while optimizing precision, recall and dissimilarity.
arXiv Detail & Related papers (2021-10-21T13:59:32Z)
Reproducible Performance Optimization of Complex Applications on the Edge-to-Cloud Continuum [55.6313942302582]
We propose a methodology to support the optimization of real-life applications on the Edge-to-Cloud Continuum. Our approach relies on a rigorous analysis of possible configurations in a controlled testbed environment to understand their behaviour. Our methodology can be generalized to other applications in the Edge-to-Cloud Continuum.
arXiv Detail & Related papers (2021-08-04T07:35:14Z)
IOHanalyzer: Detailed Performance Analyses for Iterative Optimization Heuristics [3.967483941966979]
IOHanalyzer is a new user-friendly tool for the analysis, comparison, and visualization of performance data of IOHs. IOHanalyzer provides detailed statistics about fixed-target running times and about fixed-budget performance of the benchmarked algorithms. IOHanalyzer can directly process performance data from the main benchmarking platforms.
arXiv Detail & Related papers (2020-07-08T08:20:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.