Does Using Bazel Help Speed Up Continuous Integration Builds?
- URL: http://arxiv.org/abs/2405.00796v1
- Date: Wed, 1 May 2024 18:16:38 GMT
- Title: Does Using Bazel Help Speed Up Continuous Integration Builds?
- Authors: Shenyu Zheng, Bram Adams, Ahmed E. Hassan
- Abstract summary: New artifact-based build technologies like Bazel have built-in support for advanced performance optimizations.
We collected 383 Bazel projects from GitHub, studied their usage of Bazel's parallel and incremental builds in 4 popular CI services, and compared the results with Maven projects.
Our results show that 31.23% of Bazel projects adopt a CI service but do not use Bazel in that CI service, while among those that do use Bazel in CI, 27.76% use other tools to facilitate Bazel's execution.
- Score: 9.098224117917336
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A long continuous integration (CI) build forces developers to wait for CI feedback before starting subsequent development activities, leading to wasted time. In addition to the variety of build scheduling and test selection heuristics studied in the past, new artifact-based build technologies like Bazel have built-in support for advanced performance optimizations such as parallel builds and incremental builds (caching of build results). However, little is known about the extent to which new build technologies like Bazel deliver on their promised benefits, especially for long-build-duration projects. In this study, we collected 383 Bazel projects from GitHub, studied their usage of Bazel's parallel and incremental builds in 4 popular CI services, and compared the results with Maven projects. We conducted 3,500 experiments on the 383 Bazel projects and analyzed the build logs of a subset of 70 buildable projects to evaluate the performance impact of Bazel's parallel builds. Additionally, we performed 102,232 experiments on the 70 buildable projects' last 100 commits to evaluate Bazel's incremental build performance. Our results show that 31.23% of Bazel projects adopt a CI service but do not use Bazel in that CI service, while among those that do use Bazel in CI, 27.76% use other tools to facilitate Bazel's execution. Compared to sequential builds, the median speedups for long-build-duration projects are 2.00x, 3.84x, 7.36x, and 12.80x at parallelism degrees 2, 4, 8, and 16, respectively, while, compared to a clean build, applying incremental builds achieves a median speedup of 4.22x (with a build-system-tool-independent CI cache) and 4.71x (with a build-system-tool-specific cache) for long-build-duration projects. Our results provide guidance for developers to improve their usage of Bazel in their projects.
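To make the two optimizations concrete, here is a minimal Python sketch (an illustration of the kind of experiment described above, not code from the paper) that times clean builds at the studied parallelism degrees via Bazel's `--jobs` flag, then contrasts a clean build with a build served from `--disk_cache`, Bazel's build-tool-specific local cache. The target pattern `//...` and the cache path are assumptions.

```python
import subprocess
import time

def timed_bazel_build(extra_flags):
    """Run `bazel build //...` with the given flags; return wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(["bazel", "build", "//..."] + extra_flags, check=True)
    return time.monotonic() - start

def clean():
    """Discard all previous build outputs so the next build starts cold."""
    subprocess.run(["bazel", "clean", "--expunge"], check=True)

# Sequential baseline, then parallel builds at the studied parallelism degrees.
clean()
sequential = timed_bazel_build(["--jobs=1"])
for jobs in (2, 4, 8, 16):
    clean()
    parallel = timed_bazel_build([f"--jobs={jobs}"])
    print(f"--jobs={jobs}: {sequential / parallel:.2f}x speedup over sequential")

# Incremental build via a tool-specific cache: the first build populates the
# disk cache; `bazel clean` removes outputs but the external cache survives,
# so the second build can reuse cached action results.
cache = ["--disk_cache=/tmp/bazel-disk-cache"]  # hypothetical cache location
clean()
clean_build = timed_bazel_build(cache)
clean()
cached_build = timed_bazel_build(cache)
print(f"cached: {clean_build / cached_build:.2f}x speedup over a clean build")
```

A build-tool-independent CI cache, by contrast, would persist such a directory across CI runs through the CI service's own caching mechanism rather than through Bazel itself.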
Related papers
- Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning [57.09163579304332]
We introduce PaperCoder, a framework that transforms machine learning papers into functional code repositories.
PaperCoder operates in three stages; in the planning stage, it designs the system architecture with diagrams, identifies file dependencies, and generates configuration files.
We then evaluate PaperCoder on generating code implementations from machine learning papers based on both model-based and human evaluations.
arXiv Detail & Related papers (2025-04-24T01:57:01Z)
- CLOVER: A Test Case Generation Benchmark with Coverage, Long-Context, and Verification [71.34070740261072]
This paper presents a benchmark, CLOVER, to evaluate models' capabilities in generating and completing test cases.
The benchmark is containerized for code execution across tasks, and we will release the code, data, and construction methodologies.
arXiv Detail & Related papers (2025-02-12T21:42:56Z)
- You Name It, I Run It: An LLM Agent to Execute Tests of Arbitrary Projects [18.129031749321058]
ExecutionAgent is an automated technique that prepares scripts for building an arbitrary project from source code and running its test cases.
Our evaluation applies ExecutionAgent to 50 open-source projects that use 14 different programming languages and many different build and testing tools.
arXiv Detail & Related papers (2024-12-13T13:30:51Z)
- Local Software Buildability across Java Versions (Registered Report) [0.0]
We will try to automatically build every project in containers with Java versions 6 to 23 installed.
Success or failure will be determined by exit codes, and standard output and error streams will be saved (a minimal sketch of this procedure follows this entry).
arXiv Detail & Related papers (2024-08-21T11:51:00Z)
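As a rough, hedged sketch of the procedure this registered report describes (the container image, build command, and log paths below are illustrative assumptions, not taken from the report):

```python
import os
import subprocess
from pathlib import Path

# Hypothetical: build the current project inside a container with JDK 11.
result = subprocess.run(
    ["docker", "run", "--rm",
     "-v", f"{os.getcwd()}:/project", "-w", "/project",
     "eclipse-temurin:11-jdk", "./mvnw", "-B", "package"],
    capture_output=True, text=True,
)

# The exit code determines build success; both streams are saved for analysis.
Path("logs").mkdir(exist_ok=True)
Path("logs/jdk11.out").write_text(result.stdout)
Path("logs/jdk11.err").write_text(result.stderr)
print("BUILD", "SUCCESS" if result.returncode == 0 else "FAILURE")
```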
- PruningBench: A Comprehensive Benchmark of Structural Pruning [50.23493036025595]
We present the first comprehensive benchmark, termed PruningBench, for structural pruning.
PruningBench employs a unified and consistent framework for evaluating the effectiveness of diverse structural pruning techniques.
It provides easily implementable interfaces to facilitate the development of future pruning methods, and enables subsequent researchers to incorporate their work into our leaderboards.
arXiv Detail & Related papers (2024-06-18T06:37:26Z)
- Long Code Arena: a Set of Benchmarks for Long-Context Code Models [75.70507534322336]
Long Code Arena is a suite of six benchmarks for code processing tasks that require project-wide context.
These tasks cover different aspects of code processing: library-based code generation, CI builds repair, project-level code completion, commit message generation, bug localization, and module summarization.
For each task, we provide a manually verified dataset for testing, an evaluation suite, and open-source baseline solutions.
arXiv Detail & Related papers (2024-06-17T14:58:29Z)
- Detecting Build Dependency Errors in Incremental Builds [13.823208277774572]
We propose EChecker to detect build dependency errors in the context of incremental builds.
EChecker automatically updates the actual build dependencies by inferring them from the C/C++ pre-processor directives and Makefile changes of new commits (a minimal sketch of this kind of inference follows this entry).
EChecker increases the build dependency error detection efficiency by an average of 85.14 times.
arXiv Detail & Related papers (2024-04-20T07:01:11Z)
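As a minimal sketch of the kind of pre-processor-based inference the EChecker summary describes (not the tool's actual implementation; the source path is a hypothetical example):

```python
import re
from pathlib import Path

# Matches both #include "local.h" and #include <system.h> directives.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+[<"]([^>"]+)[>"]', re.MULTILINE)

def included_headers(source_file):
    """Return the headers a C/C++ file depends on via #include directives."""
    return INCLUDE_RE.findall(Path(source_file).read_text(errors="replace"))

# Headers inferred here could be compared against the prerequisites declared
# in a Makefile to flag missing (or stale) build dependencies.
print(included_headers("src/main.c"))
```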
- Reproducibility of Build Environments through Space and Time [0.0]
We argue that functional package managers provide the tooling to make build environments reproducible in space and time.
We show that we are able to reproduce build environments of about 7 million Nix packages, and to rebuild 99.94% of the 14 thousand packages from a 6-year-old Nixpkgs revision.
arXiv Detail & Related papers (2024-02-01T08:45:28Z)
- DevEval: Evaluating Code Generation in Practical Software Projects [52.16841274646796]
We propose a new benchmark named DevEval, aligned with developers' experiences in practical projects.
DevEval is collected through a rigorous pipeline, containing 2,690 samples from 119 practical projects.
We assess five popular LLMs on DevEval and reveal their actual abilities in code generation.
arXiv Detail & Related papers (2024-01-12T06:51:30Z)
- Mixtral of Experts [57.411379935325435]
Mixtral 8x7B is a Sparse Mixture of Experts (SMoE) language model.
Mixtral vastly outperforms Llama 2 70B on mathematics, code generation, and multilingual benchmarks.
We also provide a model fine-tuned to follow instructions, Mixtral 8x7B-Instruct, that surpasses GPT-3.5 Turbo, Claude-2.1, Gemini Pro, and the Llama 2 70B chat model on human benchmarks.
arXiv Detail & Related papers (2024-01-08T18:47:34Z)
- LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding [58.20031627237889]
LongBench is the first bilingual, multi-task benchmark for long context understanding.
It comprises 21 datasets across 6 task categories in both English and Chinese, with an average length of 6,711 words (English) and 13,386 characters (Chinese).
arXiv Detail & Related papers (2023-08-28T11:53:40Z)
- Hexatagging: Projective Dependency Parsing as Tagging [63.5392760743851]
We introduce a novel dependency parser, the hexatagger, that constructs dependency trees by tagging the words in a sentence with elements from a finite set of possible tags.
Our approach is fully parallelizable at training time, i.e., the structure-building actions needed to build a dependency parse can be predicted in parallel to each other.
We achieve state-of-the-art performance of 96.4 LAS and 97.4 UAS on the Penn Treebank test set.
arXiv Detail & Related papers (2023-06-08T18:02:07Z)
- The Impact of a Continuous Integration Service on the Delivery Time of Merged Pull Requests [8.108605385023939]
We study whether adopting a CI service (TravisCI) can quicken the time to deliver merged PRs.
Our results reveal that adopting a CI service may not necessarily quicken the delivery of merged PRs.
The automation provided by CI and the boost in developers' confidence are key advantages of adopting a CI service.
arXiv Detail & Related papers (2023-05-25T10:59:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.