Related papers: Does Functional Package Management Enable Reproducible Builds at Scale? Yes

URL: http://arxiv.org/abs/2501.15919v1
Date: Mon, 27 Jan 2025 10:11:27 GMT
Title: Does Functional Package Management Enable Reproducible Builds at Scale? Yes
Authors: Julien Malka, Stefano Zacchiroli, Théo Zimmermann
Abstract summary: Reproducible Builds (R-B) guarantee that rebuilding a software package from source leads to bitwise identical artifacts. We perform the first large-scale study of bitwise reproducibility in the context of the Nix functional package manager. We obtain very high bitwise reproducibility rates, between 69 and 91% with an upward trend, and even higher rebuildability rates, over 99%.
Score: 4.492444446637857
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reproducible Builds (R-B) guarantee that rebuilding a software package from source leads to bitwise identical artifacts. R-B is a promising approach to increase the integrity of the software supply chain, when installing open source software built by third parties. Unfortunately, despite success stories like high build reproducibility levels in Debian packages, uncertainty remains among field experts on the scalability of R-B to very large package repositories. In this work, we perform the first large-scale study of bitwise reproducibility, in the context of the Nix functional package manager, rebuilding 709 816 packages from historical snapshots of the nixpkgs repository, the largest cross-ecosystem open source software distribution, sampled in the period 2017-2023. We obtain very high bitwise reproducibility rates, between 69 and 91% with an upward trend, and even higher rebuildability rates, over 99%. We investigate unreproducibility causes, showing that about 15% of failures are due to embedded build dates. We release a novel dataset with all build statuses, logs, as well as full "diffoscopes": recursive diffs of where unreproducible build artifacts differ.

Related papers

SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving [90.32201622392137]
We present SwingArena, a competitive evaluation framework for Large Language Models (LLMs). Unlike traditional static benchmarks, SwingArena models the collaborative process of software iteration by pairing LLMs as submitters, who generate patches, and reviewers, who create test cases and verify the patches through continuous integration (CI) pipelines.
arXiv Detail & Related papers (2025-05-29T18:28:02Z)
Reproducible Builds and Insights from an Independent Verifier for Arch Linux [0.0]
Supply chain attacks have emerged as a prominent cybersecurity threat in recent years. Reproducible and bootstrappable builds have the potential to reduce such attacks significantly. In combination with independent, exhaustive and periodic source code audits, these measures can effectively eradicate compromises in the building process.
arXiv Detail & Related papers (2025-05-27T18:14:36Z)
Canonicalization for Unreproducible Builds in Java [11.367562045401554]
We introduce a conceptual framework for reproducible builds, analyze a large dataset from Reproducible Central, and develop a novel taxonomy of six root causes of unreproducibility. We present Chains-Rebuild, a tool that raises success from 9.48% to 26.89% on 12,283 unreproducible artifacts.
arXiv Detail & Related papers (2025-04-30T14:17:54Z)
Towards Source Mapping for Zero-Knowledge Smart Contracts: Design and Preliminary Evaluation [9.952399779710044]
We present a source mapping framework that establishes traceability between Solidity source code, LLVM IR, and zkEVM bytecode within the zkSolc compilation pipeline. We evaluate the framework on a dataset of 50 benchmark contracts and 500 real-world zkSync contracts, observing a mapping accuracy of approximately 97.2% for standard Solidity constructs.
arXiv Detail & Related papers (2025-04-06T01:42:07Z)
Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute (TTC) scaling framework that leverages increased inference-time compute instead of larger models. Our framework incorporates two complementary strategies: internal TTC and external TTC. We demonstrate that our 32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z)
ExecRepoBench: Multi-level Executable Code Completion Evaluation [45.963424627710765]
We introduce a novel framework for enhancing code completion in software development through the creation of a repository-level benchmark ExecRepoBench. We present a multi-level grammar-based completion methodology conditioned on the abstract syntax tree to mask code fragments at various logical units. Then, we fine-tune the open-source LLM with 7B parameters on Repo-Instruct to produce a strong code completion baseline model Qwen2.5-Coder-Instruct-C.
arXiv Detail & Related papers (2024-12-16T17:14:35Z)
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph [63.87660059104077]
We present RepoGraph, a plug-in module that manages a repository-level structure for modern AI software engineering solutions. RepoGraph substantially boosts the performance of all systems, leading to a new state-of-the-art among open-source frameworks.
arXiv Detail & Related papers (2024-10-03T05:45:26Z)
Uncovering and Mitigating the Impact of Frozen Package Versions for Fixed-Release Linux [38.53185042161599]
We study the ecosystem gap of fixed-release Linux caused by the evolution of mirrors. We propose a novel package management approach allowing for separate dependency environments based on native Debian mirrors. We present a working prototype, named ccenv, which can effectively remedy the inadequacy of current tools.
arXiv Detail & Related papers (2024-08-21T14:01:46Z)
How to Understand Whole Software Repository? [64.19431011897515]
An excellent understanding of the whole repository will be the critical path to Automatic Software Engineering (ASE). We develop a novel method named RepoUnderstander by guiding agents to comprehensively understand whole repositories. To better utilize the repository-level knowledge, we guide the agents to summarize, analyze, and plan.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
RepoAgent: An LLM-Powered Open-Source Framework for Repository-level Code Documentation Generation [79.83270415843857]
We introduce RepoAgent, a large language model powered open-source framework aimed at proactively generating, maintaining, and updating code documentation. We have validated the effectiveness of our approach, showing that RepoAgent excels in generating high-quality repository-level documentation.
arXiv Detail & Related papers (2024-02-26T15:39:52Z)
Reproducibility of Build Environments through Space and Time [0.0]
We argue that functional package managers provide the tooling to make build environments reproducible in space and time. We show that we are able to reproduce build environments of about 7 million Nix packages, and to rebuild 99.94% of the 14 thousand packages from a 6-year-old Nixpkgs revision.
arXiv Detail & Related papers (2024-02-01T08:45:28Z)
Analyzing the Evolution of Inter-package Dependencies in Operating Systems: A Case Study of Ubuntu [7.76541950830141]
An Operating System (OS) combines multiple interdependent software packages, which usually have their own independently developed architectures. For an evolutionary effort, designers/developers of OS can greatly benefit from fully understanding the system-wide dependency focused on individual files. We propose a framework, DepEx, aimed at discovering the detailed package relations at the level of individual binary files.
arXiv Detail & Related papers (2023-07-10T10:12:21Z)
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [96.75695811963242]
RepoCoder is a framework to streamline the repository-level code completion process. It incorporates a similarity-based retriever and a pre-trained code language model. It consistently outperforms the vanilla retrieval-augmented code completion approach.
arXiv Detail & Related papers (2023-03-22T13:54:46Z)
S3M: Siamese Stack (Trace) Similarity Measure [55.58269472099399]
We present S3M -- the first approach to computing stack trace similarity based on deep learning. It is based on a biLSTM encoder and a fully-connected classifier to compute similarity. Our experiments demonstrate the superiority of our approach over the state-of-the-art on both open-sourced data and a private JetBrains dataset.
arXiv Detail & Related papers (2021-03-18T21:10:41Z)
An Empirical Analysis of the R Package Ecosystem [0.0]
We analyze more than 25,000 packages, 150,000 releases, and 15 million files across two decades. We find that the historical growth of the ecosystem has been robust under all measures.
arXiv Detail & Related papers (2021-02-19T12:55:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.