Compilation of Commit Changes within Java Source Code Repositories
- URL: http://arxiv.org/abs/2407.17853v1
- Date: Thu, 25 Jul 2024 08:14:33 GMT
- Title: Compilation of Commit Changes within Java Source Code Repositories
- Authors: Stefan Schott, Wolfram Fischer, Serena Elisa Ponta, Jonas Klauke, Eric Bodden,
- Abstract summary: JESS reduces the code, retaining only those parts that the committed change references.
JESS is able to compile, in isolation, 72% of methods and constructors, of which 89% have bytecode equal to the original one.
On the Project KB database of fix-commits, in which only 8% of files modified within the commits can be compiled with the provided build scripts, JESS is able to compile 73% of all files that these commits modify.
- Score: 2.556190321164248
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Java applications include third-party dependencies as bytecode. To keep these applications secure, researchers have proposed tools to re-identify dependencies that contain known vulnerabilities. Yet, to allow such re-identification, one must obtain, for each vulnerability patch, the bytecode fixing the respective vulnerability at first. Such patches for dependencies are curated in databases in the form of fix-commits. But fixcommits are in source code, and automatically compiling whole Java projects to bytecode is notoriously hard, particularly for non-current versions of the code. In this paper, we thus propose JESS, an approach that largely avoids this problem by compiling solely the relevant code that was modified within a given commit. JESS reduces the code, retaining only those parts that the committed change references. To avoid name-resolution errors, JESS automatically infers stubs for references to entities that are unavailable to the compiler. A challenge is here that, to facilitate the above mentioned reidentification, JESS must seek to produce bytecode that is almost identical to the bytecode which one would obtain by a successful compilation of the full project. An evaluation on 347 GitHub projects shows that JESS is able to compile, in isolation, 72% of methods and constructors, of which 89% have bytecode equal to the original one. Furthermore, on the Project KB database of fix-commits, in which only 8% of files modified within the commits can be compiled with the provided build scripts, JESS is able to compile 73% of all files that these commits modify.
Related papers
- ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages.
This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z) - SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution [56.9361004704428]
Large Language Models (LLMs) have demonstrated remarkable proficiency across a variety of complex tasks.
SWE-Fixer is a novel open-source framework designed to effectively and efficiently resolve GitHub issues.
We assess our approach on the SWE-Bench Lite and Verified benchmarks, achieving state-of-the-art performance among open-source models.
arXiv Detail & Related papers (2025-01-09T07:54:24Z) - Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
We analyzed 2,401 code review comments from Java open-source projects on GitHub.
83.9% of suggestions for improvement were accepted and integrated, with fewer than 1% later reverted.
arXiv Detail & Related papers (2024-10-29T12:21:23Z) - JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models [123.66104233291065]
Jailbreak attacks cause large language models (LLMs) to generate harmful, unethical, or otherwise objectionable content.
evaluating these attacks presents a number of challenges, which the current collection of benchmarks and evaluation techniques do not adequately address.
JailbreakBench is an open-sourced benchmark with the following components.
arXiv Detail & Related papers (2024-03-28T02:44:02Z) - EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models [53.87416566981008]
This paper introduces EasyJailbreak, a unified framework simplifying the construction and evaluation of jailbreak attacks against Large Language Models (LLMs)
It builds jailbreak attacks using four components: Selector, Mutator, Constraint, and Evaluator.
Our validation across 10 distinct LLMs reveals a significant vulnerability, with an average breach probability of 60% under various jailbreaking attacks.
arXiv Detail & Related papers (2024-03-18T18:39:53Z) - Java JIT Testing with Template Extraction [7.714591709931207]
LeJit is a template-based framework for testing Java just-in-time (JIT) compilers.
We have successfully used LeJit to test a range of popular Java JIT compilers.
arXiv Detail & Related papers (2024-03-17T17:39:27Z) - PPT4J: Patch Presence Test for Java Binaries [15.297767260561491]
The number of vulnerabilities reported in open source software has increased substantially in recent years.
The ability to test whether a patch is applied to the target binary, a.k.a. patch presence test, is crucial for practitioners.
We propose a new patch presence test framework named PPT4J ($textbfP$atch $textbfP$resence $textbfT$est $textbffor$ $textbfJ$ava Binaries).
arXiv Detail & Related papers (2023-12-18T08:28:13Z) - PS$^3$: Precise Patch Presence Test based on Semantic Symbolic Signature [13.9637348151437]
Existing approaches mainly focus on detecting patches that are compiled in the same compiler options.
PS3 exploits symbolic emulation to extract signatures that are stable under different compiler options.
PS3 achieves scores of 0.82, 0.97, and 0.89 in terms of precision, recall, and F1 score.
arXiv Detail & Related papers (2023-12-06T10:04:15Z) - RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic
Program Repair [75.40584530380589]
We propose a novel Retrieval-Augmented Patch Generation framework (RAP-Gen)
RAP-Gen explicitly leveraging relevant fix patterns retrieved from a list of previous bug-fix pairs.
We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
arXiv Detail & Related papers (2023-09-12T08:52:56Z) - On the Security Blind Spots of Software Composition Analysis [46.1389163921338]
We present a novel approach to detect vulnerable clones in the Maven repository.
We retrieve over 53k potential vulnerable clones from Maven Central.
We detect 727 confirmed vulnerable clones and synthesize a testable proof-of-vulnerability project for each of those.
arXiv Detail & Related papers (2023-06-08T20:14:46Z) - RepoCoder: Repository-Level Code Completion Through Iterative Retrieval
and Generation [96.75695811963242]
RepoCoder is a framework to streamline the repository-level code completion process.
It incorporates a similarity-based retriever and a pre-trained code language model.
It consistently outperforms the vanilla retrieval-augmented code completion approach.
arXiv Detail & Related papers (2023-03-22T13:54:46Z) - Automatic Specialization of Third-Party Java Dependencies [3.7973152331947815]
Large-scale code reuse significantly reduces both development costs and time.
Massive share of third-party code in software projects poses new challenges, especially in terms of maintenance and security.
We propose a novel technique to specialize dependencies of Java projects, based on their actual usage.
arXiv Detail & Related papers (2023-02-16T15:37:49Z) - Automated Mapping of Vulnerability Advisories onto their Fix Commits in
Open Source Repositories [7.629717457706326]
We present an approach that combines practical experience and machine-learning (ML)
An advisory record containing key information about a vulnerability is extracted from an advisory.
A subset of candidate fix commits is obtained from the source code repository of the affected project.
arXiv Detail & Related papers (2021-03-24T17:50:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.