Related papers: How Far Have We Gone in Binary Code Understanding Using Large Language Models

How Far Have We Gone in Binary Code Understanding Using Large Language Models

URL: http://arxiv.org/abs/2404.09836v3
Date: Thu, 24 Oct 2024 14:35:43 GMT
Title: How Far Have We Gone in Binary Code Understanding Using Large Language Models
Authors: Xiuwei Shang, Shaoyin Cheng, Guoqiang Chen, Yanming Zhang, Li Hu, Xiao Yu, Gangyang Li, Weiming Zhang, Nenghai Yu,
Abstract summary: We propose a benchmark to evaluate the effectiveness of Large Language Models (LLMs) in binary code understanding. Our evaluations reveal that existing LLMs can understand binary code to a certain extent, thereby improving the efficiency of binary code analysis.
Score: 51.527805834378974
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Binary code analysis plays a pivotal role in various software security applications, such as software maintenance, malware detection, software vulnerability discovery, patch analysis, etc. However, unlike source code, understanding binary code is challenging for reverse engineers due to the absence of semantic information. Therefore, automated tools are needed to assist human players in interpreting binary code. In recent years, two groups of technologies have shown promising prospects: (1) Deep learning-based technologies have demonstrated competitive results in tasks related to binary code understanding, furthermore, (2) Large Language Models (LLMs) have been extensively pre-trained at the source-code level for tasks such as code understanding and generation. This makes participants wonder about the ability of LLMs in binary code understanding. In this work, we propose a benchmark to evaluate the effectiveness of LLMs in real-world reverse engineering scenarios. The benchmark covers two key binary code understanding tasks, including function name recovery and binary code summarization. We gain valuable insights into their capabilities and limitations through extensive evaluations of popular LLMs using our benchmark. Our evaluations reveal that existing LLMs can understand binary code to a certain extent, thereby improving the efficiency of binary code analysis. Our results highlight the great potential of the LLMs in advancing the field of binary code understanding.

Related papers

IFEvalCode: Controlled Code Generation [69.28317223249358]
The paper introduces forward and backward constraints generation to improve the instruction-following capabilities of Code LLMs.<n>The authors present IFEvalCode, a multilingual benchmark comprising 1.6K test samples across seven programming languages.
arXiv Detail & Related papers (2025-07-30T08:08:48Z)
Is Compression Really Linear with Code Intelligence? [60.123628177110206]
textitFormat Annealing is a lightweight, transparent training methodology designed to assess the intrinsic capabilities of pre-trained models equitably.<n>Our empirical results reveal a fundamental logarithmic relationship between measured code intelligence and bits-per-character (BPC)<n>Our work provides a more nuanced understanding of compression's role in developing code intelligence and contributes a robust evaluation framework in the code domain.
arXiv Detail & Related papers (2025-05-16T16:59:14Z)
BinMetric: A Comprehensive Binary Analysis Benchmark for Large Language Models [50.17907898478795]
We introduce BinMetric, a benchmark designed to evaluate the performance of large language models on binary analysis tasks.<n>BinMetric comprises 1,000 questions derived from 20 real-world open-source projects across 6 practical binary analysis tasks.<n>Our empirical study on this benchmark investigates the binary analysis capabilities of various state-of-the-art LLMs, revealing their strengths and limitations in this field.
arXiv Detail & Related papers (2025-05-12T08:54:07Z)
An Empirical Study on the Effectiveness of Large Language Models for Binary Code Understanding [50.17907898478795]
This work proposes a benchmark to evaluate the effectiveness of Large Language Models (LLMs) in real-world reverse engineering scenarios. Our evaluations reveal that existing LLMs can understand binary code to a certain extent, thereby improving the efficiency of binary code analysis.
arXiv Detail & Related papers (2025-04-30T17:02:06Z)
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models [70.72097493954067]
Large language models (LLMs) for code have become indispensable in various domains, including code generation, reasoning tasks and agent systems. While open-access code LLMs are increasingly approaching the performance levels of proprietary models, high-quality code LLMs remain limited. We introduce OpenCoder, a top-tier code LLM that not only achieves performance comparable to leading models but also serves as an "open cookbook" for the research community.
arXiv Detail & Related papers (2024-11-07T17:47:25Z)
Crystal: Illuminating LLM Abilities on Language and Code [58.5467653736537]
We propose a pretraining strategy to enhance the integration of natural language and coding capabilities. The resulting model, Crystal, demonstrates remarkable capabilities in both domains.
arXiv Detail & Related papers (2024-11-06T10:28:46Z)
What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated as compared to canonical solutions. We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types. We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z)
Unsupervised Binary Code Translation with Application to Code Similarity Detection and Vulnerability Discovery [2.022692275087205]
Cross-architecture binary code analysis has become an emerging problem. Deep learning-based binary analysis has shown promising success. For some low-resource ISAs, an adequate amount of data is hard to find.
arXiv Detail & Related papers (2024-04-29T18:09:28Z)
CLAP: Learning Transferable Binary Code Representations with Natural Language Supervision [22.42846252594693]
We present CLAP (Contrastive Language-Assembly Pre-training), which employs natural language supervision to learn better representations of binary code. At the core, our approach boosts superior transfer learning capabilities by effectively aligning binary code with their semantics explanations. We have generated 195 million pairs of binary code and explanations and trained a prototype of CLAP.
arXiv Detail & Related papers (2024-02-26T13:49:52Z)
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code) Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models [37.8941430624661]
This study delves into the potential of large language models (LLMs) for binary code comprehension. We present BinSum, a comprehensive benchmark and dataset of over 557K binary functions. We also propose a new semantic similarity metric that surpasses traditional exact-match approaches.
arXiv Detail & Related papers (2023-12-15T08:32:28Z)
CP-BCS: Binary Code Summarization Guided by Control Flow Graph and Pseudo Code [79.87518649544405]
We present a control flow graph and pseudo code guided binary code summarization framework called CP-BCS. CP-BCS utilizes a bidirectional instruction-level control flow graph and pseudo code that incorporates expert knowledge to learn the comprehensive binary function execution behavior and logic semantics.
arXiv Detail & Related papers (2023-10-24T14:20:39Z)
Leveraging Artificial Intelligence on Binary Code Comprehension [5.236023714727536]
We propose to develop Artificial Intelligence (AI) models that aid human comprehension of binary code. Specifically, we propose to incorporate domain knowledge from large corpora of source code (e.g., variable names, comments) to build AI models that capture a generalizable representation of binary code. Lastly, we will investigate metrics to assess the performance of models that apply to binary code by using human studies of comprehension.
arXiv Detail & Related papers (2022-10-11T02:39:29Z)
Pre-Training Representations of Binary Code Using Contrastive Learning [13.570375923483452]
We propose a COntrastive learning Model for Binary cOde Analysis, or COMBO, that incorporates source code and comment information into binary code during representation learning. COMBO is the first language representation model that incorporates source code, binary code, and comments into contrastive code representation learning.
arXiv Detail & Related papers (2022-10-11T02:39:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.