Related papers: LibAM: An Area Matching Framework for Detecting Third-party Libraries in Binaries

URL: http://arxiv.org/abs/2305.04026v3
Date: Tue, 12 Sep 2023 06:51:56 GMT
Title: LibAM: An Area Matching Framework for Detecting Third-party Libraries in Binaries
Authors: Siyuan Li, Yongpan Wang, Chaopeng Dong, Shouguo Yang, Hong Li, Hao Sun, Zhe Lang, Zuxin Chen, Weijie Wang, Hongsong Zhu, Limin Sun
Abstract summary: Third-party libraries (TPLs) are utilized by developers to expedite the software development process and incorporate external functionalities. Insecure TPL reuse can lead to significant security risks. We introduce LibAM, a novel Area Matching framework that connects isolated functions into function areas on Function Call Graph.
Score: 28.877355564114904
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Third-party libraries (TPLs) are extensively utilized by developers to expedite the software development process and incorporate external functionalities. Nevertheless, insecure TPL reuse can lead to significant security risks. Existing methods, which involve extracting strings or conducting function matching, are employed to determine the presence of TPL code in the target binary. However, these methods often yield unsatisfactory results due to the recurrence of strings and the presence of numerous similar non-homologous functions. Additionally, they struggle to identify specific pieces of reused code in the target binary, complicating the detection of complex reuse relationships and impeding downstream tasks. In this paper, we observe that TPL reuse typically involves not just isolated functions but also areas encompassing several adjacent functions on the Function Call Graph (FCG). We introduce LibAM, a novel Area Matching framework that connects isolated functions into function areas on the FCG and detects TPLs by comparing the similarity of these function areas. Furthermore, LibAM is the first approach capable of detecting the exact reuse areas on the FCG, offering substantial benefits for downstream tasks. Experimental results demonstrate that LibAM outperforms all existing TPL detection methods and provides interpretable evidence for TPL detection results by identifying exact reuse areas. We also evaluate LibAM's accuracy on large-scale, real-world binaries in IoT firmware and generate a list of potential vulnerabilities for these devices. Last but not least, by analyzing the detection results of IoT firmware, we make several interesting findings, such as that different target binaries tend to reuse the same code area of a TPL.
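
The area-matching idea in the abstract can be illustrated with a toy sketch. This is not LibAM's actual algorithm: it assumes the call graphs are plain dictionaries, that some function-level matcher has already produced a set of similar function pairs (the `similar` input here is hypothetical), and that an area grows only along caller-to-callee edges. The names `grow_area`, `detect_reuse`, and `min_area_size` are illustrative and do not come from the paper.

```python
from collections import deque


def grow_area(target_fcg, tpl_fcg, similar, seed):
    """Grow a matched area from one seed pair (target_fn, tpl_fn).

    target_fcg / tpl_fcg: dict mapping a function name to the set of
    functions it calls (a simplified Function Call Graph).
    similar: set of (target_fn, tpl_fn) pairs judged similar by some
    function-level matcher (hypothetical input for this sketch).
    """
    area = {seed}
    used_target, used_tpl = {seed[0]}, {seed[1]}
    queue = deque([seed])
    while queue:
        t_fn, l_fn = queue.popleft()
        # Try to pair each callee in the target with an unused, similar
        # callee in the library, so the area follows both call graphs.
        for t_callee in sorted(target_fcg.get(t_fn, ())):
            if t_callee in used_target:
                continue
            for l_callee in sorted(tpl_fcg.get(l_fn, ())):
                if l_callee in used_tpl:
                    continue
                if (t_callee, l_callee) in similar:
                    area.add((t_callee, l_callee))
                    used_target.add(t_callee)
                    used_tpl.add(l_callee)
                    queue.append((t_callee, l_callee))
                    break
    return area


def detect_reuse(target_fcg, tpl_fcg, similar, min_area_size=3):
    """Return the largest matched area, or an empty set if it is too small."""
    best = set()
    for seed in sorted(similar):
        area = grow_area(target_fcg, tpl_fcg, similar, seed)
        if len(area) > len(best):
            best = area
    return best if len(best) >= min_area_size else set()


# Toy call graphs: t_a/t_b/t_c mirror l_a/l_b/l_c; t_x vs l_z is an
# isolated function-level match with no matching neighbours.
target_fcg = {"t_main": {"t_a", "t_x"}, "t_a": {"t_b", "t_c"},
              "t_b": set(), "t_c": set(), "t_x": set()}
tpl_fcg = {"l_a": {"l_b", "l_c"}, "l_b": set(), "l_c": set(), "l_z": set()}
similar = {("t_a", "l_a"), ("t_b", "l_b"), ("t_c", "l_c"), ("t_x", "l_z")}

reused = detect_reuse(target_fcg, tpl_fcg, similar)
isolated_only = detect_reuse(target_fcg, tpl_fcg, {("t_x", "l_z")})
```

In this toy input, the three connected pairs form one area that passes the size threshold, while the isolated pair on its own is rejected. That mirrors the abstract's point that similar but non-homologous functions are less likely to be mistaken for reuse when adjacent functions on the FCG must also match.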

Related papers

BinCoFer: Three-Stage Purification for Effective C/C++ Binary Third-Party Library Detection [3.406168883492101]
Third-party libraries (TPLs) are becoming increasingly popular for achieving efficient and concise software development. Unregulated use of TPLs will introduce legal and security issues in software development. BinCoFer is a tool designed for detecting TPLs reused in binary programs.
arXiv Detail & Related papers (2025-04-28T07:57:42Z)
Version-level Third-Party Library Detection in Android Applications via Class Structural Similarity [3.8381968290928596]
We propose SAD, a TPL detection tool with high version-level detection performance. SAD achieves F1 scores of 97.64% and 84.82% for library-level and version-level detection on obfuscated apps.
arXiv Detail & Related papers (2025-04-18T08:24:32Z)
Enhancing Security in Third-Party Library Reuse -- Comprehensive Detection of 1-day Vulnerability through Code Patch Analysis [8.897599530972638]
Third-party libraries (TPLs) can introduce vulnerabilities (known as 1-day vulnerabilities) because of the low maintenance of TPLs. VULTURE aims at identifying 1-day vulnerabilities that arise from the reuse of vulnerable TPLs. VULTURE successfully identified 175 vulnerabilities from 178 reused TPLs.
arXiv Detail & Related papers (2024-11-29T12:02:28Z)
Renormalized Connection for Scale-preferred Object Detection in Satellite Imagery [51.83786195178233]
We design a Knowledge Discovery Network (KDN) to implement the renormalization group theory in terms of efficient feature extraction. Renormalized connection (RC) on the KDN enables "synergistic focusing" of multi-scale features. RCs extend the multi-level feature's "divide-and-conquer" mechanism of the FPN-based detectors to a wide range of scale-preferred tasks.
arXiv Detail & Related papers (2024-09-09T13:56:22Z)
Bi-Directional Transformers vs. word2vec: Discovering Vulnerabilities in Lifted Compiled Code [4.956066467858057]
This research explores vulnerability detection using natural language processing (NLP) embedding techniques with word2vec, BERT, and RoBERTa. Long short-term memory (LSTM) neural networks were trained on embeddings from encoders created using approximately 48k LLVM functions from the Juliet dataset.
arXiv Detail & Related papers (2024-05-31T03:57:19Z)
Cross-domain Learning Framework for Tracking Users in RIS-aided Multi-band ISAC Systems with Sparse Labeled Data [55.70071704247794]
Integrated sensing and communications (ISAC) is pivotal for 6G communications and is boosted by the rapid development of reconfigurable intelligent surfaces (RISs). This paper proposes the X2Track framework, where we model the tracking function by a hierarchical architecture, jointly utilizing multi-modal CSI indicators across multiple bands, and optimize it in a cross-domain manner. Under X2Track, we design an efficient deep learning algorithm to minimize tracking errors, based on transformer neural networks and adversarial learning techniques.
arXiv Detail & Related papers (2024-05-10T08:04:27Z)
FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [54.27040631527217]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries. We first build a binary large language model (FoC-BinLLM) to summarize the semantics of cryptographic functions in natural language. We then build a binary code similarity model (FoC-Sim) upon the FoC-BinLLM to create change-sensitive representations and use it to retrieve similar implementations of unknown cryptographic functions in a database.
arXiv Detail & Related papers (2024-03-27T09:45:33Z)
Cross-Inlining Binary Function Similarity Detection [16.923959153965857]
We propose a pattern-based model named CI-Detector for cross-inlining matching. Results show that CI-Detector can detect cross-inlining pairs with a precision of 81% and a recall of 97%, which exceeds all state-of-the-art works.
arXiv Detail & Related papers (2024-01-11T08:42:08Z)
Exploring Incompatible Knowledge Transfer in Few-shot Image Generation [107.81232567861117]
Few-shot image generation learns to generate diverse and high-fidelity images from a target domain using a few reference samples. Existing FSIG methods select, preserve and transfer prior knowledge from a source generator to learn the target generator. We propose knowledge truncation, which is a complementary operation to knowledge preservation and is implemented by a lightweight pruning-based method.
arXiv Detail & Related papers (2023-04-15T14:57:15Z)
Benchmarking Deep Models for Salient Object Detection [67.07247772280212]
We construct a general SALient Object Detection (SALOD) benchmark to conduct a comprehensive comparison among several representative SOD methods. In the above experiments, we find that existing loss functions are usually specialized in some metrics but report inferior results on the others. We propose a novel Edge-Aware (EA) loss that promotes deep networks to learn more discriminative features by integrating both pixel- and image-level supervision signals.
arXiv Detail & Related papers (2022-02-07T03:43:16Z)
A Cascaded Zoom-In Network for Patterned Fabric Defect Detection [8.789819609485225]
We propose a two-step Cascaded Zoom-In Network (CZI-Net) for patterned fabric defect detection. In the CZI-Net, the Aggregated HOG (A-HOG) and SIFT features are used instead of simple convolution filters for feature extraction. Experiments on real-world datasets demonstrate that our proposed method is not only computationally simple but also achieves high detection accuracy.
arXiv Detail & Related papers (2021-08-15T15:29:26Z)
On using distributed representations of source code for the detection of C security vulnerabilities [14.8831988481175]
This paper presents an evaluation of the code representation model Code2vec when trained on the task of detecting security vulnerabilities in C source code. We leverage the open-source library astminer to extract path-contexts from the abstract syntax trees of a corpus of labeled C functions. Code2vec is trained on the resulting path-contexts with the task of classifying a function as vulnerable or non-vulnerable.
arXiv Detail & Related papers (2021-06-01T21:18:23Z)
LabelEnc: A New Intermediate Supervision Method for Object Detection [78.74368141062797]
We propose a new intermediate supervision method, named LabelEnc, to boost the training of object detection systems. The key idea is to introduce a novel label encoding function, mapping the ground-truth labels into a latent embedding. Experiments show our method improves a variety of detection systems by around 2% on the COCO dataset.
arXiv Detail & Related papers (2020-07-07T08:55:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.