Pack-A-Mal: A Malware Analysis Framework for Open-Source Packages
- URL: http://arxiv.org/abs/2511.09957v1
- Date: Fri, 14 Nov 2025 01:21:31 GMT
- Title: Pack-A-Mal: A Malware Analysis Framework for Open-Source Packages
- Authors: Duc-Ly Vu, Thanh-Cong Nguyen, Minh-Khanh Vu, Ngoc-Thanh Nguyen, Kim-Anh Do Thi,
- Abstract summary: This paper highlights that dynamic analysis, rather than static analysis, provides greater insight but is also more resource-intensive for understanding software behaviour during execution. We enhance a dynamic analysis tool, package-analysis, to capture key runtime behaviours, including commands executed, files accessed, and network communications. This modification enables the use of container sandboxing technologies, such as gVisor, to analyse potentially malicious packages without significantly compromising the host system.
- Score: 2.6686157733529847
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The increasingly sophisticated environment in which attackers operate makes software security an even greater challenge in open-source projects, where malicious packages are prevalent. Static analysis tools, such as Malcontent, are highly useful but are often incapable of dealing with obfuscated malware. Such situations lead to an unreasonably high rate of false positives. This paper highlights that dynamic analysis, rather than static analysis, provides greater insight but is also more resource-intensive for understanding software behaviour during execution. In this study, we enhance a dynamic analysis tool, package-analysis, to capture key runtime behaviours, including commands executed, files accessed, and network communications. This modification enables the use of container sandboxing technologies, such as gVisor, to analyse potentially malicious packages without significantly compromising the host system.
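The three runtime behaviours named in the abstract (commands executed, files accessed, network communications) can be illustrated with a minimal sketch. It assumes the sandbox emits strace-style syscall lines; the `CATEGORIES` mapping, the `summarize_trace` helper, and the sample trace are hypothetical illustrations and are not taken from Pack-A-Mal or package-analysis.

```python
import re
from collections import defaultdict

# Hypothetical mapping from syscall name to behaviour category (an assumption,
# not the tool's actual configuration).
CATEGORIES = {
    "execve": "commands",
    "open": "files",
    "openat": "files",
    "connect": "network",
}

# Matches strace-style lines such as: name(args) = return_value
SYSCALL_RE = re.compile(r"^(?P<name>\w+)\((?P<args>.*)\)\s*=\s*(?P<ret>-?\d+)")
QUOTED_RE = re.compile(r'"([^"]*)"')


def summarize_trace(lines):
    """Group traced syscalls into commands, files, and network targets."""
    summary = defaultdict(list)
    for line in lines:
        match = SYSCALL_RE.match(line.strip())
        if not match:
            continue  # not a syscall line we can parse
        category = CATEGORIES.get(match["name"])
        if category is None:
            continue  # syscall outside the three behaviours of interest
        quoted = QUOTED_RE.findall(match["args"])
        # The first quoted argument is the executable, path, or address.
        detail = quoted[0] if quoted else match["args"]
        summary[category].append(detail)
    return dict(summary)


# Illustrative trace lines (made up for this sketch).
SAMPLE_TRACE = [
    'execve("/bin/sh", ["sh", "-c", "id"], 0x7ffd) = 0',
    'openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3',
    'connect(3, {sa_family=AF_INET, sin_port=htons(443), '
    'sin_addr=inet_addr("203.0.113.7")}, 16) = 0',
    "getpid() = 42",
]

if __name__ == "__main__":
    print(summarize_trace(SAMPLE_TRACE))
```

A real pipeline would likely record more detail (full argument vectors, read versus write access, DNS queries); the sketch keeps only the first quoted argument per event to stay short.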
Related papers
- Multi-Agent Taint Specification Extraction for Vulnerability Detection [49.27772068704498]
Static Application Security Testing (SAST) tools using taint analysis are widely viewed as providing higher-quality vulnerability detection results.
We present SemTaint, a multi-agent system that strategically combines the semantic understanding of Large Language Models (LLMs) with traditional static program analysis.
We integrate SemTaint with CodeQL, a state-of-the-art SAST tool, and demonstrate its effectiveness by detecting 106 of 162 vulnerabilities previously undetectable by CodeQL.
arXiv Detail & Related papers (2026-01-15T21:31:51Z) - CHASE: LLM Agents for Dissecting Malicious PyPI Packages [2.384873896423002]
Large Language Models (LLMs) offer promising capabilities for automated code analysis.
Their application to security-critical malware detection faces fundamental challenges, including hallucination and context confusion.
We present CHASE, a high-reliability multi-agent architecture that addresses these limitations.
arXiv Detail & Related papers (2026-01-11T10:06:14Z) - Towards Classifying Benign And Malicious Packages Using Machine Learning [2.8630136355252582]
Malicious open-source package detection typically requires static, dynamic analysis, or both.
Current dynamic analysis tools lack an automatic method to differentiate malicious packages from benign packages.
We propose an approach to extract the features from dynamic analysis (e.g., executed commands) and leverage machine learning techniques to automatically classify packages as benign or malicious.
arXiv Detail & Related papers (2025-11-19T01:59:11Z) - Cuckoo Attack: Stealthy and Persistent Attacks Against AI-IDE [64.47951172662745]
Cuckoo Attack is a novel attack that achieves stealthy and persistent command execution by embedding malicious payloads into configuration files.
We formalize our attack paradigm into two stages, including initial infection and persistence.
We contribute seven actionable checkpoints for vendors to evaluate their product security.
arXiv Detail & Related papers (2025-09-19T04:10:52Z) - Certifiably robust malware detectors by design [48.367676529300276]
We propose a new model architecture for robust malware detection by design.
We show that every robust detector can be decomposed into a specific structure, which can be applied to learn empirically robust malware detectors.
Our framework ERDALT is based on this structure.
arXiv Detail & Related papers (2025-08-10T09:19:29Z) - Beyond the Surface: An NLP-based Methodology to Automatically Estimate CVE Relevance for CAPEC Attack Patterns [42.63501759921809]
We propose a methodology leveraging Natural Language Processing (NLP) to associate Common Vulnerabilities and Exposures (CVE) vulnerabilities with Common Attack Pattern Enumeration and Classification (CAPEC) attack patterns.
Experimental evaluations demonstrate superior performance compared to state-of-the-art models.
arXiv Detail & Related papers (2025-01-13T08:39:52Z) - Exploring Large Language Models for Semantic Analysis and Categorization of Android Malware [0.0]
msp is designed to augment malware analysis for Android through a hierarchical-tiered summarization chain and strategic prompt engineering.
msp can achieve up to 77% classification accuracy while providing highly robust summaries at functional, class, and package levels.
arXiv Detail & Related papers (2025-01-08T21:22:45Z) - Fakeium: A Dynamic Execution Environment for JavaScript Program Analysis [3.7980955101286322]
Fakeium is a novel, open source, and lightweight execution environment designed for efficient, large-scale dynamic analysis of JavaScript programs.
Fakeium complements traditional static analysis by providing additional API calls and string literals.
Fakeium's flexibility and ability to detect hidden API calls, especially in obfuscated sources, highlights its potential as a valuable tool for security analysts to detect malicious behavior.
arXiv Detail & Related papers (2024-10-28T09:27:26Z) - The Impact of SBOM Generators on Vulnerability Assessment in Python: A Comparison and a Novel Approach [56.4040698609393]
Software Bill of Materials (SBOM) has been promoted as a tool to increase transparency and verifiability in software composition.
Current SBOM generation tools often suffer from inaccuracies in identifying components and dependencies.
We propose PIP-sbom, a novel pip-inspired solution that addresses their shortcomings.
arXiv Detail & Related papers (2024-09-10T10:12:37Z) - JITScanner: Just-in-Time Executable Page Check in the Linux Operating System [6.725792100548271]
JITScanner is developed as a Linux-oriented package built upon a Loadable Kernel Module (LKM).
It integrates a user-level component that communicates efficiently with the LKM using scalable multi-processor/core technology.
JITScanner's effectiveness in detecting malware programs and its minimal intrusion in normal runtime scenarios have been extensively tested.
arXiv Detail & Related papers (2024-04-25T17:00:08Z) - Unveiling the Invisible: Detection and Evaluation of Prototype Pollution Gadgets with Dynamic Taint Analysis [4.8966278983718405]
This paper proposes Dasty, the first semi-automated pipeline to help developers identify gadgets in their applications' software supply chain.
Dasty targets server-side Node.js applications and relies on an enhancement of dynamic taint analysis.
We use Dasty in a study of the most dependent-upon NPM packages to analyze the presence of gadgets leading to ACE.
arXiv Detail & Related papers (2023-11-07T11:55:40Z) - A survey on hardware-based malware detection approaches [45.24207460381396]
Hardware-based malware detection approaches leverage hardware performance counters and machine learning prowess.
We meticulously analyze the approach, unraveling the most common methods, algorithms, tools, and datasets that shape its contours.
The discussion extends to crafting mixed hardware and software approaches for collaborative efficacy, essential enhancements in hardware monitoring units, and a better understanding of the correlation between hardware events and malware applications.
arXiv Detail & Related papers (2023-03-22T13:00:41Z) - Towards an Automated Pipeline for Detecting and Classifying Malware through Machine Learning [0.0]
We propose a malware taxonomic classification pipeline able to classify Windows Portable Executable files (PEs).
Given an input PE sample, it is first classified as either malicious or benign.
If malicious, the pipeline further analyzes it in order to establish its threat type, family, and behavior(s).
arXiv Detail & Related papers (2021-06-10T10:07:50Z) - Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that not only encompasses and generalizes previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.