Static Semantics Reconstruction for Enhancing JavaScript-WebAssembly Multilingual Malware Detection
- URL: http://arxiv.org/abs/2310.17304v2
- Date: Fri, 19 Apr 2024 08:55:03 GMT
- Title: Static Semantics Reconstruction for Enhancing JavaScript-WebAssembly Multilingual Malware Detection
- Authors: Yifan Xia, Ping He, Xuhong Zhang, Peiyu Liu, Shouling Ji, Wenhai Wang,
- Abstract summary: WebAssembly allows attackers to hide the malicious functionalities of JavaScript malware in cross-language interoperations.
The detection of JavaScript-WebAssembly multilingual malware (JWMM) is challenging due to the complex interoperations and semantic diversity between JavaScript and WebAssembly.
We present JWBinder, the first technique aimed at enhancing the static detection of JWMM.
- Score: 51.15122099046214
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The emergence of WebAssembly allows attackers to hide the malicious functionalities of JavaScript malware in cross-language interoperations, termed JavaScript-WebAssembly multilingual malware (JWMM). However, existing anti-virus solutions based on static program analysis are still limited to monolingual code. As a result, their detection effectiveness decreases significantly against JWMM. The detection of JWMM is challenging due to the complex interoperations and semantic diversity between JavaScript and WebAssembly. To bridge this gap, we present JWBinder, the first technique aimed at enhancing the static detection of JWMM. JWBinder performs a language-specific data-flow analysis to capture the cross-language interoperations and then characterizes the functionalities of JWMM through a unified high-level structure called Inter-language Program Dependency Graph. The extensive evaluation on one of the most representative real-world anti-virus platforms, VirusTotal, shows that \system effectively enhances anti-virus systems from various vendors and increases the overall successful detection rate against JWMM from 49.1\% to 86.2\%. Additionally, we assess the side effects and runtime overhead of JWBinder, corroborating its practical viability in real-world applications.
Related papers
- AlignXIE: Improving Multilingual Information Extraction by Cross-Lingual Alignment [62.69772800910482]
AlignXIE formulates IE across different languages, especially non-English ones, as code generation tasks.
It incorporates an IE cross-lingual alignment phase through a translated instance prediction task.
It surpasses ChatGPT by $30.17%$ and SoTA by $20.03%$, thereby demonstrating superior cross-lingual IE capabilities.
arXiv Detail & Related papers (2024-11-07T15:36:05Z) - Fakeium: A Dynamic Execution Environment for JavaScript Program Analysis [3.7980955101286322]
Fakeium is a novel, open source, and lightweight execution environment designed for efficient, large-scale dynamic analysis of JavaScript programs.
Fakeium complements traditional static analysis by providing additional API calls and string literals.
Fakeium's flexibility and ability to detect hidden API calls, especially in obfuscated sources, highlights its potential as a valuable tool for security analysts to detect malicious behavior.
arXiv Detail & Related papers (2024-10-28T09:27:26Z) - EMMA: Efficient Visual Alignment in Multi-Modal LLMs [56.03417732498859]
EMMA is a lightweight cross-modality module designed to efficiently fuse visual and textual encodings.
EMMA boosts performance across multiple tasks by up to 9.3% while significantly improving robustness against hallucinations.
arXiv Detail & Related papers (2024-10-02T23:00:31Z) - $\textit{MMJ-Bench}$: A Comprehensive Study on Jailbreak Attacks and Defenses for Multimodal Large Language Models [11.02754617539271]
We introduce textitMMJ-Bench, a unified pipeline for evaluating jailbreak attacks and defense techniques for MLLMs.
We assess the effectiveness of various attack methods against SoTA MLLMs and evaluate the impact of defense mechanisms on both defense effectiveness and model utility.
arXiv Detail & Related papers (2024-08-16T00:18:23Z) - Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System Calls [3.5698678013121334]
This work presents a novel framework leveraging large language models (LLMs) to classify malware based on system call data.
Experiments with a dataset of over 1TB of system calls demonstrate that models with larger context sizes, such as BigBird and Longformer, achieve superior accuracy and F1-Score of approximately 0.86.
This approach shows significant potential for real-time detection in high-stakes environments, offering a robust solution to evolving cyber threats.
arXiv Detail & Related papers (2024-05-15T13:19:43Z) - Backdoor Attack on Multilingual Machine Translation [53.28390057407576]
multilingual machine translation (MNMT) systems have security vulnerabilities.
An attacker injects poisoned data into a low-resource language pair to cause malicious translations in other languages.
This type of attack is of particular concern, given the larger attack surface of languages inherent to low-resource settings.
arXiv Detail & Related papers (2024-04-03T01:32:31Z) - AIM: Automated Input Set Minimization for Metamorphic Security Testing [9.232277700524786]
We propose AIM, an approach that automatically selects inputs to reduce testing costs while preserving vulnerability detection capabilities.
AIM includes a clustering-based black-box approach, to identify similar inputs based on their security properties.
It also relies on a novel genetic algorithm to efficiently select diverse inputs while minimizing their total cost.
arXiv Detail & Related papers (2024-02-16T15:54:58Z) - DIALIGHT: Lightweight Multilingual Development and Evaluation of
Task-Oriented Dialogue Systems with Large Language Models [76.79929883963275]
DIALIGHT is a toolkit for developing and evaluating multilingual Task-Oriented Dialogue (ToD) systems.
It features a secure, user-friendly web interface for fine-grained human evaluation at both local utterance level and global dialogue level.
Our evaluations reveal that while PLM fine-tuning leads to higher accuracy and coherence, LLM-based systems excel in producing diverse and likeable responses.
arXiv Detail & Related papers (2024-01-04T11:27:48Z) - On the Effectiveness of Adversarial Samples against Ensemble
Learning-based Windows PE Malware Detectors [0.0]
We propose a mutation system to counteract ensemble learning-based detectors by combining GANs and an RL model.
In the FeaGAN model, ensemble learning is utilized to enhance the malware detector's evasion ability, with the generated adversarial patterns.
arXiv Detail & Related papers (2023-09-25T02:57:27Z) - Adversarial EXEmples: A Survey and Experimental Evaluation of Practical
Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that does not only encompass and generalize previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.