Static Semantics Reconstruction for Enhancing JavaScript-WebAssembly Multilingual Malware Detection
- URL: http://arxiv.org/abs/2310.17304v2
- Date: Fri, 19 Apr 2024 08:55:03 GMT
- Title: Static Semantics Reconstruction for Enhancing JavaScript-WebAssembly Multilingual Malware Detection
- Authors: Yifan Xia, Ping He, Xuhong Zhang, Peiyu Liu, Shouling Ji, Wenhai Wang,
- Abstract summary: WebAssembly allows attackers to hide the malicious functionalities of JavaScript malware in cross-language interoperations.
The detection of JavaScript-WebAssembly multilingual malware (JWMM) is challenging due to the complex interoperations and semantic diversity between JavaScript and WebAssembly.
We present JWBinder, the first technique aimed at enhancing the static detection of JWMM.
- Score: 51.15122099046214
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The emergence of WebAssembly allows attackers to hide the malicious functionalities of JavaScript malware in cross-language interoperations, termed JavaScript-WebAssembly multilingual malware (JWMM). However, existing anti-virus solutions based on static program analysis are still limited to monolingual code. As a result, their detection effectiveness decreases significantly against JWMM. The detection of JWMM is challenging due to the complex interoperations and semantic diversity between JavaScript and WebAssembly. To bridge this gap, we present JWBinder, the first technique aimed at enhancing the static detection of JWMM. JWBinder performs a language-specific data-flow analysis to capture the cross-language interoperations and then characterizes the functionalities of JWMM through a unified high-level structure called Inter-language Program Dependency Graph. The extensive evaluation on one of the most representative real-world anti-virus platforms, VirusTotal, shows that \system effectively enhances anti-virus systems from various vendors and increases the overall successful detection rate against JWMM from 49.1\% to 86.2\%. Additionally, we assess the side effects and runtime overhead of JWBinder, corroborating its practical viability in real-world applications.
Related papers
- Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System Calls [3.5698678013121334]
This work presents a novel framework leveraging large language models (LLMs) to classify malware based on system call data.
Experiments with a dataset of over 1TB of system calls demonstrate that models with larger context sizes, such as BigBird and Longformer, achieve superior accuracy and F1-Score of approximately 0.86.
This approach shows significant potential for real-time detection in high-stakes environments, offering a robust solution to evolving cyber threats.
arXiv Detail & Related papers (2024-05-15T13:19:43Z) - AppPoet: Large Language Model based Android malware detection via multi-view prompt engineering [1.3197408989895103]
AppPoet is a multi-view system for Android malware detection.
Our method achieves a detection accuracy of 97.15% and an F1 score of 97.21%.
arXiv Detail & Related papers (2024-04-29T15:52:45Z) - Backdoor Attack on Multilingual Machine Translation [53.28390057407576]
multilingual machine translation (MNMT) systems have security vulnerabilities.
An attacker injects poisoned data into a low-resource language pair to cause malicious translations in other languages.
This type of attack is of particular concern, given the larger attack surface of languages inherent to low-resource settings.
arXiv Detail & Related papers (2024-04-03T01:32:31Z) - AIM: Automated Input Set Minimization for Metamorphic Security Testing [9.232277700524786]
We propose AIM, an approach that automatically selects inputs to reduce testing costs while preserving vulnerability detection capabilities.
AIM includes a clustering-based black box approach, to identify similar inputs based on their security properties.
It also relies on a novel genetic algorithm able to efficiently select diverse inputs while minimizing their total cost.
arXiv Detail & Related papers (2024-02-16T15:54:58Z) - DIALIGHT: Lightweight Multilingual Development and Evaluation of
Task-Oriented Dialogue Systems with Large Language Models [76.79929883963275]
DIALIGHT is a toolkit for developing and evaluating multilingual Task-Oriented Dialogue (ToD) systems.
It features a secure, user-friendly web interface for fine-grained human evaluation at both local utterance level and global dialogue level.
Our evaluations reveal that while PLM fine-tuning leads to higher accuracy and coherence, LLM-based systems excel in producing diverse and likeable responses.
arXiv Detail & Related papers (2024-01-04T11:27:48Z) - On the Effectiveness of Adversarial Samples against Ensemble
Learning-based Windows PE Malware Detectors [0.0]
We propose a mutation system to counteract ensemble learning-based detectors by combining GANs and an RL model.
In the FeaGAN model, ensemble learning is utilized to enhance the malware detector's evasion ability, with the generated adversarial patterns.
arXiv Detail & Related papers (2023-09-25T02:57:27Z) - A New Deep Boosted CNN and Ensemble Learning based IoT Malware Detection [0.0]
Security issues are threatened in various types of networks, especially in the Internet of Things (IoT) environment.
We have developed a new malware detection framework, Deep Squeezed-Boosted and Ensemble Learning (DSBEL), comprised of novel Squeezed-Boosted Boundary-Region Split-Transform-Merge (SB-BR-STM) CNN and ensemble learning.
arXiv Detail & Related papers (2022-12-15T18:14:51Z) - Multi-modal Transformers Excel at Class-agnostic Object Detection [105.10403103027306]
We argue that existing methods lack a top-down supervision signal governed by human-understandable semantics.
We develop an efficient and flexible MViT architecture using multi-scale feature processing and deformable self-attention.
We show the significance of MViT proposals in a diverse range of applications.
arXiv Detail & Related papers (2021-11-22T18:59:29Z) - FILTER: An Enhanced Fusion Method for Cross-lingual Language
Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
To tackle this issue, we propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language.
arXiv Detail & Related papers (2020-09-10T22:42:15Z) - Adversarial EXEmples: A Survey and Experimental Evaluation of Practical
Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that does not only encompass and generalize previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.