Enhancing Webshell Detection With Deep Learning-Powered Methods
- URL: http://arxiv.org/abs/2412.05532v1
- Date: Sat, 07 Dec 2024 04:26:36 GMT
- Title: Enhancing Webshell Detection With Deep Learning-Powered Methods
- Authors: Ha L. Viet, On V. Phung, Hoa N. Nguyen,
- Abstract summary: Webshell attacks are becoming more common, requiring robust detection mechanisms to protect web applications.
The dissertation proposes ASAF, an advanced DL-Powered Source-Code Scanning Framework that uses signature-based methods and deep learning algorithms to detect known and unknown webshells.
Second, the dissertation introduces a deep neural network that detects webshells using real-time HTTP traffic analysis of web applications.
- Score: 0.6390468088226495
- License:
- Abstract: Webshell attacks are becoming more common, requiring robust detection mechanisms to protect web applications. The dissertation clearly states two research directions: scanning web application source code and analyzing HTTP traffic to detect webshells. First, the dissertation proposes ASAF, an advanced DL-Powered Source-Code Scanning Framework that uses signature-based methods and deep learning algorithms to detect known and unknown webshells. We designed the framework to enable programming language-specific detection models. The dissertation used PHP for interpreted language and ASP.NET for compiled language to build a complete ASAF-based model for experimentation and comparison with other research results to prove its efficacy. Second, the dissertation introduces a deep neural network that detects webshells using real-time HTTP traffic analysis of web applications. The study proposes an algorithm to improve the deep learning model's loss function to address data imbalance. We tested and compared the model to other studies on the CSE-CIC-IDS2018 dataset to prove its efficacy. We integrated the model with NetIDPS to improve webshell identification. Automatically blacklist attack source IPs and block URIs querying webshells on the web server to prevent these attacks.
Related papers
- PhishNet: A Phishing Website Detection Tool using XGBoost [1.777434178384403]
PhisNet is a cutting-edge web application designed to detect phishing websites using advanced machine learning.
It aims to help individuals and organizations identify and prevent phishing attacks through a robust AI framework.
arXiv Detail & Related papers (2024-06-29T21:31:13Z) - AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation [54.17246674188208]
Web scraping is a powerful technique that extracts data from websites, enabling automated data collection, enhancing data analysis capabilities, and minimizing manual data entry efforts.
Existing methods, wrappers-based methods suffer from limited adaptability and scalability when faced with a new website.
We introduce the paradigm of generating web scrapers with large language models (LLMs) and propose AutoScraper, a two-stage framework that can handle diverse and changing web environments more efficiently.
arXiv Detail & Related papers (2024-04-19T09:59:44Z) - Large Language Models are Few-shot Generators: Proposing Hybrid Prompt Algorithm To Generate Webshell Escape Samples [1.6223257916285212]
We propose the Hybrid Prompt algorithm for webshell escape sample generation with the help of large language models.
As a prompt algorithm specifically developed for webshell sample generation, the Hybrid Prompt algorithm not only combines various prompt ideas including Chain of Thought, Tree of Thought, but also incorporates various components such as webshell hierarchical module.
Experimental results show that the Hybrid Prompt algorithm can work with multiple LLMs with excellent code reasoning ability to generate high-quality webshell samples.
arXiv Detail & Related papers (2024-02-12T04:59:58Z) - Neural Embeddings for Web Testing [49.66745368789056]
Existing crawlers rely on app-specific, threshold-based, algorithms to assess state equivalence.
We propose WEBEMBED, a novel abstraction function based on neural network embeddings and threshold-free classifiers.
Our evaluation on nine web apps shows that WEBEMBED outperforms state-of-the-art techniques by detecting near-duplicates more accurately.
arXiv Detail & Related papers (2023-06-12T19:59:36Z) - The Web Can Be Your Oyster for Improving Large Language Models [98.72358969495835]
Large language models (LLMs) encode a large amount of world knowledge.
We consider augmenting LLMs with the large-scale web using search engine.
We present a web-augmented LLM UNIWEB, which is trained over 16 knowledge-intensive tasks in a unified text-to-text format.
arXiv Detail & Related papers (2023-05-18T14:20:32Z) - Verifying the Robustness of Automatic Credibility Assessment [50.55687778699995]
We show that meaning-preserving changes in input text can mislead the models.
We also introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
Our experimental results show that modern large language models are often more vulnerable to attacks than previous, smaller solutions.
arXiv Detail & Related papers (2023-03-14T16:11:47Z) - Detecting Cloud-Based Phishing Attacks by Combining Deep Learning Models [0.0]
Web-based phishing attacks nowadays exploit popular cloud web hosting services and apps such as Google Sites and Typeform for hosting their attacks.
Here we investigate the effectiveness of deep learning models in detecting this class of cloud-based phishing attacks.
arXiv Detail & Related papers (2022-04-05T18:47:57Z) - A Deep Learning Approach for Ontology Enrichment from Unstructured Text [2.932750332087746]
Existing information vulnerabilities on attacks, controls, and advisories available on the web provide an opportunity to represent and perform security analytics.
Ontology enrichment algorithms based on natural language processing and ML models have issues with contextual extraction of concepts in words, phrases, and sentences.
Bidirectional LSTMs trained on a large DB dataset and Wikipedia corpus of 2.8 GB along with Universal Sentence is deployed to enrich ISO-based information security.
arXiv Detail & Related papers (2021-12-16T01:32:21Z) - OntoEnricher: A Deep Learning Approach for Ontology Enrichment from
Unstructured Text [2.707154152696381]
Existing information on vulnerabilities, controls, and advisories available on the web provides an opportunity to represent knowledge and perform analytics to mitigate some of the concerns.
This necessitates dynamic and automated enrichment of information security.
Existing ontology enrichment algorithms based on natural processing and ML models have issues with the contextual extraction of concepts in words, phrases and sentences.
arXiv Detail & Related papers (2021-02-08T09:43:05Z) - Cassandra: Detecting Trojaned Networks from Adversarial Perturbations [92.43879594465422]
In many cases, pre-trained models are sourced from vendors who may have disrupted the training pipeline to insert Trojan behaviors into the models.
We propose a method to verify if a pre-trained model is Trojaned or benign.
Our method captures fingerprints of neural networks in the form of adversarial perturbations learned from the network gradients.
arXiv Detail & Related papers (2020-07-28T19:00:40Z) - Scalable Backdoor Detection in Neural Networks [61.39635364047679]
Deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
We propose a novel trigger reverse-engineering based approach whose computational complexity does not scale with the number of labels, and is based on a measure that is both interpretable and universal across different network and patch types.
In experiments, we observe that our method achieves a perfect score in separating Trojaned models from pure models, which is an improvement over the current state-of-the art method.
arXiv Detail & Related papers (2020-06-10T04:12:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.