Feasibility Study for Supporting Static Malware Analysis Using LLM
- URL: http://arxiv.org/abs/2411.14905v1
- Date: Fri, 22 Nov 2024 13:03:07 GMT
- Title: Feasibility Study for Supporting Static Malware Analysis Using LLM
- Authors: Shota Fujii, Rei Yamagishi,
- Abstract summary: Large language models (LLMs) are becoming more advanced and widespread.
This study focuses on whether an LLM can be used to support static analysis.
- Score: 0.8057006406834466
- License:
- Abstract: Large language models (LLMs) are becoming more advanced and widespread and have shown their applicability to various domains, including cybersecurity. Static malware analysis is one of the most important tasks in cybersecurity; however, it is time-consuming and requires a high level of expertise. Therefore, we conducted a demonstration experiment focusing on whether an LLM can be used to support static analysis. First, we evaluated the ability of the LLM to explain malware functionality. The results showed that the LLM can generate descriptions that cover functions with an accuracy of up to 90.9\%. In addition, we asked six static analysts to perform a pseudo static analysis task using LLM explanations to verify that the LLM can be used in practice. Through subsequent questionnaires and interviews with the participants, we also demonstrated the practical applicability of LLMs. Lastly, we summarized the problems and required functions when using an LLM as static analysis support, as well as recommendations for future research opportunities.
Related papers
- The LLM Effect: Are Humans Truly Using LLMs, or Are They Being Influenced By Them Instead? [60.01746782465275]
Large Language Models (LLMs) have shown capabilities close to human performance in various analytical tasks.
This paper investigates the efficiency and accuracy of LLMs in specialized tasks through a structured user study focusing on Human-LLM partnership.
arXiv Detail & Related papers (2024-10-07T02:30:18Z) - Harnessing LLMs for Automated Video Content Analysis: An Exploratory Workflow of Short Videos on Depression [8.640838598568605]
We conduct a case study that followed a new workflow of Large Language Models (LLMs)-assisted multimodal content analysis.
To test LLM's video annotation capabilities, we analyzed 203s extracted from 25 short videos about depression.
arXiv Detail & Related papers (2024-06-27T21:03:56Z) - Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks.
LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning.
We introduce Q*, a framework for guiding LLMs decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z) - Multitask-based Evaluation of Open-Source LLM on Software Vulnerability [2.7692028382314815]
This paper proposes a pipeline for quantitatively evaluating interactive Large Language Models (LLMs) using publicly available datasets.
We carry out an extensive technical evaluation of LLMs using Big-Vul covering four different common software vulnerability tasks.
We find that the existing state-of-the-art approaches and pre-trained Language Models (LMs) are generally superior to LLMs in software vulnerability detection.
arXiv Detail & Related papers (2024-04-02T15:52:05Z) - The Emergence of Large Language Models in Static Analysis: A First Look
through Micro-Benchmarks [3.848607479075651]
We investigate the role that current Large Language Models (LLMs) can play in improving callgraph analysis and type inference for Python programs.
Our study reveals that LLMs show promising results in type inference, demonstrating higher accuracy than traditional methods, yet they exhibit limitations in callgraph analysis.
arXiv Detail & Related papers (2024-02-27T16:53:53Z) - LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z) - T-Eval: Evaluating the Tool Utilization Capability of Large Language
Models Step by Step [69.64348626180623]
Large language models (LLM) have achieved remarkable performance on various NLP tasks.
How to evaluate and analyze the tool-utilization capability of LLMs is still under-explored.
We introduce T-Eval to evaluate the tool utilization capability step by step.
arXiv Detail & Related papers (2023-12-21T17:02:06Z) - Survey on Factuality in Large Language Models: Knowledge, Retrieval and
Domain-Specificity [61.54815512469125]
This survey addresses the crucial issue of factuality in Large Language Models (LLMs)
As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital.
arXiv Detail & Related papers (2023-10-11T14:18:03Z) - Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks.
We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.