Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report
- URL: http://arxiv.org/abs/2504.21039v1
- Date: Mon, 28 Apr 2025 08:41:12 GMT
- Title: Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report
- Authors: Paul Kassianik, Baturay Saglam, Alexander Chen, Blaine Nelson, Anu Vellore, Massimo Aufiero, Fraser Burch, Dhruv Kedia, Avi Zohary, Sajana Weerawardhena, Aman Priyanshu, Adam Swanda, Amy Chang, Hyrum Anderson, Kojin Oshiba, Omar Santos, Yaron Singer, Amin Karbasi,
- Abstract summary: We present Foundation-Sec-8B, a cybersecurity-focused large language model (LLMs) built on the Llama 3.1 architecture.<n>We evaluate it across both established and new cybersecurity benchmarks, showing that it matches Llama 3.1-70B and GPT-4o-mini in certain cybersecurity-specific tasks.<n>By releasing our model to the public, we aim to accelerate progress and adoption of AI-driven tools in both public and private cybersecurity contexts.
- Score: 50.268821168513654
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As transformer-based large language models (LLMs) increasingly permeate society, they have revolutionized domains such as software engineering, creative writing, and digital arts. However, their adoption in cybersecurity remains limited due to challenges like scarcity of specialized training data and complexity of representing cybersecurity-specific knowledge. To address these gaps, we present Foundation-Sec-8B, a cybersecurity-focused LLM built on the Llama 3.1 architecture and enhanced through continued pretraining on a carefully curated cybersecurity corpus. We evaluate Foundation-Sec-8B across both established and new cybersecurity benchmarks, showing that it matches Llama 3.1-70B and GPT-4o-mini in certain cybersecurity-specific tasks. By releasing our model to the public, we aim to accelerate progress and adoption of AI-driven tools in both public and private cybersecurity contexts.
Related papers
- The Digital Cybersecurity Expert: How Far Have We Come? [49.89857422097055]
We develop CSEBenchmark, a fine-grained cybersecurity evaluation framework based on 345 knowledge points expected of cybersecurity experts.<n>We evaluate 12 popular large language models (LLMs) on CSEBenchmark and find that even the best-performing model achieves only 85.42% overall accuracy.<n>By identifying and addressing specific knowledge gaps in each LLM, we achieve up to an 84% improvement in correcting previously incorrect predictions.
arXiv Detail & Related papers (2025-04-16T05:36:28Z) - Global Challenge for Safe and Secure LLMs Track 1 [57.08717321907755]
The Global Challenge for Safe and Secure Large Language Models (LLMs) is a pioneering initiative organized by AI Singapore (AISG) and the CyberSG R&D Programme Office (CRPO)
This paper introduces the Global Challenge for Safe and Secure Large Language Models (LLMs), a pioneering initiative organized by AI Singapore (AISG) and the CyberSG R&D Programme Office (CRPO) to foster the development of advanced defense mechanisms against automated jailbreaking attacks.
arXiv Detail & Related papers (2024-11-21T08:20:31Z) - CyberPal.AI: Empowering LLMs with Expert-Driven Cybersecurity Instructions [0.2999888908665658]
Large Language Models (LLMs) have significantly advanced natural language processing (NLP) capabilities, providing versatile capabilities across various applications.
However, their application to complex, domain-specific tasks, such as cyber-security, often faces substantial challenges.
In this study, we introduce SecKnowledge and CyberPal.AI to address these challenges and train security-expert LLMs.
arXiv Detail & Related papers (2024-08-17T22:37:39Z) - Generative AI in Cybersecurity: A Comprehensive Review of LLM Applications and Vulnerabilities [1.0974825157329373]
This paper provides a comprehensive review of the future of cybersecurity through Generative AI and Large Language Models (LLMs)<n>We explore LLM applications across various domains, including hardware design security, intrusion detection, software engineering, design verification, cyber threat intelligence, malware detection, and phishing detection.<n>We present an overview of LLM evolution and its current state, focusing on advancements in models such as GPT-4, GPT-3.5, Mixtral-8x7B, BERT, Falcon2, and LLaMA.
arXiv Detail & Related papers (2024-05-21T13:02:27Z) - SEvenLLM: Benchmarking, Eliciting, and Enhancing Abilities of Large Language Models in Cyber Threat Intelligence [27.550484938124193]
This paper introduces a framework to benchmark, elicit, and improve cybersecurity incident analysis and response abilities.
We create a high-quality bilingual instruction corpus by crawling cybersecurity raw text from cybersecurity websites.
The instruction dataset SEvenLLM-Instruct is used to train cybersecurity LLMs with the multi-task learning objective.
arXiv Detail & Related papers (2024-05-06T13:17:43Z) - The Security and Privacy of Mobile Edge Computing: An Artificial Intelligence Perspective [64.36680481458868]
Mobile Edge Computing (MEC) is a new computing paradigm that enables cloud computing and information technology (IT) services to be delivered at the network's edge.
This paper provides a survey of security and privacy in MEC from the perspective of Artificial Intelligence (AI)
We focus on new security and privacy issues, as well as potential solutions from the viewpoints of AI.
arXiv Detail & Related papers (2024-01-03T07:47:22Z) - Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models [41.068780235482514]
This paper presents CyberSecEval, a comprehensive benchmark developed to help bolster the cybersecurity of Large Language Models (LLMs) employed as coding assistants.
CyberSecEval provides a thorough evaluation of LLMs in two crucial security domains: their propensity to generate insecure code and their level of compliance when asked to assist in cyberattacks.
arXiv Detail & Related papers (2023-12-07T22:07:54Z) - Graph Mining for Cybersecurity: A Survey [61.505995908021525]
The explosive growth of cyber attacks nowadays, such as malware, spam, and intrusions, caused severe consequences on society.
Traditional Machine Learning (ML) based methods are extensively used in detecting cyber threats, but they hardly model the correlations between real-world cyber entities.
With the proliferation of graph mining techniques, many researchers investigated these techniques for capturing correlations between cyber entities and achieving high performance.
arXiv Detail & Related papers (2023-04-02T08:43:03Z) - Recognizing and Extracting Cybersecurtity-relevant Entities from Text [1.7499351967216343]
Cyber Threat Intelligence (CTI) is information describing threat vectors, vulnerabilities, and attacks.
CTI is often used as training data for AI-based cyber defense systems such as Cybersecurity Knowledge Graphs (CKG)
arXiv Detail & Related papers (2022-08-02T18:44:06Z) - Proceedings of the Artificial Intelligence for Cyber Security (AICS)
Workshop at AAAI 2022 [55.573187938617636]
The workshop will focus on the application of AI to problems in cyber security.
Cyber systems generate large volumes of data, utilizing this effectively is beyond human capabilities.
arXiv Detail & Related papers (2022-02-28T18:27:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.