Related papers: A Survey of Data Security: Practices from Cybersecurity and Challenges of Machine Learning

Related papers

A Survey on Data Security in Large Language Models [12.23432845300652]
Large Language Models (LLMs) are a foundation in advancing natural language processing, power applications such as text generation, machine translation, and conversational systems.<n>Despite their transformative potential, these models inherently rely on massive amounts of training data, often collected from diverse and uncurated sources, which exposes them to serious data security risks.<n>Harmful or malicious data can compromise model behavior, leading to issues such as toxic output, hallucinations, and vulnerabilities to threats such as prompt injection or data poisoning.<n>This survey offers a comprehensive overview of the main data security risks facing LLMs and reviews current defense strategies, including adversarial
arXiv Detail & Related papers (2025-08-04T11:28:34Z)
A Review of Various Datasets for Machine Learning Algorithm-Based Intrusion Detection System: Advances and Challenges [0.40964539027092917]
IDS aims to protect computer networks from security threats by detecting, notifying, and taking appropriate action to prevent illegal access and protect confidential information.<n>Researchers are enhancing the effectiveness of IDS by incorporating popular datasets into machine learning algorithms.<n>This paper explores the methods of capturing and reviewing intrusion detection systems (IDS) and evaluates the challenges existing datasets face.
arXiv Detail & Related papers (2025-06-03T04:47:21Z)
Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report [50.268821168513654]
We present Foundation-Sec-8B, a cybersecurity-focused large language model (LLMs) built on the Llama 3.1 architecture. We evaluate it across both established and new cybersecurity benchmarks, showing that it matches Llama 3.1-70B and GPT-4o-mini in certain cybersecurity-specific tasks. By releasing our model to the public, we aim to accelerate progress and adoption of AI-driven tools in both public and private cybersecurity contexts.
arXiv Detail & Related papers (2025-04-28T08:41:12Z)
Large Language Models for Cyber Security: A Systematic Literature Review [14.924782327303765]
We conduct a comprehensive review of the literature on the application of Large Language Models in cybersecurity (LLM4Security) We observe that LLMs are being applied to a wide range of cybersecurity tasks, including vulnerability detection, malware analysis, network intrusion detection, and phishing detection. Third, we identify several promising techniques for adapting LLMs to specific cybersecurity domains, such as fine-tuning, transfer learning, and domain-specific pre-training.
arXiv Detail & Related papers (2024-05-08T02:09:17Z)
Threats, Attacks, and Defenses in Machine Unlearning: A Survey [14.03428437751312]
Machine Unlearning (MU) has recently gained considerable attention due to its potential to achieve Safe AI. This survey aims to fill the gap between the extensive number of studies on threats, attacks, and defenses in machine unlearning.
arXiv Detail & Related papers (2024-03-20T15:40:18Z)
XFedHunter: An Explainable Federated Learning Framework for Advanced Persistent Threat Detection in SDN [0.0]
This work proposes XFedHunter, an explainable federated learning framework for APT detection in Software-Defined Networking (SDN) In XFedHunter, Graph Neural Network (GNN) and Deep Learning model are utilized to reveal the malicious events effectively. The experimental results on NF-ToN-IoT and DARPA TCE3 datasets indicate that our framework can enhance the trust and accountability of ML-based systems.
arXiv Detail & Related papers (2023-09-15T15:44:09Z)
Robustness Evaluation of Deep Unsupervised Learning Algorithms for Intrusion Detection Systems [0.0]
This paper evaluates the robustness of six recent deep learning algorithms for intrusion detection on contaminated data. Our experiments suggest that the state-of-the-art algorithms used in this study are sensitive to data contamination and reveal the importance of self-defense against data perturbation.
arXiv Detail & Related papers (2022-06-25T02:28:39Z)
Distributed Machine Learning and the Semblance of Trust [66.1227776348216]
Federated Learning (FL) allows the data owner to maintain data governance and perform model training locally without having to share their data. FL and related techniques are often described as privacy-preserving. We explain why this term is not appropriate and outline the risks associated with over-reliance on protocols that were not designed with formal definitions of privacy in mind.
arXiv Detail & Related papers (2021-12-21T08:44:05Z)
RoFL: Attestable Robustness for Secure Federated Learning [59.63865074749391]
Federated Learning allows a large number of clients to train a joint model without the need to share their private data. To ensure the confidentiality of the client updates, Federated Learning systems employ secure aggregation. We present RoFL, a secure Federated Learning system that improves robustness against malicious clients.
arXiv Detail & Related papers (2021-07-07T15:42:49Z)
From Distributed Machine Learning to Federated Learning: A Survey [49.7569746460225]
Federated learning emerges as an efficient approach to exploit distributed data and computing resources. We propose a functional architecture of federated learning systems and a taxonomy of related techniques. We present the distributed training, data communication, and security of FL systems.
arXiv Detail & Related papers (2021-04-29T14:15:11Z)
Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses [150.64470864162556]
This work systematically categorizes and discusses a wide range of dataset vulnerabilities and exploits. In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop their unified taxonomy.
arXiv Detail & Related papers (2020-12-18T22:38:47Z)
Confidential Machine Learning on Untrusted Platforms: A Survey [10.45742327204133]
We will focus on the cryptographic approaches for confidential machine learning (CML) We will also cover other directions such as perturbation-based approaches and CML in the hardware-assisted confidential computing environment. The discussion will take a holistic way to consider a rich context of the related threat models, security assumptions, attacks, design philosophies, and associated trade-offs amongst data utility, cost, and confidentiality.
arXiv Detail & Related papers (2020-12-15T08:57:02Z)
Dos and Don'ts of Machine Learning in Computer Security [74.1816306998445]
Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance. We identify common pitfalls in the design, implementation, and evaluation of learning-based security systems. We propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
arXiv Detail & Related papers (2020-10-19T13:09:31Z)
Adversarial Machine Learning Attacks and Defense Methods in the Cyber Security Domain [58.30296637276011]
This paper summarizes the latest research on adversarial attacks against security solutions based on machine learning techniques. It is the first to discuss the unique challenges of implementing end-to-end adversarial attacks in the cyber security domain.
arXiv Detail & Related papers (2020-07-05T18:22:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.