EdgeAIGuard: Agentic LLMs for Minor Protection in Digital Spaces
- URL: http://arxiv.org/abs/2503.00092v1
- Date: Fri, 28 Feb 2025 16:29:34 GMT
- Title: EdgeAIGuard: Agentic LLMs for Minor Protection in Digital Spaces
- Authors: Ghulam Mujtaba, Sunder Ali Khowaja, Kapal Dev
- Abstract summary: We propose the EdgeAIGuard content moderation approach to protect minors from online grooming and various forms of digital exploitation. The proposed method comprises a multi-agent architecture deployed strategically at the network edge to enable rapid, low-latency detection and prevention of harmful content targeting minors.
- Score: 13.180252900900854
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Social media has become integral to minors' daily lives and is used for various purposes, such as making friends, exploring shared interests, and engaging in educational activities. However, the increase in screen time has also led to heightened challenges, including cyberbullying, online grooming, and exploitations posed by malicious actors. Traditional content moderation techniques have proven ineffective against exploiters' evolving tactics. To address these growing challenges, we propose the EdgeAIGuard content moderation approach that is designed to protect minors from online grooming and various forms of digital exploitation. The proposed method comprises a multi-agent architecture deployed strategically at the network edge to enable rapid detection with low latency and prevent harmful content targeting minors. The experimental results show the proposed method is significantly more effective than the existing approaches.
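The abstract describes a multi-agent architecture in which cooperating agents detect and block harmful content at the network edge. The paper does not publish its internal interfaces, so the following is only a minimal illustrative sketch: the agent roles, keyword heuristics, and thresholds are all hypothetical stand-ins for the paper's LLM-based components.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    risk: float   # 0.0 (benign) .. 1.0 (harmful)
    reason: str

class DetectionAgent:
    """Flags messages whose wording matches known grooming patterns (toy list)."""
    PATTERNS = ("don't tell your parents", "send a photo", "our secret")

    def assess(self, message: str) -> Verdict:
        hits = sum(p in message.lower() for p in self.PATTERNS)
        return Verdict(min(1.0, hits / 2), f"{hits} pattern hit(s)")

class ContextAgent:
    """Escalates risk when the conversation history already looks suspicious."""
    def assess(self, history: list[str], base: Verdict) -> Verdict:
        prior = sum("secret" in m.lower() for m in history)
        return Verdict(min(1.0, base.risk + 0.2 * prior), base.reason)

class EnforcementAgent:
    """Blocks or allows the message based on the combined risk score."""
    def act(self, verdict: Verdict, threshold: float = 0.5) -> str:
        return "block" if verdict.risk >= threshold else "allow"

def moderate(message: str, history: list[str]) -> str:
    # Pipeline: detection -> context escalation -> enforcement decision.
    v = DetectionAgent().assess(message)
    v = ContextAgent().assess(history, v)
    return EnforcementAgent().act(v)
```

In a real edge deployment the pattern matching would be replaced by the LLM agents the paper proposes; the point of the sketch is only the staged hand-off between detection, context analysis, and enforcement that keeps the decision local and low-latency.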
Related papers
- Ethical AI for Young Digital Citizens: A Call to Action on Privacy Governance [0.0]
The rapid expansion of Artificial Intelligence in digital platforms used by youth has created significant challenges related to privacy, autonomy, and data protection.
While AI-driven personalization offers enhanced user experiences, it often operates without clear ethical boundaries, leaving young users vulnerable to data exploitation and algorithmic biases.
This paper presents a call to action for ethical AI governance, advocating for a structured framework that ensures youth-centred privacy protections, transparent data practices, and regulatory oversight.
arXiv Detail & Related papers (2025-03-15T01:35:56Z) - ShieldLearner: A New Paradigm for Jailbreak Attack Defense in LLMs [4.534938642552179]
ShieldLearner is a novel paradigm that mimics human learning in defense. Through trial and error, it autonomously distills attack signatures into a Pattern Atlas. Adaptive Adversarial Augmentation generates adversarial variations of successfully defended prompts.
arXiv Detail & Related papers (2025-02-16T18:47:41Z) - Antelope: Potent and Concealed Jailbreak Attack Strategy [7.970002819722513]
Antelope is a more robust and covert jailbreak attack strategy designed to expose security vulnerabilities inherent in generative models. We successfully exploit the transferability of model-based attacks to penetrate online black-box services.
arXiv Detail & Related papers (2024-12-11T07:22:51Z) - Edge-Only Universal Adversarial Attacks in Distributed Learning [49.546479320670464]
In this work, we explore the feasibility of generating universal adversarial attacks when an attacker has access to the edge part of the model only.
Our approach shows that adversaries can induce effective mispredictions in the unknown cloud part by leveraging key features on the edge side.
Our results on ImageNet demonstrate strong attack transferability to the unknown cloud part.
arXiv Detail & Related papers (2024-11-15T11:06:24Z) - Model Inversion Attacks: A Survey of Approaches and Countermeasures [59.986922963781]
Recently, a new type of privacy attack, model inversion attacks (MIAs), has emerged, aiming to extract sensitive features of the private data used for training.
Despite the significance, there is a lack of systematic studies that provide a comprehensive overview and deeper insights into MIAs.
This survey aims to summarize up-to-date MIA methods in both attacks and defenses.
arXiv Detail & Related papers (2024-11-15T08:09:28Z) - ID-Guard: A Universal Framework for Combating Facial Manipulation via Breaking Identification [60.73617868629575]
The misuse of deep learning-based facial manipulation poses a significant threat to civil rights.
To prevent this fraud at its source, proactive defense has been proposed to disrupt the manipulation process.
This paper proposes a universal framework for combating facial manipulation, termed ID-Guard.
arXiv Detail & Related papers (2024-09-20T09:30:08Z) - Enhanced Online Grooming Detection Employing Context Determination and Message-Level Analysis [2.424910201171407]
Online grooming (OG) is a prevalent online threat faced predominantly by children, with groomers using deceptive methods to prey on the vulnerability of children on social media and messaging platforms.
Existing solutions focus on the signature analysis of child abuse media, which does not effectively address real-time OG detection.
This paper proposes that OG attacks are complex, requiring the identification of specific communication patterns between adults and children.
arXiv Detail & Related papers (2024-09-12T11:37:34Z) - SecureReg: Combining NLP and MLP for Enhanced Detection of Malicious Domain Name Registrations [0.0]
This paper introduces a cutting-edge approach for identifying suspicious domains at the onset of the registration process.
The proposed system analyzes semantic and numerical attributes by leveraging a novel combination of Natural Language Processing (NLP) and Multilayer Perceptron (MLP) techniques.
With an F1 score of 84.86% and an accuracy of 84.95% on the SecureReg dataset, it effectively detects malicious domain registrations.
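The entry describes classifying domains from semantic and numerical attributes. As a hedged sketch only: the feature names below (length, digit ratio, character entropy, hyphen count) are common choices for this task and are assumptions, not the actual SecureReg feature set, and the classifier itself is omitted.

```python
import math
from collections import Counter

def domain_features(domain: str) -> dict[str, float]:
    """Extract simple numerical attributes from a domain name's label.

    These would feed a downstream classifier (e.g. an MLP); the exact
    SecureReg features and model are not reproduced here.
    """
    name = domain.split(".")[0].lower()
    counts = Counter(name)
    # Shannon entropy of the character distribution: high for
    # random-looking, algorithmically generated labels.
    entropy = -sum((c / len(name)) * math.log2(c / len(name))
                   for c in counts.values())
    return {
        "length": float(len(name)),
        "digit_ratio": sum(ch.isdigit() for ch in name) / len(name),
        "entropy": entropy,
        "hyphens": float(name.count("-")),
    }
```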
arXiv Detail & Related papers (2024-01-06T11:43:57Z) - A Robust Adversary Detection-Deactivation Method for Metaverse-oriented Collaborative Deep Learning [13.131323206843733]
This paper proposes an adversary detection-deactivation method, which can limit and isolate the access of potential malicious participants.
A detailed protection analysis has been conducted on a Multiview CDL case, and results show that the protocol can effectively prevent harmful access through manner analysis.
arXiv Detail & Related papers (2023-10-21T06:45:18Z) - Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of content moderation evasion.
arXiv Detail & Related papers (2022-12-27T16:08:49Z) - Initiative Defense against Facial Manipulation [82.96864888025797]
We propose a novel framework of initiative defense to degrade the performance of facial manipulation models controlled by malicious users.
We first imitate the target manipulation model with a surrogate model, and then devise a poison perturbation generator to obtain the desired venom.
arXiv Detail & Related papers (2021-12-19T09:42:28Z) - Inspect, Understand, Overcome: A Survey of Practical Methods for AI Safety [54.478842696269304]
The use of deep neural networks (DNNs) in safety-critical applications is challenging due to numerous model-inherent shortcomings.
In recent years, a zoo of state-of-the-art techniques aiming to address these safety concerns has emerged.
Our paper addresses both machine learning experts and safety engineers.
arXiv Detail & Related papers (2021-04-29T09:54:54Z) - WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos [124.72839555467944]
We propose a weakly supervised framework that can be trained using only video-class labels.
We show that our method largely outperforms weakly-supervised baselines.
When strongly supervised, our method obtains the state-of-the-art results in the tasks of both online per-frame action recognition and online detection of action start.
arXiv Detail & Related papers (2020-06-05T23:08:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.