Related papers: Integrating Generative AI into Cybersecurity Education: A Study of OCR and Multimodal LLM-assisted Instruction

Integrating Generative AI into Cybersecurity Education: A Study of OCR and Multimodal LLM-assisted Instruction

URL: http://arxiv.org/abs/2509.02998v1
Date: Wed, 03 Sep 2025 04:16:50 GMT
Title: Integrating Generative AI into Cybersecurity Education: A Study of OCR and Multimodal LLM-assisted Instruction
Authors: Karan Patel, Yu-Zheng Lin, Gaurangi Raul, Bono Po-Jen Shih, Matthew W. Redondo, Banafsheh Saber Latibari, Jesus Pacheco, Soheil Salehi, Pratik Satam,
Abstract summary: This full paper describes an LLM-assisted instruction integrated with a virtual cybersecurity lab platform.<n>With rising emphasis on robotics, automation, AI, and security, re-skilling and up-skilling are essential.<n>We present a generative AI instructional assistant integrated into a prior experiential learning platform.
Score: 0.5394948236100675
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: This full paper describes an LLM-assisted instruction integrated with a virtual cybersecurity lab platform. The digital transformation of Fourth Industrial Revolution (4IR) systems is reshaping workforce needs, widening skill gaps, especially among older workers. With rising emphasis on robotics, automation, AI, and security, re-skilling and up-skilling are essential. Generative AI can help build this workforce by acting as an instructional assistant to support skill acquisition during experiential learning. We present a generative AI instructional assistant integrated into a prior experiential learning platform. The assistant employs a zero-shot OCR-LLM pipeline within the legacy Cybersecurity Labs-as-a-Service (CLaaS) platform (2015). Text is extracted from slide images using Tesseract OCR, then simplified instructions are generated via a general-purpose LLM, enabling real-time instructional support with minimal infrastructure. The system was evaluated in a live university course where student feedback (n=42) averaged 7.83/10, indicating strong perceived usefulness. A comparative study with multimodal LLMs that directly interpret slide images showed higher performance on visually dense slides, but the OCR-LLM pipeline provided comparable pedagogical value on text-centric slides with much lower computational overhead and cost. This work demonstrates that a lightweight, easily integrable pipeline can effectively extend legacy platforms with modern generative AI, offering scalable enhancements for student comprehension in technical education.

Related papers

ViSS-R1: Self-Supervised Reinforcement Video Reasoning [84.1180294023835]
We introduce a novel self-supervised reinforcement learning GRPO algorithm (Pretext-GRPO) within the standard R1 pipeline.<n>We also propose the ViSS-R1 framework, which streamlines and integrates pretext-task-based self-supervised learning directly into the MLLM's R1 post-training paradigm.
arXiv Detail & Related papers (2025-11-17T07:00:42Z)
Building Specialized Software-Assistant ChatBot with Graph-Based Retrieval-Augmented Generation [0.815557531820863]
We introduce a Graph-based Retrieval-Augmented Generation framework that automatically converts enterprise web applications into state-action knowledge graphs.<n>The framework was co-developed with the AI enterprise RAKAM, in collaboration with Lemon Learning.
arXiv Detail & Related papers (2025-11-07T14:56:45Z)
Ultra Strong Machine Learning: Teaching Humans Active Learning Strategies via Automated AI Explanations [2.530461847957792]
We present LENS, a neuro-symbolic method that combines symbolic program synthesis with large language models (LLMs) to automate the explanation of machine-learned logic programs in natural language.<n>LENS addresses a key limitation of prior USML approaches by replacing hand-crafted explanation templates with scalable automated generation.<n>Our results show no significant human performance improvements, suggesting that comprehensive LLM responses may overwhelm users for simpler problems rather than providing learning support.
arXiv Detail & Related papers (2025-08-31T19:04:31Z)
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions [32.83715417294052]
UniVLA is a new framework for learning cross-embodiment vision-language-action (VLA) policies.<n>We derive task-centric action representations from videos with a latent action model.<n>We obtain state-of-the-art results across multiple manipulation and navigation benchmarks, as well as real-robot deployments.
arXiv Detail & Related papers (2025-05-09T15:11:13Z)
Inclusive STEAM Education: A Framework for Teaching Cod-2 ing and Robotics to Students with Visually Impairment Using 3 Advanced Computer Vision [0.0]
This paper presents a framework that leverages pre-constructed robots and algorithms, such as maze-solving techniques, within an accessible learning environment.<n>The proposed system employs Contrastive Language-Image Pre-training (CLIP) to process global camera-captured maze layouts.<n>Students issue verbal commands, which are refined through CLIP, while robot-mounted stereo cameras provide real-time data processed via Simultaneous SLAM.
arXiv Detail & Related papers (2025-03-06T17:15:12Z)
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy [56.505551117094534]
We introduce LLaRA: Large Language and Robotics Assistant, a framework that formulates robot action policy as visuo-textual conversations.<n>First, we present an automated pipeline to generate conversation-style instruction tuning data for robots from existing behavior cloning datasets.<n>We show that a VLM finetuned with a limited amount of such datasets can produce meaningful action decisions for robotic control.
arXiv Detail & Related papers (2024-06-28T17:59:12Z)
ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning [74.58666091522198]
We present a framework for intuitive robot programming by non-experts. We leverage natural language prompts and contextual information from the Robot Operating System (ROS) Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements to the system through a chat interface.
arXiv Detail & Related papers (2024-06-28T08:28:38Z)
ST-LLM: Large Language Models Are Effective Temporal Learners [58.79456373423189]
Large Language Models (LLMs) have showcased impressive capabilities in text comprehension and generation. How to effectively encode and understand videos in video-based dialogue systems remains to be solved. We propose ST-LLM, an effective video-LLM baseline with spatial-temporal sequence modeling inside LLM.
arXiv Detail & Related papers (2024-03-30T10:11:26Z)
InCoRo: In-Context Learning for Robotics Control with Feedback Loops [4.702566749969133]
InCoRo is a system that uses a classical robotic feedback loop composed of an LLM controller, a scene understanding unit, and a robot. We highlight the generalization capabilities of our system and show that InCoRo surpasses the prior art in terms of the success rate. This research paves the way towards building reliable, efficient, intelligent autonomous systems that adapt to dynamic environments.
arXiv Detail & Related papers (2024-02-07T19:01:11Z)
Unleashing the potential of prompt engineering for large language models [1.6006550105523192]
Review explores the pivotal role of prompt engineering in unleashing the capabilities of Large Language Models (LLMs)<n>Examines both foundational and advanced methodologies of prompt engineering, including techniques such as self-consistency, chain-of-thought, and generated knowledge.<n>Discusses the aspect of AI security, particularly adversarial attacks that exploit vulnerabilities in prompt engineering.
arXiv Detail & Related papers (2023-10-23T09:15:18Z)
Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning [49.92517970237088]
We tackle the problem of training a robot to understand multimodal prompts. This type of task poses a major challenge to robots' capability to understand the interconnection and complementarity between vision and language signals. We introduce an effective framework that learns a policy to perform robot manipulation with multimodal prompts.
arXiv Detail & Related papers (2023-10-14T22:24:58Z)
Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials [97.95400776235736]
We present a framework based on offline RL that attempts to effectively learn new tasks. It combines pre-training on existing robotic datasets with rapid fine-tuning on a new task, with as few as 10 demonstrations. To our knowledge, PTR is the first RL method that succeeds at learning new tasks in a new domain on a real WidowX robot.
arXiv Detail & Related papers (2022-10-11T06:30:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.