Related papers: Student Assessment in Cybersecurity Training Automated by Pattern Mining and Clustering

Student Assessment in Cybersecurity Training Automated by Pattern Mining and Clustering

URL: http://arxiv.org/abs/2307.10260v1
Date: Thu, 13 Jul 2023 18:52:58 GMT
Title: Student Assessment in Cybersecurity Training Automated by Pattern Mining and Clustering
Authors: Valdemar \v{S}v\'abensk\'y, Jan Vykopal, Pavel \v{C}eleda, Kristi\'an Tk\'a\v{c}ik, Daniel Popovi\v{c}
Abstract summary: This paper explores a dataset from 18 cybersecurity training sessions using data mining and machine learning techniques. We employed pattern mining and clustering to analyze 8834 commands collected from 113 trainees. Our results show that data mining methods are suitable for analyzing cybersecurity training data.
Score: 0.5249805590164902
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Hands-on cybersecurity training allows students and professionals to practice various tools and improve their technical skills. The training occurs in an interactive learning environment that enables completing sophisticated tasks in full-fledged operating systems, networks, and applications. During the training, the learning environment allows collecting data about trainees' interactions with the environment, such as their usage of command-line tools. These data contain patterns indicative of trainees' learning processes, and revealing them allows to assess the trainees and provide feedback to help them learn. However, automated analysis of these data is challenging. The training tasks feature complex problem-solving, and many different solution approaches are possible. Moreover, the trainees generate vast amounts of interaction data. This paper explores a dataset from 18 cybersecurity training sessions using data mining and machine learning techniques. We employed pattern mining and clustering to analyze 8834 commands collected from 113 trainees, revealing their typical behavior, mistakes, solution strategies, and difficult training stages. Pattern mining proved suitable in capturing timing information and tool usage frequency. Clustering underlined that many trainees often face the same issues, which can be addressed by targeted scaffolding. Our results show that data mining methods are suitable for analyzing cybersecurity training data. Educational researchers and practitioners can apply these methods in their contexts to assess trainees, support them, and improve the training design. Artifacts associated with this research are publicly available.

Related papers

Dynamic Skill Adaptation for Large Language Models [78.31322532135272]
We present Dynamic Skill Adaptation (DSA), an adaptive and dynamic framework to adapt novel and complex skills to Large Language Models (LLMs) For every skill, we utilize LLMs to generate both textbook-like data which contains detailed descriptions of skills for pre-training and exercise-like data which targets at explicitly utilizing the skills to solve problems for instruction-tuning. Experiments on large language models such as LLAMA and Mistral demonstrate the effectiveness of our proposed methods in adapting math reasoning skills and social study skills.
arXiv Detail & Related papers (2024-12-26T22:04:23Z)
Automatic Identification and Visualization of Group Training Activities Using Wearable Data [7.130450173185638]
Human Activity Recognition (HAR) identifies daily activities from time-series data collected by wearable devices like smartwatches. This paper presents a comprehensive framework for imputing, analyzing, and identifying activities from wearable data. Our approach is based on data collected from 135 soldiers wearing Garmin 55 smartwatches over six months.
arXiv Detail & Related papers (2024-10-07T19:35:15Z)
Detecting Unsuccessful Students in Cybersecurity Exercises in Two Different Learning Environments [0.37729165787434493]
This paper develops automated tools to predict when a student is having difficulty. In a potential application, such models can aid instructors in detecting struggling students and providing targeted help.
arXiv Detail & Related papers (2024-08-16T04:57:54Z)
Deep Internal Learning: Deep Learning from a Single Input [88.59966585422914]
In many cases there is value in training a network just from the input at hand. This is particularly relevant in many signal and image processing problems where training data is scarce and diversity is large. This survey paper aims at covering deep internal-learning techniques that have been proposed in the past few years for these two important directions.
arXiv Detail & Related papers (2023-12-12T16:48:53Z)
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware [35.29390454207064]
Dexterous manipulation in particular remains an open problem in its general form. We propose a benchmark including a large collection of data for offline learning from a dexterous manipulation platform on two tasks. We evaluate prominent open-sourced offline reinforcement learning algorithms on the datasets and provide a reproducible experimental setup for offline reinforcement learning on real systems.
arXiv Detail & Related papers (2023-07-28T17:29:49Z)
Applications of Educational Data Mining and Learning Analytics on Data From Cybersecurity Training [0.5735035463793008]
This paper surveys publications that enhance cybersecurity education by leveraging trainee-generated data from interactive learning environments. We identified and examined 3021 papers, ultimately selecting 35 articles for a detailed review. Our contribution is a systematic literature review of relevant papers and their categorization according to the collected data, analysis methods, and application contexts.
arXiv Detail & Related papers (2023-07-13T19:05:17Z)
Offline Robot Reinforcement Learning with Uncertainty-Guided Human Expert Sampling [11.751910133386254]
Recent advances in batch (offline) reinforcement learning have shown promising results in learning from available offline data. We propose a novel approach that uses uncertainty estimation to trigger the injection of human demonstration data. Our experiments show that this approach is more sample efficient when compared to a naive way of combining expert data with data collected from a sub-optimal agent.
arXiv Detail & Related papers (2022-12-16T01:41:59Z)
Toolset for Collecting Shell Commands and Its Application in Hands-on Cybersecurity Training [0.5735035463793008]
We share the design and implementation of an open-source toolset for logging commands that students execute on Linux machines. Compared to basic solutions, such as shell history files, the toolset's added value is threefold. Data are instantly forwarded to central storage in a unified, semi-structured format.
arXiv Detail & Related papers (2021-12-21T11:45:13Z)
Understanding the World Through Action [91.3755431537592]
I will argue that a general, principled, and powerful framework for utilizing unlabeled data can be derived from reinforcement learning. I will discuss how such a procedure is more closely aligned with potential downstream tasks.
arXiv Detail & Related papers (2021-10-24T22:33:52Z)
Motivating Learners in Multi-Orchestrator Mobile Edge Learning: A Stackelberg Game Approach [54.28419430315478]
Mobile Edge Learning enables distributed training of Machine Learning models over heterogeneous edge devices. In MEL, the training performance deteriorates without the availability of sufficient training data or computing resources. We propose an incentive mechanism, where we formulate the orchestrators-learners interactions as a 2-round Stackelberg game.
arXiv Detail & Related papers (2021-09-25T17:27:48Z)
What Matters in Learning from Offline Human Demonstrations for Robot Manipulation [64.43440450794495]
We conduct an extensive study of six offline learning algorithms for robot manipulation. Our study analyzes the most critical challenges when learning from offline human data. We highlight opportunities for learning from human datasets.
arXiv Detail & Related papers (2021-08-06T20:48:30Z)
Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills [93.12417203541948]
We propose the objective of learning a functional understanding of the environment by learning to reach any goal state in a given dataset. We find that our method can operate on high-dimensional camera images and learn a variety of skills on real robots that generalize to previously unseen scenes and objects.
arXiv Detail & Related papers (2021-04-15T20:10:11Z)
Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses [150.64470864162556]
This work systematically categorizes and discusses a wide range of dataset vulnerabilities and exploits. In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop their unified taxonomy.
arXiv Detail & Related papers (2020-12-18T22:38:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.