Student Assessment in Cybersecurity Training Automated by Pattern Mining
and Clustering
- URL: http://arxiv.org/abs/2307.10260v1
- Date: Thu, 13 Jul 2023 18:52:58 GMT
- Title: Student Assessment in Cybersecurity Training Automated by Pattern Mining
and Clustering
- Authors: Valdemar Švábenský, Jan Vykopal, Pavel Čeleda, Kristián Tkáčik, Daniel Popovič
- Abstract summary: This paper explores a dataset from 18 cybersecurity training sessions using data mining and machine learning techniques.
We employed pattern mining and clustering to analyze 8834 commands collected from 113 trainees.
Our results show that data mining methods are suitable for analyzing cybersecurity training data.
- Score: 0.5249805590164902
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hands-on cybersecurity training allows students and professionals to practice
various tools and improve their technical skills. The training occurs in an
interactive learning environment that enables completing sophisticated tasks in
full-fledged operating systems, networks, and applications. During the
training, the learning environment allows collecting data about trainees'
interactions with the environment, such as their usage of command-line tools.
These data contain patterns indicative of trainees' learning processes, and
revealing them makes it possible to assess the trainees and provide feedback to help them
learn. However, automated analysis of these data is challenging. The training
tasks feature complex problem-solving, and many different solution approaches
are possible. Moreover, the trainees generate vast amounts of interaction data.
This paper explores a dataset from 18 cybersecurity training sessions using
data mining and machine learning techniques. We employed pattern mining and
clustering to analyze 8834 commands collected from 113 trainees, revealing
their typical behavior, mistakes, solution strategies, and difficult training
stages. Pattern mining proved suitable for capturing timing information and tool
usage frequency. Clustering underlined that many trainees often face the same
issues, which can be addressed by targeted scaffolding. Our results show that
data mining methods are suitable for analyzing cybersecurity training data.
Educational researchers and practitioners can apply these methods in their
contexts to assess trainees, support them, and improve the training design.
Artifacts associated with this research are publicly available.
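As a rough illustration of the kind of pattern mining the abstract describes, the sketch below counts which consecutive tool sequences are shared by a given fraction of trainees. It is a simplified stand-in, not the authors' method: the function name `frequent_tool_sequences`, the `min_support` threshold, and the sample command logs are all hypothetical and invented for this example.

```python
from collections import Counter


def frequent_tool_sequences(logs, length=2, min_support=0.5):
    """Return tool n-grams used by at least `min_support` of the trainees.

    `logs` maps a trainee identifier to that trainee's ordered list of
    command lines. Only the first token of each command (the tool name)
    is considered; arguments are ignored.
    """
    support = Counter()
    for commands in logs.values():
        tools = [c.split()[0] for c in commands if c.strip()]
        # Count each n-gram once per trainee, so support is a fraction of trainees.
        ngrams = {
            tuple(tools[i:i + length])
            for i in range(len(tools) - length + 1)
        }
        support.update(ngrams)
    total = len(logs)
    return {
        seq: count / total
        for seq, count in support.items()
        if count / total >= min_support
    }


# Hypothetical command logs, invented for illustration only.
logs = {
    "trainee_a": ["nmap -sV 10.0.0.5", "ssh admin@10.0.0.5", "ls -la"],
    "trainee_b": ["nmap 10.0.0.5", "ssh root@10.0.0.5", "cat flag.txt"],
    "trainee_c": ["ping 10.0.0.5", "nmap -p 22 10.0.0.5", "ssh admin@10.0.0.5"],
}

patterns = frequent_tool_sequences(logs, length=2, min_support=0.6)
print(patterns)
```

In this toy data, only the sequence `nmap` followed by `ssh` is shared by all three trainees. Counting each sequence once per trainee keeps a single trainee who repeats a command many times from inflating the support value.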
Related papers
- Automatic Identification and Visualization of Group Training Activities Using Wearable Data [7.130450173185638]
Human Activity Recognition (HAR) identifies daily activities from time-series data collected by wearable devices like smartwatches.
This paper presents a comprehensive framework for imputing, analyzing, and identifying activities from wearable data.
Our approach is based on data collected from 135 soldiers wearing Garmin 55 smartwatches over six months.
arXiv Detail & Related papers (2024-10-07T19:35:15Z)
- Detecting Unsuccessful Students in Cybersecurity Exercises in Two Different Learning Environments [0.37729165787434493]
This paper develops automated tools to predict when a student is having difficulty.
In a potential application, such models can aid instructors in detecting struggling students and providing targeted help.
arXiv Detail & Related papers (2024-08-16T04:57:54Z)
- Deep Internal Learning: Deep Learning from a Single Input [88.59966585422914]
In many cases there is value in training a network just from the input at hand.
This is particularly relevant in many signal and image processing problems where training data is scarce and diversity is large.
This survey paper aims at covering deep internal-learning techniques that have been proposed in the past few years for these two important directions.
arXiv Detail & Related papers (2023-12-12T16:48:53Z)
- Benchmarking Offline Reinforcement Learning on Real-Robot Hardware [35.29390454207064]
Dexterous manipulation in particular remains an open problem in its general form.
We propose a benchmark including a large collection of data for offline learning from a dexterous manipulation platform on two tasks.
We evaluate prominent open-sourced offline reinforcement learning algorithms on the datasets and provide a reproducible experimental setup for offline reinforcement learning on real systems.
arXiv Detail & Related papers (2023-07-28T17:29:49Z)
- Applications of Educational Data Mining and Learning Analytics on Data From Cybersecurity Training [0.5735035463793008]
This paper surveys publications that enhance cybersecurity education by leveraging trainee-generated data from interactive learning environments.
We identified and examined 3021 papers, ultimately selecting 35 articles for a detailed review.
Our contribution is a systematic literature review of relevant papers and their categorization according to the collected data, analysis methods, and application contexts.
arXiv Detail & Related papers (2023-07-13T19:05:17Z)
- Toolset for Collecting Shell Commands and Its Application in Hands-on Cybersecurity Training [0.5735035463793008]
We share the design and implementation of an open-source toolset for logging commands that students execute on Linux machines.
Compared to basic solutions, such as shell history files, the toolset's added value is threefold.
Data are instantly forwarded to central storage in a unified, semi-structured format.
arXiv Detail & Related papers (2021-12-21T11:45:13Z)
- Understanding the World Through Action [91.3755431537592]
I will argue that a general, principled, and powerful framework for utilizing unlabeled data can be derived from reinforcement learning.
I will discuss how such a procedure is more closely aligned with potential downstream tasks.
arXiv Detail & Related papers (2021-10-24T22:33:52Z)
- Motivating Learners in Multi-Orchestrator Mobile Edge Learning: A Stackelberg Game Approach [54.28419430315478]
Mobile Edge Learning enables distributed training of Machine Learning models over heterogeneous edge devices.
In MEL, the training performance deteriorates without the availability of sufficient training data or computing resources.
We propose an incentive mechanism, where we formulate the orchestrators-learners interactions as a 2-round Stackelberg game.
arXiv Detail & Related papers (2021-09-25T17:27:48Z)
- What Matters in Learning from Offline Human Demonstrations for Robot Manipulation [64.43440450794495]
We conduct an extensive study of six offline learning algorithms for robot manipulation.
Our study analyzes the most critical challenges when learning from offline human data.
We highlight opportunities for learning from human datasets.
arXiv Detail & Related papers (2021-08-06T20:48:30Z)
- Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills [93.12417203541948]
We propose the objective of learning a functional understanding of the environment by learning to reach any goal state in a given dataset.
We find that our method can operate on high-dimensional camera images and learn a variety of skills on real robots that generalize to previously unseen scenes and objects.
arXiv Detail & Related papers (2021-04-15T20:10:11Z)
- Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses [150.64470864162556]
This work systematically categorizes and discusses a wide range of dataset vulnerabilities and exploits.
In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop their unified taxonomy.
arXiv Detail & Related papers (2020-12-18T22:38:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.