Multi-Level Fine-Tuning, Data Augmentation, and Few-Shot Learning for
Specialized Cyber Threat Intelligence
- URL: http://arxiv.org/abs/2207.11076v1
- Date: Fri, 22 Jul 2022 13:34:28 GMT
- Title: Multi-Level Fine-Tuning, Data Augmentation, and Few-Shot Learning for
Specialized Cyber Threat Intelligence
- Authors: Markus Bayer, Tobias Frey, Christian Reuter
- Abstract summary: We propose a system that trains a new classifier for each new incident.
With standard training methods, this would require a large amount of labelled data.
We evaluate our approach using a novel dataset derived from the Microsoft Exchange Server data breach of 2021.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gathering cyber threat intelligence from open sources is becoming
increasingly important for maintaining and achieving a high level of security
as systems become larger and more complex. However, these open sources are
often subject to information overload. It is therefore useful to apply machine
learning models that condense the amount of information to what is necessary.
Yet, previous studies and applications have shown that existing classifiers are
not able to extract specific information about emerging cybersecurity events
due to their low generalization ability. Therefore, we propose a system to
overcome this problem by training a new classifier for each new incident. Since
this requires a lot of labelled data using standard training methods, we
combine three different low-data regime techniques - transfer learning, data
augmentation, and few-shot learning - to train a high-quality classifier from
very few labelled instances. We evaluated our approach using a novel dataset
derived from the Microsoft Exchange Server data breach of 2021 which was
labelled by three experts. Our findings reveal an increase in F1 score of more
than 21 points compared to standard training methods and more than 18 points
compared to a state-of-the-art method in few-shot learning. Furthermore, a
classifier trained with this method on only 32 instances scores less than 5 F1
points below a classifier trained on 1800 instances.
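To make the combination of techniques concrete, here is a minimal sketch of such a low-data pipeline in PyTorch. The stand-in encoder, the token-drop augmentation, and the binary relevant/irrelevant head are illustrative assumptions, not the authors' implementation (which builds on pre-trained language models).

import torch
import torch.nn as nn

def augment(ids: torch.Tensor, p_drop: float = 0.1) -> torch.Tensor:
    # Toy data augmentation: randomly drop tokens from the sequence.
    keep = torch.rand(ids.shape) > p_drop
    return ids[keep] if keep.any() else ids

class IncidentClassifier(nn.Module):
    # Stage 1 (multi-level fine-tuning) is assumed to have produced the
    # encoder; a frozen bag-of-embeddings stands in for it here.
    def __init__(self, vocab_size: int = 1000, dim: int = 64):
        super().__init__()
        self.encoder = nn.EmbeddingBag(vocab_size, dim)  # placeholder encoder
        self.encoder.requires_grad_(False)               # transfer: keep frozen
        self.head = nn.Linear(dim, 2)                    # relevant / irrelevant

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(token_ids))

# Stages 2+3: few-shot training on augmented copies of the few labelled texts.
model = IncidentClassifier()
opt = torch.optim.AdamW(model.head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
few_shot = [(torch.randint(0, 1000, (16,)), torch.tensor(1))]  # 1 of ~32 instances

for epoch in range(10):
    for ids, label in few_shot:
        opt.zero_grad()
        logits = model(augment(ids).unsqueeze(0))
        loss_fn(logits, label.unsqueeze(0)).backward()
        opt.step()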
Related papers
- Enhancing Visual Continual Learning with Language-Guided Supervision [76.38481740848434]
Continual learning aims to empower models to learn new tasks without forgetting previously acquired knowledge.
We argue that the scarce semantic information conveyed by one-hot labels hampers effective knowledge transfer across tasks.
Specifically, we use PLMs to generate semantic targets for each class, which are frozen and serve as supervision signals.
arXiv Detail & Related papers (2024-03-24T12:41:58Z) - Text generation for dataset augmentation in security classification
- Text generation for dataset augmentation in security classification tasks [55.70844429868403]
This study evaluates the application of natural language text generators to fill the data gap in multiple security-related text classification tasks.
We find substantial benefits for GPT-3 data augmentation strategies in situations with severe limitations on known positive-class samples.
arXiv Detail & Related papers (2023-10-22T22:25:14Z) - Few-shot Weakly-supervised Cybersecurity Anomaly Detection [1.179179628317559]
- Few-shot Weakly-supervised Cybersecurity Anomaly Detection [1.179179628317559]
We propose an enhancement to an existing few-shot weakly-supervised deep learning anomaly detection framework.
This framework incorporates data augmentation, representation learning and ordinal regression.
We then evaluate the implemented framework on three benchmark datasets (a toy sketch of the ordinal objective follows below).
arXiv Detail & Related papers (2023-04-15T04:37:54Z) - Enhancing Multiple Reliability Measures via Nuisance-extended
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data may instead stem from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z) - Mutual Information Learned Classifiers: an Information-theoretic
- Mutual Information Learned Classifiers: an Information-theoretic Viewpoint of Training Deep Learning Classification Systems [9.660129425150926]
We show that the existing cross entropy loss minimization problem essentially learns the label conditional entropy of the underlying data distribution.
We propose a mutual information learning framework where we train deep neural network classifiers via learning the mutual information between the label and the input.
arXiv Detail & Related papers (2022-09-21T01:06:30Z) - Deep invariant networks with differentiable augmentation layers [87.22033101185201]
- Deep invariant networks with differentiable augmentation layers [87.22033101185201]
Methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems.
We show that our approach is easier and faster to train than modern automatic data augmentation techniques.
arXiv Detail & Related papers (2022-02-04T14:12:31Z) - Long-tail Recognition via Compositional Knowledge Transfer [60.03764547406601]
- Long-tail Recognition via Compositional Knowledge Transfer [60.03764547406601]
We introduce a novel strategy for long-tail recognition that addresses the tail classes' few-shot problem.
Our objective is to transfer knowledge acquired from information-rich common classes to semantically similar, and yet data-hungry, rare classes.
Experiments show that our approach can achieve significant performance boosts on rare classes while maintaining robust common class performance.
arXiv Detail & Related papers (2021-12-13T15:48:59Z) - Subspace Regularizers for Few-Shot Class Incremental Learning [26.372024890126408]
- Subspace Regularizers for Few-Shot Class Incremental Learning [26.372024890126408]
We present a new family of subspace regularization schemes that encourage weight vectors for new classes to lie close to the subspace spanned by the weights of existing classes.
Our results show that simple geometric regularization of class representations offers an effective tool for continual learning.
arXiv Detail & Related papers (2021-10-13T22:19:53Z) - No Fear of Heterogeneity: Classifier Calibration for Federated Learning
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z) - Few-Shot Incremental Learning with Continually Evolved Classifiers [46.278573301326276]
- Few-Shot Incremental Learning with Continually Evolved Classifiers [46.278573301326276]
Few-shot class-incremental learning (FSCIL) aims to design machine learning algorithms that can continually learn new concepts from a few data points.
The difficulty lies in that limited data from new classes not only lead to significant overfitting issues but also exacerbate the notorious catastrophic forgetting problems.
We propose a Continually Evolved Classifier (CEC) that employs a graph model to propagate context information between classifiers for adaptation.
arXiv Detail & Related papers (2021-04-07T10:54:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented (including all information) and is not responsible for any consequences of its use.