Deep Learning based Frameworks for Handling Imbalance in DGA, Email, and
URL Data Analysis
- URL: http://arxiv.org/abs/2004.04812v2
- Date: Sat, 17 Oct 2020 08:12:19 GMT
- Title: Deep Learning based Frameworks for Handling Imbalance in DGA, Email, and
URL Data Analysis
- Authors: Simran K, Prathiksha Balakrishna, Vinayakumar Ravi, Soman KP
- Abstract summary: In this paper, we propose cost-sensitive deep learning based frameworks and evaluate their performance.
Various experiments were performed using cost-insensitive as well as cost-sensitive methods.
In all experiments, the cost-sensitive deep learning methods performed better than the cost-insensitive approaches.
- Score: 2.2901908285413413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning is a state-of-the-art method for many applications. The
main issue is that most real-time data is highly imbalanced in nature. To avoid
bias in training, a cost-sensitive approach can be used. In this paper, we
propose cost-sensitive deep learning based frameworks and evaluate their
performance on three different Cyber Security use cases: Domain Generation
Algorithm (DGA), Electronic mail (Email), and Uniform Resource Locator (URL)
analysis. Various experiments were performed using cost-insensitive as well as
cost-sensitive methods, with the parameters of both set through hyperparameter
tuning. In all experiments, the cost-sensitive deep learning methods performed
better than the cost-insensitive approaches. This is mainly because the
cost-sensitive approach assigns greater importance to the classes with very few
samples during training, which helps the model learn all classes more
effectively.
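As an illustrative sketch (not the paper's exact implementation), the core cost-sensitive idea can be realized as a weighted cross-entropy loss, where each class receives a weight inversely proportional to its frequency in the training data, so minority classes contribute more to the loss:

```python
import numpy as np

def class_weights(labels, n_classes):
    """Inverse-frequency class weights: rare classes get larger weights."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    # Normalized so that the weights average to 1 over the training samples.
    return counts.sum() / (n_classes * np.maximum(counts, 1))

def weighted_cross_entropy(probs, labels, weights):
    """Cost-sensitive cross-entropy: each sample's loss is scaled by its class weight."""
    eps = 1e-12
    per_sample = -np.log(probs[np.arange(len(labels)), labels] + eps)
    return float(np.mean(weights[labels] * per_sample))

# Imbalanced toy labels: class 0 dominates (6 vs. 2 vs. 1 samples).
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2])
w = class_weights(labels, 3)  # minority class 2 gets the largest weight
```

In a deep learning framework, the same effect is typically obtained by passing per-class weights to the loss function (e.g. the `weight` argument of a cross-entropy loss).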
Related papers
- Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning [70.22819290458581]
Reinforcement learning with human feedback (RLHF) is a widely adopted approach in current large language model pipelines.
Our approach introduces two key innovations: (1) on-policy query to avoid OOD and imbalance issues in seed data, and (2) active learning to select the most informative data for preference queries.
arXiv Detail & Related papers (2024-07-02T10:09:19Z)
- Provably Robust Cost-Sensitive Learning via Randomized Smoothing [21.698527267902158]
We investigate whether randomized smoothing, a scalable framework for robustness certification, can be leveraged to certify and train for cost-sensitive robustness.
We first illustrate how to adapt the standard certification algorithm of randomized smoothing to produce tight robustness certificates for any binary cost matrix.
We then develop a robust training method to promote certified cost-sensitive robustness while maintaining the model's overall accuracy.
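For context, the certificate from standard (cost-insensitive) randomized smoothing can be sketched as below; the paper above adapts this style of certificate to arbitrary binary cost matrices. The radius formula is the well-known Gaussian-smoothing result, not something specific to that paper:

```python
from statistics import NormalDist

def certified_radius(sigma, p_a, p_b):
    """Certified L2 radius for a classifier smoothed with Gaussian noise of
    scale sigma, given a lower bound p_a on the top-class probability and an
    upper bound p_b on the runner-up probability:
        R = (sigma / 2) * (Phi^{-1}(p_a) - Phi^{-1}(p_b))
    """
    phi_inv = NormalDist().inv_cdf  # inverse standard normal CDF
    return 0.5 * sigma * (phi_inv(p_a) - phi_inv(p_b))
```

When the top two class probabilities are indistinguishable (p_a = p_b), the certified radius collapses to zero, as expected.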
arXiv Detail & Related papers (2023-10-12T21:39:16Z)
- Efficient Methods for Non-stationary Online Learning [67.3300478545554]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $\mathcal{O}(\log T)$ to $1$.
Our technique hinges on the reduction mechanism developed in parameter-free online learning and requires non-trivial twists on non-stationary online methods.
arXiv Detail & Related papers (2023-09-16T07:30:12Z)
- Efficient Online Reinforcement Learning with Offline Data [78.92501185886569]
We show that we can simply apply existing off-policy methods to leverage offline data when learning online.
We extensively ablate these design choices, demonstrating the key factors that most affect performance.
We see that correct application of these simple recommendations can provide a $\mathbf{2.5\times}$ improvement over existing approaches.
arXiv Detail & Related papers (2023-02-06T17:30:22Z)
- Cost-Sensitive Stacking: an Empirical Evaluation [3.867363075280544]
Cost-sensitive learning adapts classification algorithms to account for differences in misclassification costs.
There is no consensus in the literature as to what cost-sensitive stacking is.
Our experiments, conducted on twelve datasets, show that for best performance, both levels of stacking require cost-sensitive classification decisions.
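As background (a textbook result rather than this paper's contribution), a cost-sensitive classification decision in the binary case shifts the decision threshold away from 0.5 according to the misclassification costs:

```python
def cost_sensitive_threshold(cost_fp, cost_fn):
    """Bayes-optimal decision threshold under asymmetric misclassification
    costs: predict the positive class when the estimated probability
    P(y=1|x) exceeds cost_fp / (cost_fp + cost_fn)."""
    return cost_fp / (cost_fp + cost_fn)

# A false negative that is 9x as costly as a false positive lowers the
# threshold, making the classifier flag positives much more readily.
t = cost_sensitive_threshold(cost_fp=1.0, cost_fn=9.0)
```

Applying such a threshold at both levels of a stacked ensemble is one way to make the whole stack cost-sensitive, in the spirit of the evaluation above.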
arXiv Detail & Related papers (2023-01-04T18:28:07Z) - Online Active Learning for Soft Sensor Development using Semi-Supervised
Autoencoders [0.7734726150561089]
Data-driven soft sensors are extensively used in industrial and chemical processes to predict hard-to-measure process variables.
Active learning methods can be highly beneficial as they can suggest the most informative labels to query.
In this work, we adapt some of these approaches to the stream-based scenario and show how they can be used to select the most informative data points.
arXiv Detail & Related papers (2022-12-26T09:45:41Z)
- Robust Few-shot Learning Without Using any Adversarial Samples [19.34427461937382]
A few efforts have been made to combine the few-shot problem with the robustness objective using sophisticated Meta-Learning techniques.
We propose a simple but effective alternative that does not require any adversarial samples.
Inspired by the cognitive decision-making process in humans, we enforce high-level feature matching between the base class data and their corresponding low-frequency samples.
arXiv Detail & Related papers (2022-11-03T05:58:26Z)
- Rethinking Cost-sensitive Classification in Deep Learning via
Adversarial Data Augmentation [4.479834103607382]
Cost-sensitive classification is critical in applications where misclassification errors widely vary in cost.
This paper proposes a cost-sensitive adversarial data augmentation framework to make over-parameterized models cost-sensitive.
Our method can effectively minimize the overall cost and reduce critical errors, while achieving comparable performance in terms of overall accuracy.
arXiv Detail & Related papers (2022-08-24T19:00:30Z)
- An Experimental Design Perspective on Model-Based Reinforcement Learning [73.37942845983417]
In practical applications of RL, it is expensive to observe state transitions from the environment.
We propose an acquisition function that quantifies how much information a state-action pair would provide about the optimal solution to a Markov decision process.
arXiv Detail & Related papers (2021-12-09T23:13:57Z)
- A Simple Fine-tuning Is All You Need: Towards Robust Deep Learning Via
Adversarial Fine-tuning [90.44219200633286]
We propose a simple yet very effective adversarial fine-tuning approach based on a $\textit{slow start, fast decay}$ learning rate scheduling strategy.
Experimental results show that the proposed adversarial fine-tuning approach outperforms the state-of-the-art methods on CIFAR-10, CIFAR-100 and ImageNet datasets.
arXiv Detail & Related papers (2020-12-25T20:50:15Z)
- An Online Method for A Class of Distributionally Robust Optimization
with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems.
Our studies demonstrate important applications in machine learning for improving the robustness of networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.