Deep Learning based Frameworks for Handling Imbalance in DGA, Email, and
URL Data Analysis
- URL: http://arxiv.org/abs/2004.04812v2
- Date: Sat, 17 Oct 2020 08:12:19 GMT
- Title: Deep Learning based Frameworks for Handling Imbalance in DGA, Email, and
URL Data Analysis
- Authors: Simran K, Prathiksha Balakrishna, Vinayakumar Ravi, Soman KP
- Abstract summary: In this paper, we propose cost-sensitive deep learning based frameworks and evaluate their performance.
Various experiments were performed using cost-insensitive as well as cost-sensitive methods.
In all experiments, the cost-sensitive deep learning methods performed better than the cost-insensitive approaches.
- Score: 2.2901908285413413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning is a state-of-the-art method for many applications. The
main issue is that most real-time data is highly imbalanced in nature. To avoid
bias in training, a cost-sensitive approach can be used. In this paper, we
propose cost-sensitive deep learning based frameworks and evaluate their
performance on three different Cyber Security use cases: Domain Generation
Algorithm (DGA), Electronic mail (Email), and Uniform Resource Locator (URL)
analysis. Various experiments were performed using cost-insensitive as well as
cost-sensitive methods, with the parameters of both set through hyperparameter
tuning. In all experiments, the cost-sensitive deep learning methods performed
better than the cost-insensitive approaches. This is mainly because the
cost-sensitive approach assigns greater importance to the classes with very few
samples during training, which helps the model learn all classes more
effectively.
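As an illustrative sketch (not the paper's exact implementation), the core cost-sensitive idea can be realized as a weighted cross-entropy loss, where each class receives a weight inversely proportional to its frequency in the training data, so minority classes contribute more to the loss:

```python
import numpy as np

def class_weights(labels, n_classes):
    """Inverse-frequency class weights: rare classes get larger weights."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    # Normalized so that the weights average to 1 over the training samples.
    return counts.sum() / (n_classes * np.maximum(counts, 1))

def weighted_cross_entropy(probs, labels, weights):
    """Cost-sensitive cross-entropy: each sample's loss is scaled by its class weight."""
    eps = 1e-12
    per_sample = -np.log(probs[np.arange(len(labels)), labels] + eps)
    return float(np.mean(weights[labels] * per_sample))

# Imbalanced toy labels: class 0 dominates (6 vs. 2 vs. 1 samples).
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2])
w = class_weights(labels, 3)  # minority class 2 gets the largest weight
```

In a deep learning framework, the same effect is typically obtained by passing per-class weights to the loss function (e.g. the `weight` argument of a cross-entropy loss).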
Related papers
- Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning [70.22819290458581]
Reinforcement learning with human feedback (RLHF) is a widely adopted approach in current large language model pipelines.
Our approach introduces two key innovations: (1) on-policy query to avoid OOD and imbalance issues in seed data, and (2) active learning to select the most informative data for preference queries.
arXiv Detail & Related papers (2024-07-02T10:09:19Z)
- Provably Robust Cost-Sensitive Learning via Randomized Smoothing [21.698527267902158]
We investigate whether randomized smoothing, a scalable framework for robustness certification, can be leveraged to certify and train for cost-sensitive robustness.
We first illustrate how to adapt the standard certification algorithm of randomized smoothing to produce tight robustness certificates for any binary cost matrix.
We then develop a robust training method to promote certified cost-sensitive robustness while maintaining the model's overall accuracy.
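For context, the certificate from standard (cost-insensitive) randomized smoothing can be sketched as below; the paper above adapts this style of certificate to arbitrary binary cost matrices. The radius formula is the well-known Gaussian-smoothing result, not something specific to that paper:

```python
from statistics import NormalDist

def certified_radius(sigma, p_a, p_b):
    """Certified L2 radius for a classifier smoothed with Gaussian noise of
    scale sigma, given a lower bound p_a on the top-class probability and an
    upper bound p_b on the runner-up probability:
        R = (sigma / 2) * (Phi^{-1}(p_a) - Phi^{-1}(p_b))
    """
    phi_inv = NormalDist().inv_cdf  # inverse standard normal CDF
    return 0.5 * sigma * (phi_inv(p_a) - phi_inv(p_b))
```

When the top two class probabilities are indistinguishable (p_a = p_b), the certified radius collapses to zero, as expected.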
arXiv Detail & Related papers (2023-10-12T21:39:16Z)
- Efficient Methods for Non-stationary Online Learning [67.3300478545554]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $\mathcal{O}(\log T)$ to $1$.
Our technique hinges on the reduction mechanism developed in parameter-free online learning and requires non-trivial twists on non-stationary online methods.
arXiv Detail & Related papers (2023-09-16T07:30:12Z)
- Efficient Online Reinforcement Learning with Offline Data [78.92501185886569]
We show that we can simply apply existing off-policy methods to leverage offline data when learning online.
We extensively ablate these design choices, demonstrating the key factors that most affect performance.
We see that correct application of these simple recommendations can provide a $\mathbf{2.5\times}$ improvement over existing approaches.
arXiv Detail & Related papers (2023-02-06T17:30:22Z)
- Cost-Sensitive Stacking: an Empirical Evaluation [3.867363075280544]
Cost-sensitive learning adapts classification algorithms to account for differences in misclassification costs.
There is no consensus in the literature as to what cost-sensitive stacking is.
Our experiments, conducted on twelve datasets, show that for best performance, both levels of stacking require cost-sensitive classification decisions.
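As background (a textbook result rather than this paper's contribution), a cost-sensitive classification decision in the binary case shifts the decision threshold away from 0.5 according to the misclassification costs:

```python
def cost_sensitive_threshold(cost_fp, cost_fn):
    """Bayes-optimal decision threshold under asymmetric misclassification
    costs: predict the positive class when the estimated probability
    P(y=1|x) exceeds cost_fp / (cost_fp + cost_fn)."""
    return cost_fp / (cost_fp + cost_fn)

# A false negative that is 9x as costly as a false positive lowers the
# threshold, making the classifier flag positives much more readily.
t = cost_sensitive_threshold(cost_fp=1.0, cost_fn=9.0)
```

Applying such a threshold at both levels of a stacked ensemble is one way to make the whole stack cost-sensitive, in the spirit of the evaluation above.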
arXiv Detail & Related papers (2023-01-04T18:28:07Z) - Online Active Learning for Soft Sensor Development using Semi-Supervised
Autoencoders [0.7734726150561089]
Data-driven soft sensors are extensively used in industrial and chemical processes to predict hard-to-measure process variables.
Active learning methods can be highly beneficial as they can suggest the most informative labels to query.
In this work, we adapt some of these approaches to the stream-based scenario and show how they can be used to select the most informative data points.
arXiv Detail & Related papers (2022-12-26T09:45:41Z)
- Robust Few-shot Learning Without Using any Adversarial Samples [19.34427461937382]
A few efforts have been made to combine the few-shot problem with the robustness objective using sophisticated Meta-Learning techniques.
We propose a simple but effective alternative that does not require any adversarial samples.
Inspired by the cognitive decision-making process in humans, we enforce high-level feature matching between the base class data and their corresponding low-frequency samples.
arXiv Detail & Related papers (2022-11-03T05:58:26Z)
- Rethinking Cost-sensitive Classification in Deep Learning via
Adversarial Data Augmentation [4.479834103607382]
Cost-sensitive classification is critical in applications where misclassification errors widely vary in cost.
This paper proposes a cost-sensitive adversarial data augmentation framework to make over-parameterized models cost-sensitive.
Our method can effectively minimize the overall cost and reduce critical errors, while achieving comparable performance in terms of overall accuracy.
arXiv Detail & Related papers (2022-08-24T19:00:30Z)
- An Experimental Design Perspective on Model-Based Reinforcement Learning [73.37942845983417]
In practical applications of RL, it is expensive to observe state transitions from the environment.
We propose an acquisition function that quantifies how much information a state-action pair would provide about the optimal solution to a Markov decision process.
arXiv Detail & Related papers (2021-12-09T23:13:57Z)
- A Simple Fine-tuning Is All You Need: Towards Robust Deep Learning Via
Adversarial Fine-tuning [90.44219200633286]
We propose a simple yet very effective adversarial fine-tuning approach based on a $\textit{slow start, fast decay}$ learning rate scheduling strategy.
Experimental results show that the proposed adversarial fine-tuning approach outperforms the state-of-the-art methods on CIFAR-10, CIFAR-100 and ImageNet datasets.
arXiv Detail & Related papers (2020-12-25T20:50:15Z)
- An Online Method for A Class of Distributionally Robust Optimization
with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems.
Our studies demonstrate important applications in machine learning for improving the robustness of networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.