Exploring the Potential of Feature Density in Estimating Machine
Learning Classifier Performance with Application to Cyberbullying Detection
- URL: http://arxiv.org/abs/2206.01949v1
- Date: Sat, 4 Jun 2022 09:11:13 GMT
- Title: Exploring the Potential of Feature Density in Estimating Machine
Learning Classifier Performance with Application to Cyberbullying Detection
- Authors: Juuso Eronen, Michal Ptaszynski, Fumito Masui, Gniewosz Leliwa and
Michal Wroczynski
- Abstract summary: We analyze the potential of Feature Density (FD) as a way to comparatively estimate machine learning (ML) classifier performance prior to training.
Our approach is to optimize the resource-intensive training of ML models for Natural Language Processing to reduce the number of required experiments.
- Score: 2.4674086273775035
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this research, we analyze the potential of Feature Density (FD) as a way
to comparatively estimate machine learning (ML) classifier performance prior to
training. The goal of the study is to aid in solving the problem of
resource-intensive training of ML models which is becoming a serious issue due
to continuously increasing dataset sizes and the ever rising popularity of Deep
Neural Networks (DNN). The issue of constantly increasing demands for more
powerful computational resources is also affecting the environment, as training
large-scale ML models is causing alarmingly growing amounts of CO2 emissions.
Our approach is to optimize the resource-intensive training of ML models for
Natural Language Processing to reduce the number of required experiment
iterations. We expand on previous attempts at improving classifier training
efficiency with FD while also providing insight into the effectiveness of
various linguistically-backed feature preprocessing methods for dialog
classification, specifically cyberbullying detection.
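To make the core idea concrete, a minimal sketch of a Feature Density computation follows. This is an illustrative assumption, not the paper's exact formula: FD is taken here as the ratio of unique features to total feature occurrences in a corpus, using naive whitespace tokenization in place of the linguistically-backed preprocessing the paper studies.

```python
from collections import Counter

def feature_density(documents):
    """Illustrative Feature Density: unique features / total feature occurrences.

    NOTE: hypothetical formulation for demonstration; the paper's exact FD
    definition and its linguistically-backed feature extraction may differ.
    """
    counts = Counter()
    for doc in documents:
        # Naive whitespace tokenization as a stand-in for real preprocessing.
        counts.update(doc.lower().split())
    total = sum(counts.values())
    return len(counts) / total if total else 0.0

corpus = [
    "you are so annoying",
    "you are great",
    "stop being so annoying",
]
print(round(feature_density(corpus), 3))  # → 0.636 (7 unique tokens / 11 total)
```

Under this reading, a lower FD suggests a more repetitive, easier-to-model dataset, which is the kind of pre-training signal the paper proposes to exploit when deciding how many training experiments are worthwhile.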
Related papers
- Computational Bottlenecks of Training Small-scale Large Language Models [19.663560481459164]
Small-scale Large Language Models (SLMs) are gaining attention due to cost and efficiency demands from consumers.
In this study, we explore the computational bottlenecks of training SLMs.
We assess these factors on popular cloud services using metrics such as loss per dollar and tokens per second.
arXiv Detail & Related papers (2024-10-25T10:30:21Z) - Extending Network Intrusion Detection with Enhanced Particle Swarm Optimization Techniques [0.0]
The present research investigates how to improve Network Intrusion Detection Systems (NIDS) by combining Machine Learning (ML) and Deep Learning (DL) techniques.
The study uses the CSE-CIC-IDS 2018 and LITNET-2020 datasets to compare ML methods (Decision Trees, Random Forest, XGBoost) and DL models (CNNs, RNNs, DNNs) against key performance metrics.
The Decision Tree model performed better across all measures after being fine-tuned with Enhanced Particle Swarm Optimization (EPSO), demonstrating the model's ability to detect network breaches effectively.
arXiv Detail & Related papers (2024-08-14T17:11:36Z) - Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We first investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
Second, we examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z) - A Cohesive Distillation Architecture for Neural Language Models [0.0]
A recent trend in Natural Language Processing is the exponential growth in Language Model (LM) size.
This study investigates methods for Knowledge Distillation (KD) to provide efficient alternatives to large-scale models.
arXiv Detail & Related papers (2023-01-12T08:01:53Z) - Initial Study into Application of Feature Density and
Linguistically-backed Embedding to Improve Machine Learning-based
Cyberbullying Detection [54.83707803301847]
The research was conducted on a Formspring dataset provided in a Kaggle competition on automatic cyberbullying detection.
The study confirmed the effectiveness of Neural Networks in cyberbullying detection and the correlation between classifier performance and Feature Density.
arXiv Detail & Related papers (2022-06-04T03:17:15Z) - Improving Classifier Training Efficiency for Automatic Cyberbullying
Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Transfer Learning without Knowing: Reprogramming Black-box Machine
Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model.
Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses.
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.