Efficient Hate Speech Detection: A Three-Layer LoRA-Tuned BERTweet Framework
- URL: http://arxiv.org/abs/2511.06051v1
- Date: Sat, 08 Nov 2025 15:47:18 GMT
- Title: Efficient Hate Speech Detection: A Three-Layer LoRA-Tuned BERTweet Framework
- Authors: Mahmoud El-Bahnasawi
- Abstract summary: This paper addresses the challenge of developing computationally efficient hate speech detection systems. We propose a novel three-layer framework that combines rule-based pre-filtering with a parameter-efficient LoRA-tuned BERTweet model. Our approach achieves 94% of the performance of state-of-the-art large language models like SafePhi.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the critical challenge of developing computationally efficient hate speech detection systems that maintain competitive performance while being practical for real-time deployment. We propose a novel three-layer framework that combines rule-based pre-filtering with a parameter-efficient LoRA-tuned BERTweet model and continuous learning capabilities. Our approach achieves 0.85 macro F1 score - representing 94% of the performance of state-of-the-art large language models like SafePhi (Phi-4 based) while using a base model that is 100x smaller (134M vs 14B parameters). Compared to traditional BERT-based approaches with similar computational requirements, our method demonstrates superior performance through strategic dataset unification and optimized fine-tuning. The system requires only 1.87M trainable parameters (1.37% of full fine-tuning) and trains in approximately 2 hours on a single T4 GPU, making robust hate speech detection accessible in resource-constrained environments while maintaining competitive accuracy for real-world deployment.
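As a rough illustration of how the first two layers could be wired together, the sketch below pairs a keyword pre-filter with a LoRA-adapted BERTweet classifier via Hugging Face PEFT. The LoRA rank, target modules, and keyword list are illustrative assumptions; the paper reports 1.87M trainable parameters (1.37% of full fine-tuning), but this snippet does not reproduce the authors' exact configuration.

```python
# Minimal sketch of layers 1-2: a rule-based pre-filter in front of a
# LoRA-adapted BERTweet classifier (Hugging Face PEFT). The rank, target
# modules, and keyword list are assumptions, not the paper's exact config.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "vinai/bertweet-base", num_labels=2  # hate / non-hate
)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=16,                               # assumed LoRA rank
    lora_alpha=32,                      # assumed scaling factor
    lora_dropout=0.1,
    target_modules=["query", "value"],  # assumed attention projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # a small fraction of the 134M base
model.eval()

# Hypothetical stand-in for the paper's rule-based keyword layer.
OBVIOUS_TERMS = {"placeholder_slur_1", "placeholder_slur_2"}

def classify(text: str) -> int:
    """Return 1 for hate speech, 0 otherwise."""
    if any(term in text.lower() for term in OBVIOUS_TERMS):
        return 1  # layer 1: cheap rule hit, skip the model entirely
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    return model(**inputs).logits.argmax(dim=-1).item()  # layer 2
```

The paper's third layer, continuous learning, would periodically fine-tune the adapters on newly labeled examples; that loop is omitted here for brevity.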
Related papers
- Chain of Simulation: A Dual-Mode Reasoning Framework for Large Language Models with Dynamic Problem Routing [0.0]
Chain of Simulation (CoS) is a novel dual-mode reasoning framework that dynamically routes problems to specialized reasoning strategies. CoS employs three distinct reasoning modes: computational flow with self-consistency for mathematical problems, symbolic state tracking with representations for spatial reasoning, and hybrid fact-extraction for multi-hop inference.
arXiv Detail & Related papers (2026-02-02T21:44:01Z)
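For intuition, the routing step CoS describes can be pictured as a small dispatcher. The keyword heuristics and mode names below are hypothetical stand-ins, not the paper's classifier:

```python
# Hypothetical sketch of dynamic problem routing in the spirit of CoS;
# the keyword heuristics and handlers are illustrative stand-ins only.
import re

def route(problem: str) -> str:
    text = problem.lower()
    if re.search(r"\d", text) and any(
        w in text for w in ("sum", "solve", "compute", "how many")
    ):
        return "computational_flow"   # math: computation with self-consistency
    if any(w in text for w in ("left of", "above", "rotate", "north of")):
        return "symbolic_state"       # spatial: explicit state tracking
    return "hybrid_fact_extraction"   # multi-hop: fact extraction

print(route("Compute the sum of the first 100 integers"))  # computational_flow
```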
- Mecellem Models: Turkish Models Trained from Scratch and Continually Pre-trained for the Legal Domain [0.0]
This paper presents Mecellem models, a framework for developing specialized language models for the Turkish legal domain. We make two contributions: (1) Encoder Model Pre-trained from Scratch: ModernBERT-based bidirectional encoders pre-trained on a Turkish-dominant corpus of 112.7 billion tokens; and (2) Decoder Model with Continual Pre-training (CPT): Qwen3-1.7B and Qwen3-4B models adapted to the Turkish legal domain through controlled curriculum learning.
arXiv Detail & Related papers (2026-01-22T14:41:32Z)
- Teaching Language Models to Reason with Tools [73.21700643314917]
We present Hint-Engineering, a new data synthesis strategy that strategically injects diverse hints at optimal points within reasoning paths. CoRT significantly enhances efficiency, reducing token usage by approximately 30% for the 32B model and 50% for the 1.5B model.
arXiv Detail & Related papers (2025-10-23T08:41:44Z)
- Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model [100.86587937568832]
Ring-1T is the first open-source, state-of-the-art thinking model at the trillion-parameter scale. It features 1 trillion total parameters and activates approximately 50 billion per token.
arXiv Detail & Related papers (2025-10-21T17:46:14Z)
- Enhancing Speech Emotion Recognition via Fine-Tuning Pre-Trained Models and Hyper-Parameter Optimisation [3.313347968067735]
We propose a workflow for speech emotion recognition using pre-trained representations and HPO strategies. Experiments run on 8 CPU cores with 32 GB RAM. For cross-lingual generalisation, an EmoDB-trained HPO-tuned model improves zero-shot accuracy by 0.25 on CREMA-D and 0.26 on RAVDESS.
arXiv Detail & Related papers (2025-10-08T14:20:43Z)
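The HPO component of such a workflow might be organized as an Optuna-style search; the search space and the train_and_eval objective below are illustrative assumptions, not the authors' setup:

```python
# Generic hyper-parameter optimisation loop in the spirit of the paper's
# workflow; search space and objective are assumptions, not their config.
import optuna

def train_and_eval(lr: float, dropout: float, batch_size: int) -> float:
    """Hypothetical stand-in: fine-tune the SER model, return val accuracy."""
    return 1.0 / (1.0 + abs(lr - 3e-4)) - dropout * 0.1  # dummy surface

def objective(trial: optuna.Trial) -> float:
    lr = trial.suggest_float("lr", 1e-5, 1e-3, log=True)
    dropout = trial.suggest_float("dropout", 0.0, 0.5)
    batch_size = trial.suggest_categorical("batch_size", [8, 16, 32])
    return train_and_eval(lr=lr, dropout=dropout, batch_size=batch_size)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```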
- Systematic Optimization of Open Source Large Language Models for Mathematical Reasoning [1.8254074486719114]
This paper presents a practical investigation into fine-tuning model parameters for mathematical reasoning tasks. A holistically optimized framework is introduced for five state-of-the-art models.
arXiv Detail & Related papers (2025-09-08T21:31:43Z)
- Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs [51.21041884010009]
Ring-lite is a Mixture-of-Experts (MoE)-based large language model optimized via reinforcement learning (RL). Our approach matches the performance of state-of-the-art (SOTA) small-scale reasoning models on challenging benchmarks.
arXiv Detail & Related papers (2025-06-17T17:12:34Z)
- EfficientLLM: Efficiency in Large Language Models [64.3537131208038]
Large Language Models (LLMs) have driven significant progress, yet their growing parameter counts and context windows incur prohibitive compute, energy, and monetary costs. We introduce EfficientLLM, a novel benchmark and the first comprehensive empirical study evaluating efficiency techniques for LLMs at scale.
arXiv Detail & Related papers (2025-05-20T02:27:08Z)
- Learning Adaptive Parallel Reasoning with Language Models [70.1745752819628]
We propose Adaptive Parallel Reasoning (APR), a novel reasoning framework that enables language models to orchestrate both serialized and parallel computations end-to-end. APR generalizes existing reasoning methods by enabling adaptive multi-threaded inference using spawn() and join() operations. A key innovation is our end-to-end reinforcement learning strategy, optimizing both parent and child inference threads to enhance task success rate without requiring predefined reasoning structures.
arXiv Detail & Related papers (2025-04-21T22:29:02Z)
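The spawn()/join() pattern APR names maps naturally onto thread-pool primitives. The sketch below uses a hypothetical lm_generate call to show the shape of the orchestration, not the paper's RL-trained implementation:

```python
# Sketch of the spawn()/join() orchestration that APR describes; the
# lm_generate call and query decomposition are hypothetical placeholders.
from concurrent.futures import ThreadPoolExecutor

def lm_generate(prompt: str) -> str:
    """Hypothetical stand-in for a language-model call."""
    return f"<answer to: {prompt[:40]}>"

def solve(query: str, branches: list[str]) -> str:
    with ThreadPoolExecutor() as executor:
        # spawn(): each child thread explores one reasoning branch.
        futures = [executor.submit(lm_generate, b) for b in branches]
        # join(): the parent waits for all children before aggregating.
        results = [f.result() for f in futures]
    return lm_generate(query + "\n" + "\n".join(results))
```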
- The Surprising Effectiveness of Test-Time Training for Few-Shot Learning [59.309477460893916]
Language models (LMs) have shown impressive performance on tasks within their training distribution, but often struggle with structurally novel tasks. We investigate the effectiveness of test-time training (TTT) as a mechanism for improving LMs' reasoning and few-shot learning capabilities. Our findings highlight the limitations of in-context learning for novel tasks and demonstrate the potential of test-time training to enhance language model adaptability.
arXiv Detail & Related papers (2024-11-11T18:59:45Z)
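In outline, test-time training fine-tunes a copy of the model on the few-shot examples before predicting. A minimal sketch, with an assumed optimizer, learning rate, and step count:

```python
# Minimal test-time training sketch: adapt a copy of the model on the
# few-shot examples at inference time. Hyperparameters are assumptions.
import copy
import torch

def test_time_train(model, loss_fn, few_shot_batch, steps=10, lr=1e-4):
    tuned = copy.deepcopy(model)  # leave the base model untouched
    opt = torch.optim.SGD(tuned.parameters(), lr=lr)
    tuned.train()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(tuned, few_shot_batch)  # loss_fn: caller-supplied
        loss.backward()
        opt.step()
    tuned.eval()
    return tuned  # predict with the adapted copy, then discard it
```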
- End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames [55.72994484532856]
Temporal action detection (TAD) has seen significant performance improvement with end-to-end training.
Due to the memory bottleneck, only models with limited scales and limited data volumes can afford end-to-end training.
We reduce the memory consumption for end-to-end training, and manage to scale up the TAD backbone to 1 billion parameters and the input video to 1,536 frames.
arXiv Detail & Related papers (2023-11-28T21:31:04Z)