MUTEX: Leveraging Multilingual Transformers and Conditional Random Fields for Enhanced Urdu Toxic Span Detection
- URL: http://arxiv.org/abs/2603.05057v1
- Date: Thu, 05 Mar 2026 11:11:50 GMT
- Title: MUTEX: Leveraging Multilingual Transformers and Conditional Random Fields for Enhanced Urdu Toxic Span Detection
- Authors: Inayat Arshad, Fajar Saleem, Ijaz Hussain
- Abstract summary: MUTEX is a framework for Urdu toxic span detection that combines a multilingual transformer with conditional random fields (CRF). MUTEX achieves a 60% token-level F1 score, establishing the first supervised baseline for Urdu toxic span detection.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Urdu toxic span detection remains limited because most existing systems rely on sentence-level classification and fail to identify the specific toxic spans within the text. The problem is further exacerbated by multiple factors: the lack of token-level annotated resources, the linguistic complexity of Urdu, frequent code-switching, informal expressions, and rich morphological variation. In this research, we propose MUTEX, a framework for Urdu toxic span detection that combines a multilingual transformer with conditional random fields (CRF) and uses a manually annotated token-level toxic span dataset to improve performance and interpretability. MUTEX uses XLM-RoBERTa with a CRF layer to perform sequence labeling and is tested on multi-domain data extracted from social media, online news, and YouTube reviews, with token-level F1 used to evaluate fine-grained span detection. The results indicate that MUTEX achieves a 60% token-level F1 score, establishing the first supervised baseline for Urdu toxic span detection. Further examination reveals that transformer-based models are more effective than other models at implicitly capturing contextual toxicity and are better able to handle code-switching and morphological variation.
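The abstract reports results in token-level F1, i.e., precision and recall computed over the tokens labeled toxic rather than over whole sentences. A minimal sketch of this metric (the paper's exact matching convention is not specified here; this assumes a simple set overlap between predicted and gold toxic token positions):

```python
def token_level_f1(gold_toxic, pred_toxic):
    """Token-level F1 over sets of toxic token positions.

    gold_toxic / pred_toxic: iterables of token indices labeled toxic
    in the gold annotation and the model prediction, respectively.
    """
    gold, pred = set(gold_toxic), set(pred_toxic)
    if not gold and not pred:
        return 1.0  # neither side marks any token: perfect agreement
    tp = len(gold & pred)                       # correctly flagged tokens
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: gold toxic span covers tokens {2, 3, 4}; model predicts {3, 4, 5}.
# tp = 2, precision = recall = 2/3, so F1 = 2/3.
print(token_level_f1({2, 3, 4}, {3, 4, 5}))
```

In practice such a score is averaged over all evaluation sentences; the 60% figure above is the corpus-level result of this kind of token-overlap evaluation.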
Related papers
- Unveiling Covert Toxicity in Multimodal Data via Toxicity Association Graphs: A Graph-Based Metric and Interpretable Detection Framework [58.01529356381494]
We propose a novel detection framework based on Toxicity Association Graphs (TAGs). We introduce the first quantifiable metric for hidden toxicity, the Multimodal Toxicity Covertness (MTC). Our approach enables precise identification of covert toxicity while preserving full interpretability of the decision-making process.
arXiv Detail & Related papers (2026-02-03T08:54:25Z) - Beyond Literal Mapping: Benchmarking and Improving Non-Literal Translation Evaluation [57.11989521509119]
We propose a novel agentic translation evaluation framework, centered on a reflective Core Agent that invokes specialized sub-agents. Experimental results indicate the efficacy of RATE, achieving an improvement of at least 3.2 meta score compared with current metrics.
arXiv Detail & Related papers (2026-01-12T09:03:42Z) - Text Detoxification in isiXhosa and Yorùbá: A Cross-Lingual Machine Learning Approach for Low-Resource African Languages [0.0]
Toxic language is one of the major barriers to safe online participation, yet robust mitigation tools are scarce for African languages. This study investigates automatic text detoxification (toxic-to-neutral rewriting) for two low-resource African languages, isiXhosa and Yorùbá.
arXiv Detail & Related papers (2026-01-09T08:28:58Z) - Rethinking Toxicity Evaluation in Large Language Models: A Multi-Label Perspective [104.09817371557476]
Large language models (LLMs) have achieved impressive results across a range of natural language processing tasks. Their potential to generate harmful content has raised serious safety concerns. We introduce three novel multi-label benchmarks for toxicity detection.
arXiv Detail & Related papers (2025-10-16T06:50:33Z) - Enhancing Robustness of Autoregressive Language Models against Orthographic Attacks via Pixel-based Approach [51.95266411355865]
Autoregressive language models are vulnerable to orthographic attacks. This vulnerability stems from the out-of-vocabulary issue inherent in subword tokenizers and their embeddings. We propose a pixel-based generative language model that replaces the text-based embeddings with pixel-based representations by rendering words as individual images.
arXiv Detail & Related papers (2025-08-28T20:48:38Z) - Anomaly Detection in Human Language via Meta-Learning: A Few-Shot Approach [0.0]
We propose a framework for detecting anomalies in human language across diverse domains with limited labeled data. We treat anomaly detection as a few-shot binary classification problem and leverage meta-learning to train models that generalize across tasks. Our method combines episodic training with prototypical networks and domain resampling to adapt quickly to new anomaly detection tasks.
arXiv Detail & Related papers (2025-07-26T17:23:03Z) - Exploring Gradient-Guided Masked Language Model to Detect Textual Adversarial Attacks [50.53590930588431]
Adversarial examples pose serious threats to natural language processing systems. Recent studies suggest that adversarial texts deviate from the underlying manifold of normal texts, whereas masked language models can approximate the manifold of normal data. We first introduce Masked Language Model-based Detection (MLMD), leveraging the mask-unmask operations of the masked language modeling (MLM) objective.
arXiv Detail & Related papers (2025-04-08T14:10:57Z) - MuTox: Universal MUltilingual Audio-based TOXicity Dataset and Zero-shot Detector [10.37639482435147]
We introduce MuTox, the first highly multilingual audio-based dataset with toxicity labels.
The dataset comprises 20,000 audio utterances for English and Spanish, and 4,000 for the other 19 languages.
arXiv Detail & Related papers (2024-01-10T10:37:45Z) - Experiments with adversarial attacks on text genres [0.0]
Neural models based on pre-trained transformers, such as BERT or XLM-RoBERTa, demonstrate SOTA results in many NLP tasks.
We show that embedding-based algorithms, which can replace some of the "most significant" words with similar words, are able to influence model predictions in a significant proportion of cases.
arXiv Detail & Related papers (2021-07-05T19:37:59Z) - WLV-RIT at SemEval-2021 Task 5: A Neural Transformer Framework for Detecting Toxic Spans [2.4737119633827174]
In recent years, the widespread use of social media has led to an increase in the generation of toxic and offensive content on online platforms.
Social media platforms have worked on developing automatic detection methods and employing human moderators to cope with this deluge of offensive content.
arXiv Detail & Related papers (2021-04-09T22:52:26Z) - FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We further propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language.
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.