ylmmcl at Multilingual Text Detoxification 2025: Lexicon-Guided Detoxification and Classifier-Gated Rewriting
- URL: http://arxiv.org/abs/2507.18769v1
- Date: Thu, 24 Jul 2025 19:38:15 GMT
- Title: ylmmcl at Multilingual Text Detoxification 2025: Lexicon-Guided Detoxification and Classifier-Gated Rewriting
- Authors: Nicole Lai-Lopez, Lusha Wang, Su Yuan, Liza Zhang
- Abstract summary: In this work, we introduce our solution for the Multilingual Text Detoxification Task in the PAN-2025 competition for the ylmmcl team. Our approach departs from prior unsupervised or monolingual pipelines by leveraging explicit toxic word annotation via the multilingual_toxic_lexicon. Our model achieves the highest STA (0.922) among our previous attempts, and an average official J score of 0.612 for toxic inputs in both the development and test sets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we introduce our solution for the Multilingual Text Detoxification Task in the PAN-2025 competition for the ylmmcl team: a robust multilingual text detoxification pipeline that integrates lexicon-guided tagging, a fine-tuned sequence-to-sequence model (s-nlp/mt0-xl-detox-orpo) and an iterative classifier-based gatekeeping mechanism. Our approach departs from prior unsupervised or monolingual pipelines by leveraging explicit toxic word annotation via the multilingual_toxic_lexicon to guide detoxification with greater precision and cross-lingual generalization. Our final model achieves the highest STA (0.922) among our previous attempts, and an average official J score of 0.612 for toxic inputs in both the development and test sets. It also achieved xCOMET scores of 0.793 (dev) and 0.787 (test). It outperforms baseline and backtranslation methods across multiple languages and shows strong generalization in high-resource settings (English, Russian, French). Despite some trade-offs in SIM, the model demonstrates consistent improvements in detoxification strength. In the competition, our team achieved ninth place with a score of 0.612.
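The abstract describes a three-stage loop: lexicon-guided tagging, seq2seq rewriting, and a classifier gate that decides whether another rewriting pass is needed. Below is a minimal Python sketch of how such a pipeline could be wired together with Hugging Face transformers. Only the s-nlp/mt0-xl-detox-orpo rewriter is taken from the abstract; the [TOXIC]…[/TOXIC] markers, the choice of textdetox/xlmr-large-toxicity-classifier as the gate, and the three-round stopping rule are illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch of a lexicon-guided, classifier-gated detox loop.
# Only the rewriter checkpoint comes from the abstract; the tagging scheme,
# gate classifier, and loop settings are assumptions for demonstration.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

REWRITER = "s-nlp/mt0-xl-detox-orpo"               # named in the abstract
GATE = "textdetox/xlmr-large-toxicity-classifier"  # assumed toxicity gate

tok = AutoTokenizer.from_pretrained(REWRITER)
rewriter = AutoModelForSeq2SeqLM.from_pretrained(REWRITER)
gate = pipeline("text-classification", model=GATE)

def tag_toxic(text: str, lexicon: set) -> str:
    """Wrap lexicon hits in markers so the rewriter knows which spans to neutralize."""
    return " ".join(
        f"[TOXIC]{w}[/TOXIC]" if w.lower().strip(".,!?") in lexicon else w
        for w in text.split()
    )

def detoxify(text: str, lexicon: set, max_rounds: int = 3) -> str:
    """Rewrite until the gate classifier no longer flags the text (or rounds run out)."""
    candidate = text
    for _ in range(max_rounds):
        inputs = tok(tag_toxic(candidate, lexicon), return_tensors="pt", truncation=True)
        out = rewriter.generate(**inputs, max_new_tokens=128, num_beams=4)
        candidate = tok.decode(out[0], skip_special_tokens=True)
        # Label names depend on the gate model; "toxic"/"neutral" is assumed here.
        if gate(candidate)[0]["label"].lower() != "toxic":
            break
    return candidate

# Example usage with a toy lexicon (the paper uses the multilingual_toxic_lexicon).
print(detoxify("this is a stupid idea", {"stupid"}))
```

The gate is what makes the rewriting iterative: a candidate that the classifier still judges toxic is tagged again and sent back through the rewriter, up to a fixed round budget.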
Related papers
- Breaking mBad! Supervised Fine-tuning for Cross-Lingual Detoxification [31.7516400680833]
"Cross-lingual Detoxification" is a paradigm that mitigates toxicity in large language models.<n>We analyze toxicity reduction in cross-distribution settings and investigate how mitigation impacts model performance on non-toxic tasks.
arXiv Detail & Related papers (2025-05-22T14:30:14Z) - Multilingual and Explainable Text Detoxification with Parallel Corpora [58.83211571400692]
We extend the parallel text detoxification corpus to new languages.
We conduct a first-of-its-kind automated, explainable analysis of the descriptive features of both toxic and non-toxic sentences.
We then experiment with a novel text detoxification method inspired by the Chain-of-Thoughts reasoning approach.
arXiv Detail & Related papers (2024-12-16T12:08:59Z) - SmurfCat at PAN 2024 TextDetox: Alignment of Multilingual Transformers for Text Detoxification [41.94295877935867]
This paper presents the SmurfCat team's solution for the Multilingual Text Detoxification task in the PAN-2024 competition.
Using data augmentation through machine translation and a special filtering procedure, we collected an additional multilingual parallel dataset for text detoxification.
We fine-tuned several multilingual sequence-to-sequence models, such as mT0 and Aya, on the text detoxification task (a generic fine-tuning sketch follows this entry).
arXiv Detail & Related papers (2024-07-07T17:19:34Z) - PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models [27.996123856250065]
- PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models [27.996123856250065]
Existing toxicity benchmarks are overwhelmingly focused on English.
We introduce PolygloToxicityPrompts (PTP), the first large-scale multilingual toxicity evaluation benchmark of 425K naturally occurring prompts spanning 17 languages.
arXiv Detail & Related papers (2024-05-15T14:22:33Z) - MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages [71.50809576484288]
Text detoxification is the task of paraphrasing a text from a toxic surface form, e.g. one featuring rude words, into a neutral register.
Recent approaches to parallel text detoxification corpus collection -- ParaDetox and APPADIA -- were explored only in a monolingual setup.
In this work, we extend the ParaDetox pipeline to multiple languages, presenting MultiParaDetox to automate parallel detoxification corpus collection for potentially any language.
arXiv Detail & Related papers (2024-04-02T15:32:32Z) - Exploring Methods for Cross-lingual Text Style Transfer: The Case of
Text Detoxification [77.45995868988301]
Text detoxification is the task of transferring the style of text from toxic to neutral.
We present a large-scale study of strategies for cross-lingual text detoxification.
arXiv Detail & Related papers (2023-11-23T11:40:28Z) - Exploring Cross-lingual Textual Style Transfer with Large Multilingual
Language Models [78.12943085697283]
Detoxification is the task of generating text in a polite style while preserving the meaning and fluency of the original toxic text.
This work investigates multilingual and cross-lingual detoxification and the behavior of large multilingual models in this setting.
arXiv Detail & Related papers (2022-06-05T20:02:30Z) - Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of
Code-Mixed Clinical Texts [56.72488923420374]
Pre-trained language models (LMs) have shown great potential for cross-lingual transfer in low-resource settings.
We show the few-shot cross-lingual transfer property of LMs for named entity recognition (NER) and apply it to a low-resource, real-world challenge: de-identification of code-mixed (Spanish-Catalan) clinical notes in the stroke domain.
arXiv Detail & Related papers (2022-04-10T21:46:52Z) - AI-UPV at IberLEF-2021 DETOXIS task: Toxicity Detection in
Immigration-Related Web News Comments Using Transformers and Statistical
Models [0.0]
We implement an accurate model to detect xenophobia in comments about web news articles.
We obtained 3rd place in the Task 1 official ranking with an F1-score of 0.5996, and 6th place in the Task 2 official ranking with a CEM of 0.7142.
Our results suggest that (i) BERT models obtain better results than statistical models for toxicity detection in text comments, and (ii) monolingual BERT models outperform multilingual BERT models when detecting toxicity in comments written in their pre-training language.
arXiv Detail & Related papers (2021-11-08T14:24:21Z) - AmericasNLI: Evaluating Zero-shot Natural Language Understanding of
Pretrained Multilingual Models in Truly Low-resource Languages [75.08199398141744]
We present AmericasNLI, an extension of XNLI (Conneau et al.) to 10 indigenous languages of the Americas.
We conduct experiments with XLM-R, testing multiple zero-shot and translation-based approaches.
We find that XLM-R's zero-shot performance is poor for all 10 languages, with an average performance of 38.62%.
arXiv Detail & Related papers (2021-04-18T05:32:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.