LuxVeri at GenAI Detection Task 1: Inverse Perplexity Weighted Ensemble for Robust Detection of AI-Generated Text across English and Multilingual Contexts
- URL: http://arxiv.org/abs/2501.11914v1
- Date: Tue, 21 Jan 2025 06:32:32 GMT
- Title: LuxVeri at GenAI Detection Task 1: Inverse Perplexity Weighted Ensemble for Robust Detection of AI-Generated Text across English and Multilingual Contexts
- Authors: Md Kamrujjaman Mobin, Md Saiful Islam
- Abstract summary: This paper presents a system developed for Task 1 of the COLING 2025 Workshop on Detecting AI-Generated Content.
Our approach utilizes an ensemble of models, with weights assigned according to each model's inverse perplexity, to enhance classification accuracy.
Our results demonstrate the effectiveness of inverse perplexity weighting in improving the robustness of machine-generated text detection across both monolingual and multilingual settings.
- Score: 0.8495482945981923
- Abstract: This paper presents a system developed for Task 1 of the COLING 2025 Workshop on Detecting AI-Generated Content, focusing on the binary classification of machine-generated versus human-written text. Our approach utilizes an ensemble of models, with weights assigned according to each model's inverse perplexity, to enhance classification accuracy. For the English text detection task, we combined RoBERTa-base, RoBERTa-base with the OpenAI detector, and BERT-base-cased, achieving a Macro F1-score of 0.7458, which ranked us 12th out of 35 teams. We ensembled RemBERT, XLM-RoBERTa-base, and BERT-base-multilingual-cased for the multilingual text detection task, employing the same inverse perplexity weighting technique. This resulted in a Macro F1-score of 0.7513, positioning us 4th out of 25 teams. Our results demonstrate the effectiveness of inverse perplexity weighting in improving the robustness of machine-generated text detection across both monolingual and multilingual settings, highlighting the potential of ensemble methods for this challenging task.
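The paper itself does not include code; the following is a minimal sketch of how inverse-perplexity weighting can combine per-model predictions. It assumes each model's perplexity has already been measured on held-out text, and the model names and numbers in the usage example are purely illustrative.

```python
import numpy as np

def inverse_perplexity_weights(perplexities):
    """Turn per-model perplexities into normalized ensemble weights.

    Lower perplexity -> higher weight: w_i = (1/ppl_i) / sum_j (1/ppl_j).
    """
    inv = 1.0 / np.asarray(perplexities, dtype=float)
    return inv / inv.sum()

def ensemble_predict(model_probs, perplexities, threshold=0.5):
    """Combine per-model P(machine-generated) scores with inverse-perplexity weights.

    model_probs:  array of shape (n_models, n_samples), each row holding one
                  model's probability that a text is machine-generated.
    perplexities: one perplexity value per model (e.g. from held-out text).
    Returns binary labels: 1 = machine-generated, 0 = human-written.
    """
    weights = inverse_perplexity_weights(perplexities)           # (n_models,)
    combined = np.average(model_probs, axis=0, weights=weights)  # (n_samples,)
    return (combined >= threshold).astype(int)

# Illustrative usage with made-up numbers (three models, four texts):
probs = np.array([
    [0.91, 0.12, 0.65, 0.40],   # e.g. RoBERTa-base
    [0.85, 0.20, 0.70, 0.55],   # e.g. RoBERTa-base OpenAI detector
    [0.60, 0.30, 0.50, 0.45],   # e.g. BERT-base-cased
])
ppls = [12.4, 15.8, 20.1]       # hypothetical held-out perplexities
print(ensemble_predict(probs, ppls))
```

The weighting simply favors models that fit the evaluation text better (lower perplexity), which is the high-level idea the abstract describes.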
Related papers
- AI-generated Text Detection with a GLTR-based Approach [0.5524804393257919]
The Giant Language Model Test Room (GLTR) is a visual tool, built on GPT-2, designed to help detect machine-generated text.
One limitation of GLTR is that the results it returns can sometimes be ambiguous and lead to confusion.
This study aims to explore various ways to improve GLTR's effectiveness for detecting AI-generated texts within the context of the IberLef-AuTexTification 2023 shared task.
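As background for the entry above: GLTR's core signal is the rank of each observed token under a language model such as GPT-2. A minimal sketch of that rank-binning idea, using the Hugging Face transformers library, is shown below; the bin boundaries mirror the original tool's top-10/100/1000 coloring, but this is an illustration rather than the study's exact pipeline.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def gltr_rank_histogram(text, bins=(10, 100, 1000)):
    """Count how many tokens fall in GPT-2's top-10 / top-100 / top-1000 / rest.

    Machine-generated text tends to concentrate in the top bins.
    """
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits                 # (1, seq_len, vocab_size)
    counts = [0] * (len(bins) + 1)
    # Rank of each actual next token under the model's predictive distribution.
    for pos in range(ids.size(1) - 1):
        next_id = ids[0, pos + 1]
        rank = (logits[0, pos].argsort(descending=True) == next_id).nonzero().item()
        for i, b in enumerate(bins):
            if rank < b:
                counts[i] += 1
                break
        else:
            counts[-1] += 1
    return counts  # [top-10, top-100, top-1000, rest]

print(gltr_rank_histogram("The quick brown fox jumps over the lazy dog."))
```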
arXiv Detail & Related papers (2025-02-17T17:32:55Z)
- GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human [71.42669028683741]
We present a shared task on binary machine-generated text detection conducted as part of the GenAI workshop at COLING 2025.
The task consists of two subtasks: Monolingual (English) and Multilingual.
We provide a comprehensive overview of the data, a summary of the results, detailed descriptions of the participating systems, and an in-depth analysis of submissions.
arXiv Detail & Related papers (2025-01-19T11:11:55Z)
- HYBRINFOX at CheckThat! 2024 -- Task 2: Enriching BERT Models with the Expert System VAGO for Subjectivity Detection [0.8083061106940517]
The HYBRINFOX method ranked 1st with a macro F1 score of 0.7442 on the evaluation data.
We explain the principles of our hybrid approach, and outline ways in which the method could be improved for other languages besides English.
arXiv Detail & Related papers (2024-07-04T09:29:19Z)
- Mast Kalandar at SemEval-2024 Task 8: On the Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text [7.959800630494841]
SemEval 2024 introduces the task of Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection.
We propose a RoBERTa-BiLSTM-based classifier designed to classify text into two categories: AI-generated or human-written.
Our architecture ranked 46th out of 125 on the official leaderboard with an accuracy of 80.83%.
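The entry above only names the architecture; a minimal sketch of what a RoBERTa-BiLSTM classifier can look like is given below, where the hidden size, pooling strategy, and classification head are assumptions rather than the authors' exact configuration.

```python
import torch
import torch.nn as nn
from transformers import RobertaModel

class RobertaBiLSTMClassifier(nn.Module):
    """RoBERTa encoder followed by a BiLSTM head for binary AI/human classification."""

    def __init__(self, lstm_hidden=256, num_labels=2):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained("roberta-base")
        self.lstm = nn.LSTM(
            input_size=self.encoder.config.hidden_size,  # 768 for roberta-base
            hidden_size=lstm_hidden,
            batch_first=True,
            bidirectional=True,
        )
        self.classifier = nn.Linear(2 * lstm_hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        # Token-level contextual embeddings from RoBERTa: (batch, seq_len, 768)
        hidden = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        lstm_out, _ = self.lstm(hidden)          # (batch, seq_len, 2 * lstm_hidden)
        pooled = lstm_out.mean(dim=1)            # simple mean pooling over tokens
        return self.classifier(pooled)           # logits: AI-generated vs. human
```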
arXiv Detail & Related papers (2024-07-03T10:22:23Z)
- Generative AI Text Classification using Ensemble LLM Approaches [0.12483023446237698]
Large Language Models (LLMs) have shown impressive performance across a variety of AI and natural language processing tasks.
We propose an ensemble neural model that generates probabilities from different pre-trained LLMs.
For the first task of distinguishing between AI-generated and human-written text, our model ranked in fifth and thirteenth place.
arXiv Detail & Related papers (2023-09-14T14:41:46Z)
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
- On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approaches human-like quality, the sample size needed for reliable detection increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors including RoBERTa-Large/Base-Detector and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z)
- DIALOG-22 RuATD Generated Text Detection [0.0]
Detectors that can distinguish text generated by text generation models (TGMs) from human-written text play an important role in preventing abuse of TGMs.
We describe our pipeline for the two DIALOG-22 RuATD tasks: detecting generated text (binary task) and classifying which model was used to generate the text.
arXiv Detail & Related papers (2022-06-16T09:33:26Z)
- TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing [73.16475763422446]
We propose TextFlint, a multilingual robustness evaluation platform for NLP tasks.
It incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis.
TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness.
arXiv Detail & Related papers (2021-03-21T17:20:38Z)
- Explicit Alignment Objectives for Multilingual Bidirectional Encoders [111.65322283420805]
We present AMBER (Aligned Multilingual Bi-directional EncodeR), a new method for learning multilingual encoders.
AMBER is trained on additional parallel data using two explicit alignment objectives that align the multilingual representations at different granularities.
Experimental results show that AMBER obtains gains of up to 1.1 average F1 score on sequence tagging and up to 27.3 average accuracy on retrieval over the XLMR-large model.
arXiv Detail & Related papers (2020-10-15T18:34:13Z)
- Unsupervised Bitext Mining and Translation via Self-trained Contextual Embeddings [51.47607125262885]
We describe an unsupervised method to create pseudo-parallel corpora for machine translation (MT) from unaligned text.
We use multilingual BERT to create source and target sentence embeddings for nearest-neighbor search and adapt the model via self-training.
We validate our technique by extracting parallel sentence pairs on the BUCC 2017 bitext mining task and observe up to a 24.5 point increase (absolute) in F1 scores over previous unsupervised methods.
arXiv Detail & Related papers (2020-10-15T14:04:03Z)