Stacking the Odds: Transformer-Based Ensemble for AI-Generated Text Detection
- URL: http://arxiv.org/abs/2310.18906v1
- Date: Sun, 29 Oct 2023 05:28:44 GMT
- Title: Stacking the Odds: Transformer-Based Ensemble for AI-Generated Text Detection
- Authors: Duke Nguyen, Khaing Myat Noe Naing, Aditya Joshi
- Abstract summary: We use a stacking ensemble of Transformers for the task of AI-generated text detection.
We show that ensembling the models results in an improved accuracy in comparison with using them individually.
- Score: 3.2047868962340327
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper reports our submission under the team name 'SynthDetectives' to
the ALTA 2023 Shared Task. We use a stacking ensemble of Transformers for the
task of AI-generated text detection. Our approach is novel in terms of its
choice of models in that we use accessible and lightweight models in the
ensemble. We show that ensembling the models results in an improved accuracy in
comparison with using them individually. Our approach achieves an accuracy
score of 0.9555 on the official test data provided by the shared task
organisers.
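To illustrate the stacking idea described in the abstract, here is a minimal sketch in plain Python. It is not the authors' implementation: the base models are simulated as noisy probability scores rather than fine-tuned Transformers, the logistic-regression meta-learner is an assumption, and every name (`simulate_base_probs`, `fit_meta_learner`, `noise_levels`) is hypothetical. The point is only to show how the outputs of several base classifiers become the input features of a second-stage classifier.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_base_probs(label, noise_levels, rng):
    """Stand-in for base-model outputs: one noisy P(AI-generated) per model.

    In a real system these would be held-out predictions from fine-tuned
    Transformer classifiers; here they are synthetic (assumption).
    """
    centre = 0.8 if label == 1 else 0.2
    return [min(1.0, max(0.0, centre + rng.gauss(0.0, s))) for s in noise_levels]


def make_split(n, noise_levels, rng):
    labels = [rng.randint(0, 1) for _ in range(n)]
    features = [simulate_base_probs(y, noise_levels, rng) for y in labels]
    return features, labels


def fit_meta_learner(features, labels, lr=0.5, epochs=300):
    """Logistic regression over base-model probabilities, batch gradient descent."""
    n, d = len(features), len(features[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_label(weights, bias, x):
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


def accuracy(preds, labels):
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


rng = random.Random(0)
noise_levels = [0.20, 0.25, 0.30]  # three simulated base models of varying quality

# The meta-learner is fit on predictions for data the base models did not train on.
meta_X, meta_y = make_split(1000, noise_levels, rng)
test_X, test_y = make_split(400, noise_levels, rng)

weights, bias = fit_meta_learner(meta_X, meta_y)
stacked_acc = accuracy([predict_label(weights, bias, x) for x in test_X], test_y)
single_accs = [
    accuracy([1 if x[j] >= 0.5 else 0 for x in test_X], test_y)
    for j in range(len(noise_levels))
]

print("individual base-model accuracy:", single_accs)
print("stacked ensemble accuracy:", stacked_acc)
```

In this toy setting the stacked classifier can be compared directly against each simulated base model thresholded at 0.5. The numbers it prints come from synthetic data and say nothing about the 0.9555 accuracy reported for the shared task.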
Related papers
- LuxVeri at GenAI Detection Task 1: Inverse Perplexity Weighted Ensemble for Robust Detection of AI-Generated Text across English and Multilingual Contexts [0.8495482945981923]
This paper presents a system developed for Task 1 of the COLING 2025 Workshop on Detecting AI-Generated Content.
Our approach utilizes an ensemble of models, with weights assigned according to each model's inverse perplexity, to enhance classification accuracy.
Our results demonstrate the effectiveness of inverse perplexity weighting in improving the robustness of machine-generated text detection across both monolingual and multilingual settings.
arXiv Detail & Related papers (2025-01-21T06:32:32Z)
- Multi-Agent Sampling: Scaling Inference Compute for Data Synthesis with Tree Search-Based Agentic Collaboration [81.45763823762682]
This work aims to bridge the gap by investigating the problem of data synthesis through multi-agent sampling.
We introduce Tree Search-based Orchestrated Agents (TOA), where the workflow evolves iteratively during the sequential sampling process.
Our experiments on alignment, machine translation, and mathematical reasoning demonstrate that multi-agent sampling significantly outperforms single-agent sampling as inference compute scales.
arXiv Detail & Related papers (2024-12-22T15:16:44Z)
- AISPACE at SemEval-2024 task 8: A Class-balanced Soft-voting System for Detecting Multi-generator Machine-generated Text [0.0]
SemEval-2024 Task 8 provides a challenge to detect human-written and machine-generated text.
This paper proposes a system that mainly deals with Subtask B.
It aims to detect whether a given full text is written by a human or generated by a specific Large Language Model (LLM), which is effectively a multi-class text classification task.
arXiv Detail & Related papers (2024-04-01T06:25:47Z)
- Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search [60.626459715780605]
Given a descriptive text query, text-based person search aims to retrieve the best-matched target person from an image gallery.
Such a cross-modal retrieval task is quite challenging due to a significant modality gap, fine-grained differences, and insufficient annotated data.
In this paper, we propose a simple yet effective dual Transformer model for text-based person search.
arXiv Detail & Related papers (2023-11-15T16:26:49Z)
- Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations [63.04466647849211]
Methods typically encode task information with a simple dataset name as a prefix to the encoder.
We propose compositional task configurations, a set of prompts prepended to the encoder to improve cross-task generalization.
We show this not only allows the model to better learn shared knowledge across different tasks at training, but also allows us to control the model by composing new configurations.
arXiv Detail & Related papers (2022-12-17T02:20:14Z)
- Detecting Generated Scientific Papers using an Ensemble of Transformer Models [4.56877715768796]
The paper describes neural models developed for the DAGPap22 shared task hosted at the Third Workshop on Scholarly Document Processing.
Our work focuses on comparing different transformer-based models as well as using additional datasets and techniques to deal with imbalanced classes.
arXiv Detail & Related papers (2022-09-17T08:43:25Z)
- Self-Supervised Object Detection via Generative Image Synthesis [106.65384648377349]
We present the first end-to-end analysis-by-synthesis framework with controllable GANs for the task of self-supervised object detection.
We use collections of real world images without bounding box annotations to learn to synthesize and detect objects.
Our work advances the field of self-supervised object detection by introducing a successful new paradigm of using controllable GAN-based image synthesis for it.
arXiv Detail & Related papers (2021-10-19T11:04:05Z)
- Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks [86.10875837475783]
Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions.
Existing neural models have been shown to lack this basic ability in learning symbolic structures.
We propose two auxiliary sequence prediction tasks that track the progress of function and argument semantics.
arXiv Detail & Related papers (2021-09-30T16:41:19Z)
- Automated Concatenation of Embeddings for Structured Prediction [75.44925576268052]
We propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks.
We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model.
arXiv Detail & Related papers (2020-10-10T14:03:20Z)
- Gradient-Based Adversarial Training on Transformer Networks for Detecting Check-Worthy Factual Claims [3.7543966923106438]
We introduce the first adversarially-regularized, transformer-based claim spotter model.
We obtain a 4.70 point F1-score improvement over current state-of-the-art models.
We propose a method to apply adversarial training to transformer models.
arXiv Detail & Related papers (2020-02-18T16:51:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.