Fugu-MT paper translation (abstract): Assessing Classical Machine Learning and Transformer-based Approaches for Detecting AI-Generated Research Text

Paper summary: Assessing Classical Machine Learning and Transformer-based Approaches for Detecting AI-Generated Research Text

arxiv url: http://arxiv.org/abs/2509.20375v1
Date: Sat, 20 Sep 2025 04:36:21 GMT
Status: translation complete
Last updated in system: 2025-09-26 20:58:12.474736
Title: Assessing Classical Machine Learning and Transformer-based Approaches for Detecting AI-Generated Research Text
Title (reference translation): Evaluating Classical Machine Learning and Transformer-based Approaches for Detecting AI-Generated Research Text
Authors: Sharanya Parimanoharan, Ruwan D. Nawarathna
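Abstract summary: Machine learning approaches can distinguish ChatGPT-3.5-generated text from human-written text. DistilBERT achieves the best overall performance, while Logistic Regression and BERT-Custom offer solid, balanced alternatives.
Reference score (independently computed attention measure): 0.0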
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The rapid adoption of large language models (LLMs) such as ChatGPT has blurred the line between human and AI-generated texts, raising urgent questions about academic integrity, intellectual property, and the spread of misinformation. Thus, reliable AI-text detection is needed for fair assessment to safeguard human authenticity and cultivate trust in digital communication. In this study, we investigate how well current machine learning (ML) approaches can distinguish ChatGPT-3.5-generated texts from human-written texts employing a labeled data set of 250 pairs of abstracts from a wide range of research topics. We test and compare both classical (Logistic Regression armed with classical Bag-of-Words, POS, and TF-IDF features) and transformer-based (BERT augmented with N-grams, DistilBERT, BERT with a lightweight custom classifier, and LSTM-based N-gram models) ML detection techniques. As we aim to assess each model's performance in detecting AI-generated research texts, we also aim to test whether an ensemble of these models can outperform any single detector. Results show DistilBERT achieves the overall best performance, while Logistic Regression and BERT-Custom offer solid, balanced alternatives; LSTM- and BERT-N-gram approaches lag. The max voting ensemble of the three best models fails to surpass DistilBERT itself, highlighting the primacy of a single transformer-based representation over mere model diversity. By comprehensively assessing the strengths and weaknesses of these AI-text detection approaches, this work lays a foundation for more robust transformer frameworks with larger, richer datasets to keep pace with ever-improving generative AI models.
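Abstract (reference translation): The rapid adoption of large language models (LLMs) such as ChatGPT has blurred the boundary between human-written and AI-generated text, raising urgent questions about academic integrity, intellectual property, and the spread of misinformation. Reliable AI-text detection is therefore needed for fair assessment, to protect human authenticity and foster trust in digital communication. This study examines how well current machine learning (ML) approaches can distinguish ChatGPT-3.5-generated text from human-written text, using a labeled dataset of 250 pairs of abstracts drawn from a wide range of research topics. We compare classical ML detection techniques (Logistic Regression with Bag-of-Words, POS, and TF-IDF features) with transformer-based ones (BERT augmented with N-grams, DistilBERT, BERT with a lightweight custom classifier, and LSTM-based N-gram models). In addition to assessing each model's performance in detecting AI-generated research text, we test whether an ensemble of these models can outperform any single detector. The results show that DistilBERT achieves the best overall performance, Logistic Regression and BERT-Custom offer solid, balanced alternatives, and the LSTM and BERT-N-gram approaches lag behind. The max voting ensemble of the three best models does not surpass DistilBERT alone, which highlights the primacy of a single transformer-based representation over mere model diversity. By comprehensively assessing the strengths and weaknesses of these AI-text detection approaches, this work lays a foundation for more robust transformer frameworks, built on larger and richer datasets, that can keep pace with continually improving generative AI models.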
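
The max voting ensemble mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `max_vote`, the label encoding (1 = AI-generated, 0 = human-written), and the example predictions are all assumptions made for illustration, and the abstract does not say how ties are handled.

```python
from collections import Counter


def max_vote(predictions):
    """Combine per-model label lists by majority (max) voting.

    predictions: list of lists, one inner list of labels per model,
    all of the same length (one label per sample).
    Ties are resolved in favor of the label seen first, which is an
    arbitrary choice made for this sketch.
    """
    if not predictions:
        raise ValueError("need at least one model's predictions")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must label the same number of samples")

    voted = []
    # zip(*predictions) groups the labels each model gave to one sample
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical predictions from three detectors on four abstracts
# (1 = AI-generated, 0 = human-written); these are not the paper's results.
distilbert_preds = [1, 0, 1, 1]
logreg_preds = [1, 0, 0, 1]
bert_custom_preds = [0, 0, 1, 1]

ensemble_preds = max_vote([distilbert_preds, logreg_preds, bert_custom_preds])
print(ensemble_preds)
```

With three binary detectors there is always a strict majority, so the tie rule only matters when the number of models is even or there are more than two labels.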
