Can professional translators identify machine-generated text?
- URL: http://arxiv.org/abs/2601.15828v2
- Date: Tue, 27 Jan 2026 07:23:21 GMT
- Title: Can professional translators identify machine-generated text?
- Authors: Michael Farrell
- Abstract summary: This study investigates whether professional translators can reliably identify short stories generated in Italian by artificial intelligence (AI) without prior specialized training. Sixty-nine translators took part in an in-person experiment, where they assessed three anonymized short stories. Low burstiness and narrative contradiction emerged as the most reliable indicators of synthetic authorship.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This study investigates whether professional translators can reliably identify short stories generated in Italian by artificial intelligence (AI) without prior specialized training. Sixty-nine translators took part in an in-person experiment, where they assessed three anonymized short stories - two written by ChatGPT-4o and one by a human author. For each story, participants rated the likelihood of AI authorship and provided justifications for their choices. While average results were inconclusive, a statistically significant subset (16.2%) successfully distinguished the synthetic texts from the human text, suggesting that their judgements were informed by analytical skill rather than chance. However, a nearly equal number misclassified the texts in the opposite direction, often relying on subjective impressions rather than objective markers, possibly reflecting a reader preference for AI-generated texts. Low burstiness and narrative contradiction emerged as the most reliable indicators of synthetic authorship, with unexpected calques, semantic loans and syntactic transfer from English also reported. In contrast, features such as grammatical accuracy and emotional tone frequently led to misclassification. These findings raise questions about the role and scope of synthetic-text editing in professional contexts.
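Burstiness, the strongest cue reported above, is measurable. The paper does not give a formula, but a common proxy is the variation in sentence length across a text; the sketch below uses the coefficient of variation, with an illustrative sentence splitter and made-up example strings.

```python
# Minimal sketch: burstiness as sentence-length variation.
# The paper does not define a formula; this proxy (coefficient of
# variation of sentence lengths) is an illustrative assumption.
import re
import statistics

def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths (in words).

    Human prose tends to mix long and short sentences (high value);
    LLM output is often more uniformly paced (low value).
    """
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

human = "It rained. The road home, slick and black, seemed endless that night. We walked."
synthetic = "The sun rose over the hills. The birds sang in the trees. The town woke slowly."
print(burstiness(human), burstiness(synthetic))  # the human text scores higher
```

Uniformly paced prose scores low on this proxy, matching the intuition that machine-generated narration tends toward an even sentence rhythm.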
Related papers
- Do readers prefer AI-generated Italian short stories? [0.0]
This study investigates whether readers prefer AI-generated short stories in Italian over one written by a renowned Italian author. In a blind setup, 20 participants read and evaluated three stories, two created with ChatGPT-4o and one by Alberto Moravia. The results showed that the AI-written texts received slightly higher average ratings and were more frequently preferred, although differences were modest.
arXiv Detail & Related papers (2026-01-24T08:15:13Z) - COIG-Writer: A High-Quality Dataset for Chinese Creative Writing with Thought Processes [83.84578306665976]
Large language models exhibit systematic deficiencies in creative writing, particularly in non-English contexts. We present COIG-Writer, a novel Chinese creative writing dataset that captures both diverse outputs and their underlying thought processes.
arXiv Detail & Related papers (2025-10-16T15:01:19Z) - Liaozhai through the Looking-Glass: On Paratextual Explicitation of Culture-Bound Terms in Machine Translation [70.43884512651668]
We formalize Genette's (1987) theory of paratexts from literary and translation studies to introduce the task of paratextual explicitation for machine translation. We construct a dataset of 560 expert-aligned paratexts from four English translations of the classical Chinese short story collection Liaozhai. Our findings demonstrate the potential of paratextual explicitation in advancing machine translation beyond linguistic equivalence.
arXiv Detail & Related papers (2025-09-27T16:27:36Z) - The Reader is the Metric: How Textual Features and Reader Profiles Explain Conflicting Evaluations of AI Creative Writing [1.3654846342364306]
We use five public datasets (1,471 stories, 101 annotators including critics, students, and lay readers) to extract 17 reference-less textual features. We model individual reader preferences, deriving feature importance vectors that reflect their textual priorities. Our results quantitatively explain how measurements of literary quality are a function of how text features align with each reader's preferences.
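One way to read "feature importance vectors" concretely: fit a simple per-reader model of ratings as a function of the textual features and take its coefficients as that reader's priorities. This is a minimal sketch under that assumption; the feature set, data, and model below are illustrative, not the paper's.

```python
# Minimal sketch: a reader's "feature importance vector" as the
# coefficients of a least-squares fit of their story ratings on
# reference-less textual features. Features, data, and model are
# illustrative assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Rows = stories, columns = textual features (e.g. lexical diversity,
# mean sentence length, sentiment, ...). Synthetic data for the demo.
X = rng.normal(size=(40, 5))
true_prefs = np.array([1.5, -0.5, 0.0, 2.0, -1.0])   # this reader's hidden priorities
ratings = X @ true_prefs + rng.normal(scale=0.3, size=40)

# The fitted coefficients serve as the reader's importance vector.
importance, *_ = np.linalg.lstsq(X, ratings, rcond=None)
print(np.round(importance, 2))  # close to true_prefs
```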
arXiv Detail & Related papers (2025-06-03T18:50:22Z) - Humans can learn to detect AI-generated texts, or at least learn when they can't [0.0]
This study investigates whether individuals can learn to accurately discriminate between human-written and AI-produced texts when provided with immediate feedback. We used GPT-4o to generate several hundred texts across various genres and text types. We presented randomized text pairs to 254 Czech native speakers who identified which text was human-written and which was AI-generated.
arXiv Detail & Related papers (2025-05-03T17:42:49Z) - Can postgraduate translation students identify machine-generated text? [0.0]
This study explores the ability of linguistically trained individuals to discern machine-generated output from human-written text. Twenty-three postgraduate translation students analysed excerpts of Italian prose and assigned likelihood scores to indicate whether they believed they were human-written or AI-generated.
arXiv Detail & Related papers (2025-04-12T09:58:09Z) - Do LLMs Understand Your Translations? Evaluating Paragraph-level MT with Question Answering [68.3400058037817]
We introduce TREQA (Translation Evaluation via Question-Answering), a framework that extrinsically evaluates translation quality by checking how accurately candidate translations answer reading comprehension questions about the source or reference text. We show that TREQA is competitive with and, in some cases, outperforms state-of-the-art neural and LLM-based metrics in ranking alternative paragraph-level translations.
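A minimal sketch of the QA-based idea: questions are written against the source or reference, and a candidate translation is scored by how many it can answer. The toy answerer and matching rule below are illustrative stand-ins for the LLM components TREQA actually uses.

```python
# Hedged sketch of a TREQA-style loop: judge a candidate translation by
# how many reference-grounded QA pairs it can answer. `answer_from` is a
# toy stand-in for an LLM answerer; the matching logic is illustrative.
def answer_from(question: str, context: str) -> str:
    """Toy answerer: return the context sentence sharing most words with the question."""
    sentences = [s.strip() for s in context.split(".") if s.strip()]
    q_words = set(question.lower().split())
    return max(sentences, key=lambda s: len(q_words & set(s.lower().split())), default="")

def qa_score(candidate: str, qa_pairs: list[tuple[str, str]]) -> float:
    """Fraction of QA pairs whose gold answer appears in the candidate's answer."""
    if not qa_pairs:
        return 0.0
    hits = sum(gold.lower() in answer_from(q, candidate).lower() for q, gold in qa_pairs)
    return hits / len(qa_pairs)

qa = [("Who found the letter?", "the gardener"), ("Where was it hidden?", "under the stairs")]
good = "The gardener found the letter. It had been hidden under the stairs for years."
weak = "A letter was discovered. It had been kept somewhere in the house."
print(qa_score(good, qa), qa_score(weak, qa))  # the better translation answers more questions
```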
arXiv Detail & Related papers (2025-04-10T09:24:54Z) - ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability [62.285407189502216]
Incorrect decisions when detecting texts generated by Large Language Models (LLMs) can cause grave mistakes. We introduce ExaGPT, an interpretable detection approach grounded in the human decision-making process. We show that ExaGPT massively outperforms prior powerful detectors by up to +40.9 points of accuracy at a false positive rate of 1%.
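In the spirit of example-based detection, a verdict can be backed by concrete evidence: label a text by which labeled corpus (human or LLM) shares the most similar spans with it. The n-gram overlap measure and span size below are illustrative, not ExaGPT's actual similarity computation.

```python
# Illustrative example-based detector: classify a text by n-gram
# overlap with labeled human and LLM corpora, so each verdict can cite
# the overlapping examples. Not ExaGPT's actual algorithm.
def ngrams(text: str, n: int = 3) -> set[tuple[str, ...]]:
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def classify(text: str, human_docs: list[str], llm_docs: list[str]) -> str:
    """Vote by shared n-grams with each corpus; ties go to 'human'."""
    target = ngrams(text)
    human_hits = sum(len(target & ngrams(d)) for d in human_docs)
    llm_hits = sum(len(target & ngrams(d)) for d in llm_docs)
    return "llm" if llm_hits > human_hits else "human"

humans = ["the rain came sideways off the harbour that morning"]
llms = ["the sun rose gently over the quiet little town"]
print(classify("the sun rose gently over the hills", humans, llms))  # -> llm
```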
arXiv Detail & Related papers (2025-02-17T01:15:07Z) - People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text [37.36534911201806]
We hire annotators to read 300 non-fiction English articles and label them as either human-written or AI-generated. Experiments show that annotators who frequently use LLMs for writing tasks excel at detecting AI-generated text. We release our annotated dataset and code to spur future research into both human and automated detection of AI-generated text.
arXiv Detail & Related papers (2025-01-26T19:31:34Z) - Textual Entailment Recognition with Semantic Features from Empirical Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector feature, sketched below, that captures the semantic entailment relationship between text and hypothesis.
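A minimal sketch of that feature, assuming text and hypothesis are first encoded as fixed-size vectors by some sentence encoder (the vectors below are made up):

```python
# Minimal sketch of the element-wise Manhattan distance feature between
# a text embedding and a hypothesis embedding. Vectors are made up; any
# sentence encoder could supply them.
import numpy as np

def manhattan_feature(text_vec: np.ndarray, hyp_vec: np.ndarray) -> np.ndarray:
    """|t - h| per dimension: a feature vector an entailment classifier can consume."""
    return np.abs(text_vec - hyp_vec)

text_vec = np.array([0.2, -0.5, 0.9, 0.1])   # embedding of the text
hyp_vec = np.array([0.1, -0.4, 0.7, 0.3])    # embedding of the hypothesis
print(manhattan_feature(text_vec, hyp_vec))  # [0.1 0.1 0.2 0.2]
```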
arXiv Detail & Related papers (2022-10-18T10:03:51Z) - Curious Case of Language Generation Evaluation Metrics: A Cautionary Tale [52.663117551150954]
A few popular metrics remain the de facto standard for evaluating tasks such as image captioning and machine translation.
This is partly due to ease of use, and partly because researchers expect to see them and know how to interpret them.
In this paper, we urge the community to consider more carefully how they automatically evaluate their models.
arXiv Detail & Related papers (2020-10-26T13:57:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.