ChatGPT vs Human-authored Text: Insights into Controllable Text
Summarization and Sentence Style Transfer
- URL: http://arxiv.org/abs/2306.07799v1
- Date: Tue, 13 Jun 2023 14:21:35 GMT
- Title: ChatGPT vs Human-authored Text: Insights into Controllable Text
Summarization and Sentence Style Transfer
- Authors: Dongqi Pu, Vera Demberg
- Abstract summary: We conduct a systematic inspection of ChatGPT's performance in two controllable generation tasks.
We evaluate the faithfulness of the generated text, and compare the model's performance with human-authored texts.
We observe that ChatGPT sometimes incorporates factual errors or hallucinations when adapting the text to suit a specific style.
- Score: 8.64514166615844
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale language models, like ChatGPT, have garnered significant media
attention and stunned the public with their remarkable capacity for generating
coherent text from short natural language prompts. In this paper, we aim to
conduct a systematic inspection of ChatGPT's performance in two controllable
generation tasks, with respect to ChatGPT's ability to adapt its output to
different target audiences (expert vs. layman) and writing styles (formal vs.
informal). Additionally, we evaluate the faithfulness of the generated text,
and compare the model's performance with human-authored texts. Our findings
indicate that the stylistic variations produced by humans are considerably
larger than those demonstrated by ChatGPT, and the generated texts diverge from
human samples in several characteristics, such as the distribution of word
types. Moreover, we observe that ChatGPT sometimes incorporates factual errors
or hallucinations when adapting the text to suit a specific style.
Related papers
- DEMASQ: Unmasking the ChatGPT Wordsmith [63.8746084667206]
We propose an effective ChatGPT detector named DEMASQ, which accurately identifies ChatGPT-generated content.
Our method addresses two critical factors: (i) the distinct biases in text composition observed in human- and machine-generated content and (ii) the alterations made by humans to evade previous detection methods.
arXiv Detail & Related papers (2023-11-08T21:13:05Z) - Detecting ChatGPT: A Survey of the State of Detecting ChatGPT-Generated
Text [1.9643748953805937]
generative language models can potentially deceive by generating artificial text that appears to be human-generated.
This survey provides an overview of the current approaches employed to differentiate between texts generated by humans and ChatGPT.
arXiv Detail & Related papers (2023-09-14T13:05:20Z) - Is ChatGPT Involved in Texts? Measure the Polish Ratio to Detect
ChatGPT-Generated Text [48.36706154871577]
We introduce a novel dataset termed HPPT (ChatGPT-polished academic abstracts)
It diverges from extant corpora by comprising pairs of human-written and ChatGPT-polished abstracts instead of purely ChatGPT-generated texts.
We also propose the "Polish Ratio" method, an innovative measure of the degree of modification made by ChatGPT compared to the original human-written text.
arXiv Detail & Related papers (2023-07-21T06:38:37Z) - GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content [27.901155229342375]
We present a novel approach for detecting ChatGPT-generated vs. human-written text using language models.
Our models achieved remarkable results, with an accuracy of over 97% on the test dataset, as evaluated through various metrics.
arXiv Detail & Related papers (2023-05-13T17:12:11Z) - AI, write an essay for me: A large-scale comparison of human-written
versus ChatGPT-generated essays [66.36541161082856]
ChatGPT and similar generative AI models have attracted hundreds of millions of users.
This study compares human-written versus ChatGPT-generated argumentative student essays.
arXiv Detail & Related papers (2023-04-24T12:58:28Z) - ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large
Language Models in Multilingual Learning [70.57126720079971]
Large language models (LLMs) have emerged as the most important breakthroughs in natural language processing (NLP)
This paper evaluates ChatGPT on 7 different tasks, covering 37 diverse languages with high, medium, low, and extremely low resources.
Compared to the performance of previous models, our extensive experimental results demonstrate a worse performance of ChatGPT for different NLP tasks and languages.
arXiv Detail & Related papers (2023-04-12T05:08:52Z) - To ChatGPT, or not to ChatGPT: That is the question! [78.407861566006]
This study provides a comprehensive and contemporary assessment of the most recent techniques in ChatGPT detection.
We have curated a benchmark dataset consisting of prompts from ChatGPT and humans, including diverse questions from medical, open Q&A, and finance domains.
Our evaluation results demonstrate that none of the existing methods can effectively detect ChatGPT-generated content.
arXiv Detail & Related papers (2023-04-04T03:04:28Z) - Comparing Abstractive Summaries Generated by ChatGPT to Real Summaries
Through Blinded Reviewers and Text Classification Algorithms [0.8339831319589133]
ChatGPT, developed by OpenAI, is a recent addition to the family of language models.
We evaluate the performance of ChatGPT on Abstractive Summarization by the means of automated metrics and blinded human reviewers.
arXiv Detail & Related papers (2023-03-30T18:28:33Z) - Can ChatGPT Understand Too? A Comparative Study on ChatGPT and
Fine-tuned BERT [103.57103957631067]
ChatGPT has attracted great attention, as it can generate fluent and high-quality responses to human inquiries.
We evaluate ChatGPT's understanding ability by evaluating it on the most popular GLUE benchmark, and comparing it with 4 representative fine-tuned BERT-style models.
We find that: 1) ChatGPT falls short in handling paraphrase and similarity tasks; 2) ChatGPT outperforms all BERT models on inference tasks by a large margin; 3) ChatGPT achieves comparable performance compared with BERT on sentiment analysis and question answering tasks.
arXiv Detail & Related papers (2023-02-19T12:29:33Z) - ChatGPT or Human? Detect and Explain. Explaining Decisions of Machine
Learning Model for Detecting Short ChatGPT-generated Text [2.0378492681344493]
We study whether a machine learning model can be effectively trained to accurately distinguish between original human and seemingly human (that is, ChatGPT-generated) text.
We employ an explainable artificial intelligence framework to gain insight into the reasoning behind the model trained to differentiate between ChatGPT-generated and human-generated text.
Our study focuses on short online reviews, conducting two experiments comparing human-generated and ChatGPT-generated text.
arXiv Detail & Related papers (2023-01-30T08:06:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.