Do androids dream of fictional references? A bibliographic dialogue with
ChatGPT3.5
- URL: http://arxiv.org/abs/2312.00789v1
- Date: Mon, 4 Sep 2023 08:11:59 GMT
- Title: Do androids dream of fictional references? A bibliographic dialogue with
ChatGPT3.5
- Authors: Olivier Las Vergnas (AFA, CIREL)
- Abstract summary: This article focuses on references generated by the ChatGPT3.5 tool.
We explored six different themes and analyzed a sample of references generated by the model, in French and English.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This article focuses on bibliographic references generated by the ChatGPT3.5
tool. Using this tool, which is built on the trained GPT generative model
developed by the company OpenAI, we explored six different themes and analyzed
a sample of references generated by the model, in French and English. The
results revealed high percentages of fictitious references in several fields,
underlining the importance of carefully checking these references before using
them in research work. An improvement in results was nevertheless noted between
May and July for English references on themes on which ChatGPT3.5 has been
particularly trained, but the situation remains unsatisfactory in French, for
example. It should also be pointed out that much of the text in
this article was generated by ChatGPT in a joint effort with the human author.
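As an illustration of the kind of reference checking the abstract recommends, the sketch below queries the public Crossref REST API (https://api.crossref.org) to see whether a generated citation can be matched to a real indexed work. This is a minimal sketch and not the verification protocol used in the article; the `lookup_reference` function and the sample citation string are hypothetical names introduced here, and the absence of a Crossref match does not by itself prove a reference is fictitious.

```python
import json
import urllib.parse
import urllib.request


def lookup_reference(citation: str, rows: int = 3):
    """Query Crossref for works matching a free-text citation string.

    Returns a list of (title, DOI) candidates. An empty list is a hint that
    the reference may be fictitious, though it is not conclusive proof.
    """
    query = urllib.parse.quote(citation)
    url = (
        "https://api.crossref.org/works"
        f"?query.bibliographic={query}&rows={rows}"
    )
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    items = data.get("message", {}).get("items", [])
    # Crossref stores titles as lists; guard against missing fields.
    return [((item.get("title") or [""])[0], item.get("DOI", ""))
            for item in items]


if __name__ == "__main__":
    # Hypothetical ChatGPT-generated reference, for illustration only.
    ref = "Author, A. (2020). An example title on science education."
    for title, doi in lookup_reference(ref):
        print(f"{title} -> https://doi.org/{doi}")
```

In practice, each candidate returned by Crossref would still need a manual comparison of authors, year, and venue against the generated reference before it is accepted as genuine.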
Related papers
- Is ChatGPT the Future of Causal Text Mining? A Comprehensive Evaluation
and Analysis [8.031131164056347]
This study conducts comprehensive evaluations of ChatGPT's causal text mining capabilities.
We introduce a benchmark that extends beyond general English datasets.
We also provide an evaluation framework to ensure fair comparisons between ChatGPT and previous approaches.
arXiv Detail & Related papers (2024-02-22T12:19:04Z)
- What has ChatGPT read? The origins of archaeological citations used by a
generative artificial intelligence application [0.0]
This paper tested what archaeological literature appears to have been included in ChatGPT's training phase.
While ChatGPT offered seemingly pertinent references, a large percentage proved to be fictitious.
It can be shown that all references provided by ChatGPT that were found to be genuine have also been cited on Wikipedia pages.
arXiv Detail & Related papers (2023-08-07T05:06:35Z)
- Is ChatGPT Involved in Texts? Measure the Polish Ratio to Detect
ChatGPT-Generated Text [48.36706154871577]
We introduce a novel dataset termed HPPT (ChatGPT-polished academic abstracts).
It diverges from extant corpora by comprising pairs of human-written and ChatGPT-polished abstracts instead of purely ChatGPT-generated texts.
We also propose the "Polish Ratio" method, an innovative measure of the degree of modification made by ChatGPT compared to the original human-written text.
arXiv Detail & Related papers (2023-07-21T06:38:37Z)
- Taqyim: Evaluating Arabic NLP Tasks Using ChatGPT Models [6.145834902689888]
Large language models (LLMs) have demonstrated impressive performance on various downstream tasks without requiring fine-tuning.
Although other languages account for a smaller proportion of the training data than English, these models also exhibit remarkable capabilities in those languages.
In this study, we assess the performance of GPT-3.5 and GPT-4 models on seven distinct Arabic NLP tasks.
arXiv Detail & Related papers (2023-06-28T15:54:29Z)
- Is ChatGPT A Good Keyphrase Generator? A Preliminary Study [51.863368917344864]
ChatGPT has recently garnered significant attention from the computational linguistics community.
We evaluate its performance in various aspects, including keyphrase generation prompts, keyphrase generation diversity, and long document understanding.
We find that ChatGPT performs exceptionally well on all six candidate prompts, with minor performance differences observed across the datasets.
arXiv Detail & Related papers (2023-03-23T02:50:38Z)
- Is ChatGPT a Good NLG Evaluator? A Preliminary Study [121.77986688862302]
We provide a preliminary meta-evaluation on ChatGPT to show its reliability as an NLG metric.
Experimental results show that compared with previous automatic metrics, ChatGPT achieves state-of-the-art or competitive correlation with human judgments.
We hope our preliminary study could prompt the emergence of a general-purpose reliable NLG metric.
arXiv Detail & Related papers (2023-03-07T16:57:20Z)
- Is ChatGPT A Good Translator? Yes With GPT-4 As The Engine [97.8609714773255]
We evaluate ChatGPT for machine translation, including translation prompt, multilingual translation, and translation robustness.
ChatGPT performs competitively with commercial translation products but lags behind significantly on low-resource or distant languages.
With the launch of the GPT-4 engine, the translation performance of ChatGPT is significantly boosted.
arXiv Detail & Related papers (2023-01-20T08:51:36Z)
- Elaboration-Generating Commonsense Question Answering at Scale [77.96137534751445]
In question answering requiring common sense, language models (e.g., GPT-3) have been used to generate text expressing background knowledge.
We finetune smaller language models to generate useful intermediate context, referred to here as elaborations.
Our framework alternates between updating two language models -- an elaboration generator and an answer predictor -- allowing each to influence the other.
arXiv Detail & Related papers (2022-09-02T18:32:09Z)
- CoAuthor: Designing a Human-AI Collaborative Writing Dataset for
Exploring Language Model Capabilities [92.79451009324268]
We present CoAuthor, a dataset designed for revealing GPT-3's capabilities in assisting creative and argumentative writing.
We demonstrate that CoAuthor can address questions about GPT-3's language, ideation, and collaboration capabilities.
We discuss how this work may facilitate a more principled discussion around LMs' promises and pitfalls in relation to interaction design.
arXiv Detail & Related papers (2022-01-18T07:51:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.