How many words does ChatGPT know? The answer is ChatWords
- URL: http://arxiv.org/abs/2309.16777v1
- Date: Thu, 28 Sep 2023 18:13:02 GMT
- Title: How many words does ChatGPT know? The answer is ChatWords
- Authors: Gonzalo Martínez, Javier Conde, Pedro Reviriego, Elena
Merino-Gómez, José Alberto Hernández, Fabrizio Lombardi
- Abstract summary: Evaluating the performance of ChatGPT and similar AI tools is a complex issue that is being explored from different perspectives.
We contribute to those efforts with ChatWords, an automated test system to evaluate ChatGPT knowledge of an arbitrary set of words.
Results show that ChatGPT is only able to recognize approximately 80% of the words in the dictionary and 90% of the words in the Quixote, in some cases with an incorrect meaning.
- Score: 5.906689377130112
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The introduction of ChatGPT has put Artificial Intelligence (AI) Natural
Language Processing (NLP) in the spotlight. ChatGPT adoption has been
exponential with millions of users experimenting with it in a myriad of tasks
and application domains with impressive results. However, ChatGPT has
limitations and suffers from hallucinations, for example producing answers that
look plausible but are completely wrong. Evaluating the performance of ChatGPT
and similar AI tools is a complex issue that is being explored from different
perspectives. In this work, we contribute to those efforts with ChatWords, an
automated test system, to evaluate ChatGPT knowledge of an arbitrary set of
words. ChatWords is designed to be extensible, easy to use, and adaptable to
evaluating other NLP AI tools as well. ChatWords is publicly available and its main
goal is to facilitate research on the lexical knowledge of AI tools. The
benefits of ChatWords are illustrated with two case studies: evaluating the
knowledge that ChatGPT has of the Spanish lexicon (taken from the official
dictionary of the "Real Academia Española") and of the words that appear in
the Quixote, the well-known novel written by Miguel de Cervantes. The results
show that ChatGPT is only able to recognize approximately 80% of the words in
the dictionary and 90% of the words in the Quixote, in some cases with an
incorrect meaning. The implications of the lexical knowledge of NLP AI tools
and potential applications of ChatWords are also discussed, providing directions
for further work on the study of the lexical knowledge of AI tools.
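The kind of measurement the abstract reports (the share of a word list that a tool recognizes) can be illustrated with a minimal sketch. This is not the ChatWords implementation: the function `recognition_rate`, the stand-in `toy_knows_word`, and the toy vocabulary are all hypothetical, and a real evaluation would replace the stand-in with a query to the model under test.

```python
from typing import Callable, Iterable


def recognition_rate(words: Iterable[str], knows_word: Callable[[str], bool]) -> float:
    """Return the fraction of words for which knows_word(word) is True."""
    total = 0
    recognized = 0
    for word in words:
        total += 1
        if knows_word(word):
            recognized += 1
    # An empty word list yields 0.0 rather than a division error.
    return recognized / total if total else 0.0


# Hypothetical stand-in for a real model query: a fixed vocabulary lookup.
toy_vocabulary = {"casa", "libro", "molino", "caballero"}


def toy_knows_word(word: str) -> bool:
    return word.lower() in toy_vocabulary


sample = ["casa", "libro", "molino", "caballero", "adarga"]
rate = recognition_rate(sample, toy_knows_word)
print(f"Recognized {rate:.0%} of {len(sample)} words")
```

Passing the recognizer in as a callable keeps the counting logic independent of any particular tool, which is one plausible way to support the extensibility the abstract describes.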
Related papers
- Exploring ChatGPT's Capabilities on Vulnerability Management [56.4403395100589]
We explore ChatGPT's capabilities on 6 tasks involving the complete vulnerability management process with a large-scale dataset containing 70,346 samples.
One notable example is ChatGPT's proficiency in tasks like generating titles for software bug reports.
Our findings reveal the difficulties encountered by ChatGPT and shed light on promising future directions.
arXiv Detail & Related papers (2023-11-11T11:01:13Z)
- Chatbot-supported Thesis Writing: An Autoethnographic Report [0.0]
ChatGPT might be applied to formats that require learners to generate text, such as bachelor theses or student research papers.
ChatGPT is to be valued as a beneficial tool in thesis writing.
However, writing a conclusive thesis still requires the learner's meaningful engagement.
arXiv Detail & Related papers (2023-10-14T09:09:26Z)
- Playing with Words: Comparing the Vocabulary and Lexical Richness of ChatGPT and Humans [3.0059120458540383]
Generative language models such as ChatGPT have triggered a revolution that can transform how text is generated.
Will the use of tools such as ChatGPT increase or reduce the vocabulary used or the lexical richness?
This has implications for words, as those not included in AI-generated content will tend to be less and less popular and may eventually be lost.
arXiv Detail & Related papers (2023-08-14T21:19:44Z)
- ChatGPT: A Study on its Utility for Ubiquitous Software Engineering Tasks [2.084078990567849]
ChatGPT (Chat Generative Pre-trained Transformer) was launched by OpenAI on November 30, 2022.
In this study, we explore how ChatGPT can be used to help with common software engineering tasks.
arXiv Detail & Related papers (2023-05-26T11:29:06Z)
- Can ChatGPT Pass An Introductory Level Functional Language Programming Course? [2.3456295046913405]
This paper aims to explore how well ChatGPT can perform in an introductory-level functional language programming course.
Our comprehensive evaluation provides valuable insights into ChatGPT's impact from both student and instructor perspectives.
arXiv Detail & Related papers (2023-04-29T20:30:32Z)
- ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning [70.57126720079971]
Large language models (LLMs) have emerged as one of the most important breakthroughs in natural language processing (NLP).
This paper evaluates ChatGPT on 7 different tasks, covering 37 diverse languages with high, medium, low, and extremely low resources.
Compared to the performance of previous models, our extensive experimental results demonstrate a worse performance of ChatGPT for different NLP tasks and languages.
arXiv Detail & Related papers (2023-04-12T05:08:52Z)
- To ChatGPT, or not to ChatGPT: That is the question! [78.407861566006]
This study provides a comprehensive and contemporary assessment of the most recent techniques in ChatGPT detection.
We have curated a benchmark dataset consisting of prompts from ChatGPT and humans, including diverse questions from medical, open Q&A, and finance domains.
Our evaluation results demonstrate that none of the existing methods can effectively detect ChatGPT-generated content.
arXiv Detail & Related papers (2023-04-04T03:04:28Z)
- ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models [49.52083248451775]
Large language models (LLMs) have made significant progress in NLP.
We specifically focus on ChatGPT, a widely used and easily accessible LLM.
We conduct a series of experiments on 11 datasets to evaluate ChatGPT's commonsense abilities.
arXiv Detail & Related papers (2023-03-29T03:05:43Z)
- A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity [79.12003701981092]
We carry out an extensive technical evaluation of ChatGPT using 23 data sets covering 8 different common NLP application tasks.
We evaluate the multitask, multilingual and multi-modal aspects of ChatGPT based on these data sets and a newly designed multimodal dataset.
ChatGPT is 63.41% accurate on average in 10 different reasoning categories under logical reasoning, non-textual reasoning, and commonsense reasoning.
arXiv Detail & Related papers (2023-02-08T12:35:34Z)
- Is ChatGPT a General-Purpose Natural Language Processing Task Solver? [113.22611481694825]
Large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot.
Recently, the debut of ChatGPT has drawn a great deal of attention from the natural language processing (NLP) community.
It is not yet known whether ChatGPT can serve as a generalist model that can perform many NLP tasks zero-shot.
arXiv Detail & Related papers (2023-02-08T09:44:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.