Performance Comparison of Large Language Models on VNHSGE English
Dataset: OpenAI ChatGPT, Microsoft Bing Chat, and Google Bard
- URL: http://arxiv.org/abs/2307.02288v3
- Date: Thu, 20 Jul 2023 01:13:27 GMT
- Authors: Xuan-Quy Dao
- Abstract summary: Three large language models (LLMs) were compared on the VNHSGE English dataset.
The results show that BingChat is better than ChatGPT and Bard.
BingChat, Bard and ChatGPT outperform Vietnamese students in English language proficiency.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper presents a performance comparison of three large language models
(LLMs), namely OpenAI ChatGPT, Microsoft Bing Chat (BingChat), and Google Bard,
on the VNHSGE English dataset. The performance of BingChat, Bard, and ChatGPT
(GPT-3.5) is 92.4%, 86%, and 79.2%, respectively. The results show that
BingChat is better than ChatGPT and Bard. Therefore, BingChat and Bard can
replace ChatGPT while ChatGPT is not yet officially available in Vietnam. The
results also indicate that BingChat, Bard and ChatGPT outperform Vietnamese
students in English language proficiency. The findings of this study contribute
to the understanding of the potential of LLMs in English language education.
The remarkable performance of ChatGPT, BingChat, and Bard demonstrates their
potential as effective tools for teaching and learning English at the high
school level.
Related papers
- "ChatGPT, a Friend or Foe for Education?" Analyzing the User's
Perspectives on the Latest AI Chatbot Via Reddit [0.0]
This study has analyzed 247 Reddit top posts related to the educational use of ChatGPT.
Results show that the majority of the users took a neutral viewpoint.
There was more positive perception than negative regarding the usefulness of ChatGPT in education.
arXiv Detail & Related papers (2023-09-27T23:59:44Z)
- ChatGPT is Good but Bing Chat is Better for Vietnamese Students [0.0]
This study examines the efficacy of two SOTA large language models (LLMs), namely ChatGPT and Microsoft Bing Chat (BingChat), in catering to the needs of Vietnamese students.
We conduct a comparative analysis of their academic achievements in various disciplines, encompassing mathematics, literature, English language, physics, chemistry, biology, history, geography, and civic education.
The results of our study suggest that BingChat demonstrates superior performance compared to ChatGPT across a wide range of subjects, with the exception of literature, where ChatGPT exhibits better performance.
arXiv Detail & Related papers (2023-07-17T06:36:53Z)
- Phoenix: Democratizing ChatGPT across Languages [68.75163236421352]
We release a large language model "Phoenix", achieving competitive performance among open-source English and Chinese models.
We believe this work will help make ChatGPT more accessible, especially in countries where people cannot use ChatGPT due to restrictions from OpenAI or local governments.
arXiv Detail & Related papers (2023-04-20T16:50:04Z)
- ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large
Language Models in Multilingual Learning [70.57126720079971]
Large language models (LLMs) have emerged as one of the most important breakthroughs in natural language processing (NLP).
This paper evaluates ChatGPT on 7 different tasks, covering 37 diverse languages with high, medium, low, and extremely low resources.
Compared to the performance of previous models, our extensive experimental results demonstrate a worse performance of ChatGPT for different NLP tasks and languages.
arXiv Detail & Related papers (2023-04-12T05:08:52Z)
- Towards Making the Most of ChatGPT for Machine Translation [75.576405098545]
ChatGPT shows remarkable capabilities for machine translation (MT).
Several prior studies have shown that it achieves comparable results to commercial systems for high-resource languages.
arXiv Detail & Related papers (2023-03-24T03:35:21Z)
- Seeing ChatGPT Through Students' Eyes: An Analysis of TikTok Data [3.441021278275805]
We analyzed the content of the 100 most popular videos in English tagged with #chatgpt, which collectively garnered over 250 million views.
Most of the videos promoted the use of ChatGPT for tasks like writing essays or code.
What is, however, missing from the analyzed clips on TikTok are videos that discuss ChatGPT producing content that is nonsensical or unfaithful to the training data.
arXiv Detail & Related papers (2023-03-09T15:46:54Z)
- Can ChatGPT Understand Too? A Comparative Study on ChatGPT and
Fine-tuned BERT [103.57103957631067]
ChatGPT has attracted great attention, as it can generate fluent and high-quality responses to human inquiries.
We assess ChatGPT's understanding ability on the popular GLUE benchmark and compare it with 4 representative fine-tuned BERT-style models.
We find that: 1) ChatGPT falls short in handling paraphrase and similarity tasks; 2) ChatGPT outperforms all BERT models on inference tasks by a large margin; 3) ChatGPT achieves performance comparable to BERT on sentiment analysis and question answering tasks.
arXiv Detail & Related papers (2023-02-19T12:29:33Z)
- Is ChatGPT a General-Purpose Natural Language Processing Task Solver? [113.22611481694825]
Large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot.
Recently, the debut of ChatGPT has drawn a great deal of attention from the natural language processing (NLP) community.
It is not yet known whether ChatGPT can serve as a generalist model that can perform many NLP tasks zero-shot.
arXiv Detail & Related papers (2023-02-08T09:44:51Z)
- Is ChatGPT A Good Translator? Yes With GPT-4 As The Engine [97.8609714773255]
We evaluate ChatGPT for machine translation, including translation prompt, multilingual translation, and translation robustness.
ChatGPT performs competitively with commercial translation products but lags behind significantly on low-resource or distant languages.
With the launch of the GPT-4 engine, the translation performance of ChatGPT is significantly boosted.
arXiv Detail & Related papers (2023-01-20T08:51:36Z)