Whose LLM is it Anyway? Linguistic Comparison and LLM Attribution for
GPT-3.5, GPT-4 and Bard
- URL: http://arxiv.org/abs/2402.14533v1
- Date: Thu, 22 Feb 2024 13:25:17 GMT
- Title: Whose LLM is it Anyway? Linguistic Comparison and LLM Attribution for
GPT-3.5, GPT-4 and Bard
- Authors: Ariel Rosenfeld, Teddy Lazebnik
- Abstract summary: Large Language Models (LLMs) are capable of generating text that is similar to or surpasses human quality.
We compare the vocabulary, Part-Of-Speech (POS) distribution, dependency distribution, and sentiment of texts generated by three of the most popular LLMs in response to diverse inputs.
The results point to significant linguistic variations which, in turn, enable us to attribute a given text to its LLM origin with a favorable 88% accuracy.
- Score: 3.419330841031544
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large Language Models (LLMs) are capable of generating text that is similar
to or surpasses human quality. However, it is unclear whether LLMs tend to
exhibit distinctive linguistic styles akin to how human authors do. Through a
comprehensive linguistic analysis, we compare the vocabulary, Part-Of-Speech
(POS) distribution, dependency distribution, and sentiment of texts generated
by three of the most popular LLMs today (GPT-3.5, GPT-4, and Bard) in response
to diverse inputs. The results point to significant linguistic variations which,
in turn, enable us to attribute a given text to its LLM origin with a favorable
88% accuracy using a simple off-the-shelf classification model. Theoretical and
practical implications of this intriguing finding are discussed.
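As a concrete illustration of the pipeline the abstract describes, the sketch below extracts POS and dependency distributions per text and trains a simple classifier for attribution. The tooling is an assumption: the paper does not name its tagger or classifier, so spaCy and scikit-learn's LogisticRegression stand in here, and the vocabulary and sentiment features are omitted for brevity.

```python
# Hypothetical sketch of the attribution pipeline; spaCy and scikit-learn are
# stand-ins, not the paper's confirmed tooling.
from collections import Counter

import spacy
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

nlp = spacy.load("en_core_web_sm")

def linguistic_features(text: str) -> dict:
    """Relative POS-tag and dependency-label frequencies for one text."""
    doc = nlp(text)
    counts = Counter(f"pos={t.pos_}" for t in doc)
    counts.update(f"dep={t.dep_}" for t in doc)
    total = sum(counts.values()) or 1
    return {k: v / total for k, v in counts.items()}

def attribute(texts: list[str], labels: list[str]):
    """Train an attributor; labels are e.g. 'gpt-3.5', 'gpt-4', 'bard'."""
    X = [linguistic_features(t) for t in texts]
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)
    clf = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
    clf.fit(X_tr, y_tr)
    return clf, clf.score(X_te, y_te)  # held-out attribution accuracy
```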
Related papers
- Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models [15.857451401890092]
We quantify and gain insight into the linguistic knowledge captured by Large Language Models (LLMs).
Our large-scale experiments, spanning 100+ LLMs and 150k minimal pairs in three languages, reveal properties of linguistic similarity from four key aspects.
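A minimal sketch of the minimal-pair scoring this entry describes, assuming Hugging Face transformers with GPT-2 as a stand-in model (the paper's models and scoring protocol may differ):

```python
# Illustrative minimal-pair scoring with a causal LM; GPT-2 is a stand-in.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def sentence_logprob(sentence: str) -> float:
    """Sum of the token log-probabilities the model assigns to a sentence."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = ids[:, 1:]
    return logprobs.gather(-1, targets.unsqueeze(-1)).sum().item()

# A model that encodes subject-verb agreement prefers the grammatical variant.
good = sentence_logprob("The keys to the cabinet are on the table.")
bad = sentence_logprob("The keys to the cabinet is on the table.")
print(good > bad)
```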
arXiv Detail & Related papers (2024-09-19T03:29:40Z)
- PhonologyBench: Evaluating Phonological Skills of Large Language Models [57.80997670335227]
Phonology, the study of speech's structure and pronunciation rules, is a critical yet often overlooked component in Large Language Model (LLM) research.
We present PhonologyBench, a novel benchmark consisting of three diagnostic tasks designed to explicitly test the phonological skills of LLMs.
We observe significant gaps of 17% and 45% on Rhyme Word Generation and Syllable Counting, respectively, when compared to humans.
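A toy exact-match scorer for a syllable-counting task might look like the following; the words, gold counts, and predictions are invented for illustration, and PhonologyBench's real data and metrics live in its own release.

```python
# Toy scorer in the spirit of the syllable-counting task; values are invented.
def exact_match_accuracy(preds: dict[str, int], golds: dict[str, int]) -> float:
    hits = sum(preds.get(word) == count for word, count in golds.items())
    return hits / len(golds)

golds = {"banana": 3, "rhyme": 1, "syllable": 3}  # dictionary-style gold counts
preds = {"banana": 3, "rhyme": 2, "syllable": 3}  # parsed from an LLM's answers
print(f"{exact_match_accuracy(preds, golds):.0%}")  # 67%
```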
arXiv Detail & Related papers (2024-04-03T04:53:14Z)
- OMGEval: An Open Multilingual Generative Evaluation Benchmark for Large Language Models [59.54423478596468]
We introduce OMGEval, the first Open-source Multilingual Generative test set that can assess the capability of LLMs in different languages.
For each language, OMGEval provides 804 open-ended questions, covering a wide range of important capabilities of LLMs.
Specifically, the current version of OMGEval includes 5 languages (i.e., Zh, Ru, Fr, Es, Ar).
arXiv Detail & Related papers (2024-02-21T04:42:41Z)
- Beware of Words: Evaluating the Lexical Diversity of Conversational LLMs using ChatGPT as Case Study [3.0059120458540383]
We evaluate the lexical richness of text generated by conversational Large Language Models (LLMs) and how it depends on the model parameters.
The results show how lexical richness depends on the version of ChatGPT and some of its parameters, such as the presence penalty, or on the role assigned to the model.
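Lexical-richness indices of the kind such studies use can be sketched in a few lines; whitespace tokenization is a simplifying assumption here, and the paper's own metrics and tokenization may differ.

```python
# Two quick lexical-richness probes, assuming whitespace tokenization.
import math

def type_token_ratio(text: str) -> float:
    """Distinct words over total words; sensitive to text length."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def herdan_c(text: str) -> float:
    """Herdan's C = log(types) / log(tokens); less length-sensitive than TTR."""
    tokens = text.lower().split()
    if len(tokens) < 2:
        return 0.0
    return math.log(len(set(tokens))) / math.log(len(tokens))

sample = "the cat sat on the mat and the dog sat too"
print(type_token_ratio(sample), herdan_c(sample))  # ~0.73, ~0.87
```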
arXiv Detail & Related papers (2024-02-11T13:41:17Z)
- Native Language Identification with Large Language Models [60.80452362519818]
We show that GPT models are proficient at NLI classification, with GPT-4 setting a new performance record of 91.7% on the benchmark TOEFL11 test set in a zero-shot setting.
We also show that unlike previous fully-supervised settings, LLMs can perform NLI without being limited to a set of known classes.
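A zero-shot NLI query could be shaped roughly as below; the prompt wording is hypothetical and not taken from the paper, while the dict mirrors a chat-completion style request.

```python
# Hypothetical zero-shot native-language identification query.
PROMPT = (
    "The following English text was written by a non-native speaker. "
    "What is the writer's most likely native language?\n\n{essay}"
)

def nli_request(essay: str) -> dict:
    return {
        "model": "gpt-4",  # any chat-capable model
        "messages": [{"role": "user", "content": PROMPT.format(essay=essay)}],
    }

print(nli_request("I am agree with this opinion since many years."))
```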
arXiv Detail & Related papers (2023-12-13T00:52:15Z)
- Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts [75.33019401706188]
Large language models (LLMs) are known to effectively perform tasks by simply observing a few exemplars.
We propose to assemble synthetic exemplars from a diverse set of high-resource languages to prompt the LLMs to translate from any language into English.
Our unsupervised prompting method performs on par with supervised few-shot learning in LLMs of different sizes for translations between English and 13 Indic and 21 African low-resource languages.
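A hedged sketch of the prompting idea follows, with placeholder exemplar pairs; the paper's actual exemplar assembly is more involved than this.

```python
# Sketch of assembling high-resource exemplars to prompt translation into
# English; the pairs below are placeholders invented for illustration.
EXEMPLARS = [
    ("French", "Le chat dort.", "The cat is sleeping."),
    ("Spanish", "El libro es nuevo.", "The book is new."),
    ("German", "Das Wetter ist schön.", "The weather is nice."),
]

def build_prompt(source_text: str) -> str:
    lines = [f"{lang}: {src}\nEnglish: {tgt}" for lang, src, tgt in EXEMPLARS]
    lines.append(f"Source: {source_text}\nEnglish:")  # low-resource input
    return "\n\n".join(lines)

print(build_prompt("Umuntu ngumuntu ngabantu."))
```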
arXiv Detail & Related papers (2023-06-20T08:27:47Z)
- Can LLMs Capture Human Preferences? [5.683832910692926]
We explore the viability of Large Language Models (LLMs) in emulating human survey respondents and eliciting preferences.
We collect responses from LLMs across various languages and compare them to human responses, exploring preferences between smaller, sooner rewards and larger, later ones.
Our findings reveal that both GPT models demonstrate less patience than humans, with GPT-3.5 exhibiting a lexicographic preference for earlier rewards, unlike human decision-makers.
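An illustrative elicitation loop for such smaller-sooner versus larger-later choices, with invented amounts and horizons (the paper's exact wording is not reproduced here):

```python
# Illustrative intertemporal-choice prompts; amounts and delays are made up.
def choice_prompt(small: int, large: int, delay_months: int) -> str:
    return (
        f"Would you prefer (A) ${small} today or "
        f"(B) ${large} in {delay_months} months? Answer A or B."
    )

# Sweeping the later reward traces out an indifference point, from which an
# implied discount rate can be compared across models and human respondents.
for large in (105, 110, 120, 150):
    print(choice_prompt(100, large, 12))
```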
arXiv Detail & Related papers (2023-05-04T03:51:31Z)
- Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis [103.89753784762445]
Large language models (LLMs) have demonstrated remarkable potential in handling multilingual machine translation (MMT).
This paper systematically investigates the advantages and challenges of LLMs for MMT.
We thoroughly evaluate eight popular LLMs, including ChatGPT and GPT-4.
arXiv Detail & Related papers (2023-04-10T15:51:30Z)
- Document-Level Machine Translation with Large Language Models [91.03359121149595]
Large language models (LLMs) can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks.
This paper provides an in-depth evaluation of the discourse modeling ability of LLMs.
arXiv Detail & Related papers (2023-04-05T03:49:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.