Theoretical and Methodological Framework for Studying Texts Produced by Large Language Models
- URL: http://arxiv.org/abs/2408.16740v1
- Date: Thu, 29 Aug 2024 17:34:10 GMT
- Title: Theoretical and Methodological Framework for Studying Texts Produced by Large Language Models
- Authors: Jiří Milička
- Abstract summary: This paper addresses the conceptual, methodological and technical challenges in studying large language models (LLMs).
It builds on a theoretical framework that distinguishes between the LLM as a substrate and the entities the model simulates.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the conceptual, methodological and technical challenges in studying large language models (LLMs) and the texts they produce from a quantitative linguistics perspective. It builds on a theoretical framework that distinguishes between the LLM as a substrate and the entities the model simulates. The paper advocates for a strictly non-anthropomorphic approach to models while cautiously applying methodologies used in studying human linguistic behavior to the simulated entities. While natural language processing researchers focus on the models themselves, their architecture, evaluation, and methods for improving performance, we as quantitative linguists should strive to build a robust theory concerning the characteristics of texts produced by LLMs, how they differ from human-produced texts, and the properties of simulated entities. Additionally, we should explore the potential of LLMs as an instrument for studying human culture, of which language is an integral part.
Related papers
- Linguistically Grounded Analysis of Language Models using Shapley Head Values [2.914115079173979]
We investigate the processing of morphosyntactic phenomena by leveraging a recently proposed method for probing language models via Shapley Head Values (SHVs).
Using the English language BLiMP dataset, we test our approach on two widely used models, BERT and RoBERTa, and compare how linguistic constructions are handled.
Our results show that SHV-based attributions reveal distinct patterns across both models, providing insights into how language models organize and process linguistic information.
arXiv Detail & Related papers (2024-10-17T09:48:08Z)
- Language Evolution with Deep Learning [49.879239655532324]
Computational modeling plays an essential role in the study of language emergence.
It aims to simulate the conditions and learning processes that could trigger the emergence of a structured language.
This chapter explores another class of computational models that have recently revolutionized the field of machine learning: deep learning models.
arXiv Detail & Related papers (2024-03-18T16:52:54Z)
- Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs).
We suggest investigating internal activations and quantifying an LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
arXiv Detail & Related papers (2024-02-28T04:56:21Z)
- Large language models as linguistic simulators and cognitive models in human research [0.0]
The rise of large language models (LLMs) that generate human-like text has sparked debates over their potential to replace human participants in behavioral and cognitive research.
We critically evaluate this replacement perspective to appraise the fundamental utility of language models in psychology and social science.
This perspective reframes the role of language models in behavioral and cognitive science, serving as linguistic simulators and cognitive models that shed light on the similarities and differences between machine intelligence and human cognition and thoughts.
arXiv Detail & Related papers (2024-02-06T23:28:23Z)
- Formal Aspects of Language Modeling [74.16212987886013]
Large language models have become one of the most commonly deployed NLP inventions.
These notes are the accompaniment to the theoretical portion of the ETH Zürich course on large language models.
arXiv Detail & Related papers (2023-11-07T20:21:42Z)
- L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models [102.00201523306986]
We present L2CEval, a systematic evaluation of the language-to-code generation capabilities of large language models (LLMs).
We analyze the factors that potentially affect their performance, such as model size, pretraining data, instruction tuning, and different prompting methods.
In addition to assessing model performance, we measure confidence calibration for the models and conduct human evaluations of the output programs.
arXiv Detail & Related papers (2023-09-29T17:57:00Z)
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
- Large Linguistic Models: Analyzing theoretical linguistic abilities of LLMs [7.4815059492034335]
We show that large language models can generate coherent and valid formal analyses of linguistic data.
We focus on three subfields of formal linguistics: syntax, phonology, and semantics.
This line of inquiry exemplifies behavioral interpretability of deep learning, where models' representations are accessed by explicit prompting.
arXiv Detail & Related papers (2023-05-01T17:09:33Z)
- An Overview on Language Models: Recent Developments and Outlook [32.528770408502396]
Conventional language models (CLMs) aim to predict the probability of linguistic sequences in a causal manner.
Pre-trained language models (PLMs) cover broader concepts and can be used in both causal sequential modeling and fine-tuning for downstream applications.
arXiv Detail & Related papers (2023-03-10T07:55:00Z)
- Explaining Large Language Model-Based Neural Semantic Parsers (Student Abstract) [0.0]
Large language models (LLMs) have demonstrated strong capability in structured prediction tasks such as semantic parsing.
Our work studies different methods for explaining the behaviors of an LLM-based semantic parser.
We hope to inspire future research toward better understanding them.
arXiv Detail & Related papers (2023-01-25T16:12:43Z)
- LERT: A Linguistically-motivated Pre-trained Language Model [67.65651497173998]
We propose LERT, a pre-trained language model that is trained on three types of linguistic features along with the original pre-training task.
We carried out extensive experiments on ten Chinese NLU tasks, and the experimental results show that LERT could bring significant improvements.
arXiv Detail & Related papers (2022-11-10T05:09:16Z)