A Philosophical Introduction to Language Models -- Part I: Continuity With Classic Debates
- URL: http://arxiv.org/abs/2401.03910v1
- Date: Mon, 8 Jan 2024 14:12:31 GMT
- Title: A Philosophical Introduction to Language Models -- Part I: Continuity With Classic Debates
- Authors: Raphaël Millière, Cameron Buckner
- Abstract summary: This article serves both as a primer on language models for philosophers, and as an opinionated survey of their significance.
We argue that the success of language models challenges several long-held assumptions about artificial neural networks.
This sets the stage for the companion paper (Part II), which turns to novel empirical methods for probing the inner workings of language models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models like GPT-4 have achieved remarkable proficiency in a
broad spectrum of language-based tasks, some of which are traditionally
associated with hallmarks of human intelligence. This has prompted ongoing
disagreements about the extent to which we can meaningfully ascribe any kind of
linguistic or cognitive competence to language models. Such questions have deep
philosophical roots, echoing longstanding debates about the status of
artificial neural networks as cognitive models. This article -- the first part
of two companion papers -- serves both as a primer on language models for
philosophers, and as an opinionated survey of their significance in relation to
classic debates in the philosophy of cognitive science, artificial intelligence,
and linguistics. We cover topics such as compositionality, language
acquisition, semantic competence, grounding, world models, and the transmission
of cultural knowledge. We argue that the success of language models challenges
several long-held assumptions about artificial neural networks. However, we
also highlight the need for further empirical investigation to better
understand their internal mechanisms. This sets the stage for the companion
paper (Part II), which turns to novel empirical methods for probing the inner
workings of language models, and new philosophical questions prompted by their
latest developments.
Related papers
- Epistemology of Language Models: Do Language Models Have Holistic Knowledge?
This paper investigates the inherent knowledge in language models from the perspective of holism.
The purpose of this paper is to explore whether the characteristics of language models are consistent with holism.
arXiv Detail & Related papers (2024-03-19T16:06:10Z)
- Language Evolution with Deep Learning
Computational modeling plays an essential role in the study of language emergence.
It aims to simulate the conditions and learning processes that could trigger the emergence of a structured language.
This chapter explores another class of computational models that have recently revolutionized the field of machine learning: deep learning models.
arXiv Detail & Related papers (2024-03-18T16:52:54Z) - Visually Grounded Language Learning: a review of language games,
datasets, tasks, and models [60.2604624857992]
Many Vision+Language (V+L) tasks have been defined with the aim of creating models that can ground symbols in the visual modality.
In this work, we provide a systematic literature review of several tasks and models proposed in the V+L field.
arXiv Detail & Related papers (2023-12-05T02:17:29Z) - From Word Models to World Models: Translating from Natural Language to
the Probabilistic Language of Thought [124.40905824051079]
We propose rational meaning construction, a computational framework for language-informed thinking.
We frame linguistic meaning as a context-sensitive mapping from natural language into a probabilistic language of thought.
We show that LLMs can generate context-sensitive translations that capture pragmatically-appropriate linguistic meanings.
We extend our framework to integrate cognitively-motivated symbolic modules.
arXiv Detail & Related papers (2023-06-22T05:14:00Z) - Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
arXiv Detail & Related papers (2023-06-04T15:44:51Z) - Dissociating language and thought in large language models [52.39241645471213]
Large Language Models (LLMs) have come closest among all models to date to mastering human language.
We distinguish between formal linguistic competence (knowledge of linguistic rules and patterns) and functional linguistic competence (using language in the world).
We ground this distinction in human neuroscience, which has shown that formal and functional competence rely on different neural mechanisms.
Although LLMs are surprisingly good at formal competence, their performance on functional competence tasks remains spotty.
arXiv Detail & Related papers (2023-01-16T22:41:19Z) - Language Models as Inductive Reasoners [125.99461874008703]
We propose a new paradigm (task) for inductive reasoning, which is to induce natural language rules from natural language facts.
We create a dataset termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts are written in natural language.
We provide the first comprehensive analysis of how well pretrained language models can induce natural language rules from natural language facts.
arXiv Detail & Related papers (2022-12-21T11:12:14Z) - Integrating Linguistic Theory and Neural Language Models [2.870517198186329]
I present several case studies to illustrate how theoretical linguistics and neural language models are still relevant to each other.
This thesis contributes three studies that explore different aspects of the syntax-semantics interface in language models.
arXiv Detail & Related papers (2022-07-20T04:20:46Z) - Schr\"odinger's Tree -- On Syntax and Neural Language Models [10.296219074343785]
Language models have emerged as NLP's workhorse, displaying increasingly fluent generation capabilities.
We observe a lack of clarity across numerous dimensions, which influences the hypotheses that researchers form.
We outline the implications of the different types of research questions exhibited in studies on syntax.
arXiv Detail & Related papers (2021-10-17T18:25:23Z) - The Rediscovery Hypothesis: Language Models Need to Meet Linguistics [8.293055016429863]
We study whether linguistic knowledge is a necessary condition for good performance of modern language models.
We show that language models that are significantly compressed but perform well on their pretraining objectives retain good scores when probed for linguistic structures.
This result supports the rediscovery hypothesis and leads to the second contribution of our paper: an information-theoretic framework that relates the language modeling objective to linguistic information.
arXiv Detail & Related papers (2021-03-02T15:57:39Z) - Language Modelling as a Multi-Task Problem [12.48699285085636]
We investigate whether language models adhere to learning principles of multi-task learning during training.
Experiments demonstrate that a multi-task setting naturally emerges within the objective of the more general task of language modelling.
arXiv Detail & Related papers (2021-01-27T09:47:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.