Related papers: Formal Aspects of Language Modeling

URL: http://arxiv.org/abs/2311.04329v2
Date: Wed, 17 Apr 2024 07:31:01 GMT
Title: Formal Aspects of Language Modeling
Authors: Ryan Cotterell, Anej Svete, Clara Meister, Tianyu Liu, Li Du
Abstract summary: Large language models have become one of the most commonly deployed NLP inventions. These notes are the accompaniment to the theoretical portion of the ETH Zürich course on large language models.
Score: 74.16212987886013
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models have become one of the most commonly deployed NLP inventions. In the past half-decade, their integration into core natural language processing tools has dramatically increased the performance of such tools, and they have entered the public discourse surrounding artificial intelligence. Consequently, it is important for both developers and researchers alike to understand the mathematical foundations of large language models, as well as how to implement them. These notes are the accompaniment to the theoretical portion of the ETH Zürich course on large language models, covering what constitutes a language model from a formal, theoretical perspective.

Related papers

Proceedings of the First International Workshop on Next-Generation Language Models for Knowledge Representation and Reasoning (NeLaMKRR 2024) [16.282850445579857]
Reasoning is an essential component of human intelligence as it plays a fundamental role in our ability to think critically. The recent leap forward in natural language processing, with the emergence of language models based on transformers, hints at the possibility that these models exhibit reasoning abilities. Despite ongoing discussions about what reasoning is in language models, it is still not easy to pin down to what extent these models are actually capable of reasoning.
arXiv Detail & Related papers (2024-10-07T02:31:47Z)
The Sociolinguistic Foundations of Language Modeling [34.02231580843069]
We argue that large language models are inherently models of varieties of language. We discuss how this perspective can help address five basic challenges in language modeling.
arXiv Detail & Related papers (2024-07-12T13:12:55Z)
GlórIA - A Generative and Open Large Language Model for Portuguese [4.782288068552145]
We introduce GlórIA, a robust European Portuguese decoder LLM. To pre-train GlórIA, we assembled a comprehensive PT-PT text corpus comprising 35 billion tokens from various sources. Evaluation shows that GlórIA significantly outperforms existing open PT decoder models in language modeling.
arXiv Detail & Related papers (2024-02-20T12:36:40Z)
Large Linguistic Models: Analyzing theoretical linguistic abilities of LLMs [7.4815059492034335]
We show that large language models can generate coherent and valid formal analyses of linguistic data. We focus on three subfields of formal linguistics: syntax, phonology, and semantics. This line of inquiry exemplifies behavioral interpretability of deep learning, where models' representations are accessed by explicit prompting.
arXiv Detail & Related papers (2023-05-01T17:09:33Z)
A Survey of Large Language Models [81.06947636926638]
Language modeling has been widely studied for language understanding and generation in the past two decades. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora. To discriminate the difference in parameter scale, the research community has coined the term large language models (LLM) for the PLMs of significant size.
arXiv Detail & Related papers (2023-03-31T17:28:46Z)
On Robustness of Prompt-based Semantic Parsing with Large Pre-trained Language Model: An Empirical Study on Codex [48.588772371355816]
This paper presents the first empirical study on the adversarial robustness of a large prompt-based language model of code, Codex. Our results demonstrate that state-of-the-art (SOTA) code-language models are vulnerable to carefully crafted adversarial examples.
arXiv Detail & Related papers (2023-01-30T13:21:00Z)
Language Models are General-Purpose Interfaces [109.45478241369655]
We propose to use language models as a general-purpose interface to various foundation models. A collection of pretrained encoders perceive diverse modalities (such as vision and language). We propose a semi-causal language modeling objective to jointly pretrain the interface and the modular encoders.
arXiv Detail & Related papers (2022-06-13T17:34:22Z)
Language Models are not Models of Language [0.0]
Transfer learning has enabled large deep learning neural networks trained on the language modeling task to vastly improve performance. We argue that the term language model is misleading because deep learning models are not theoretical models of language.
arXiv Detail & Related papers (2021-12-13T22:39:46Z)
Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages. We infer this distribution from a sample of typologically diverse training languages. We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)
Constrained Language Models Yield Few-Shot Semantic Parsers [73.50960967598654]
We explore the use of large pretrained language models as few-shot semantic parsers. The goal in semantic parsing is to generate a structured meaning representation given a natural language input. We use language models to paraphrase inputs into a controlled sublanguage resembling English that can be automatically mapped to a target meaning representation.
arXiv Detail & Related papers (2021-04-18T08:13:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.