Congruence-based Learning of Probabilistic Deterministic Finite Automata
- URL: http://arxiv.org/abs/2412.09760v1
- Date: Thu, 12 Dec 2024 23:38:58 GMT
- Title: Congruence-based Learning of Probabilistic Deterministic Finite Automata
- Authors: Matías Carrasco, Franz Mayr, Sergio Yovine
- Abstract summary: We introduce a congruence that extends the classical Myhill-Nerode congruence for formal languages.
This new congruence is the basis for defining regularity over language models.
We present an active learning algorithm that computes the quotient with respect to this congruence whenever the language model is regular.
- Score: 0.0
- License:
- Abstract: This work studies the question of learning probabilistic deterministic automata from language models. For this purpose, it focuses on analyzing the relations defined on algebraic structures over strings by equivalences and similarities on probability distributions. We introduce a congruence that extends the classical Myhill-Nerode congruence for formal languages. This new congruence is the basis for defining regularity over language models. We present an active learning algorithm that computes the quotient with respect to this congruence whenever the language model is regular. The paper also defines the notion of recognizability for language models and shows that it coincides with regularity for congruences. For relations which are not congruences, it shows that this is not the case. Finally, it discusses the impact of this result on learning in the context of language models.
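The abstract does not state the congruence formally, but as a rough, non-authoritative illustration it can be read as relating prefixes whose conditional next-symbol distributions agree after every suffix, mirroring the classical Myhill-Nerode relation. The Python sketch below approximates such a test over a bounded set of suffixes, the way an observation-table-style active learner might; the names (`NextSymbolDist`, `congruent_on_tests`, the toy model) and the tolerance-based stand-in for "similarity" are assumptions for illustration, not the paper's definitions.

```python
from itertools import product
from typing import Callable, Dict, Sequence

# Hypothetical interface: given a prefix, return P(symbol | prefix) over a
# finite alphabet plus an end-of-string marker "$". Illustrative only.
NextSymbolDist = Callable[[str], Dict[str, float]]

def distributions_close(p: Dict[str, float], q: Dict[str, float], tol: float = 1e-9) -> bool:
    """Near-equality of two next-symbol distributions; a tolerance stands in
    for the equivalences/similarities on distributions studied in the paper."""
    symbols = set(p) | set(q)
    return all(abs(p.get(s, 0.0) - q.get(s, 0.0)) <= tol for s in symbols)

def congruent_on_tests(model: NextSymbolDist, u: str, v: str,
                       alphabet: Sequence[str], max_len: int = 2) -> bool:
    """Approximate Myhill-Nerode-style test: u ~ v iff the conditional
    next-symbol distributions after u.w and v.w agree for every suffix w.
    Here only suffixes up to max_len are checked."""
    for n in range(max_len + 1):
        for w in map("".join, product(alphabet, repeat=n)):
            if not distributions_close(model(u + w), model(v + w)):
                return False
    return True

# Toy usage: a PDFA-like model over {a, b} whose behaviour depends only on
# the parity of the number of a's seen so far.
def toy_model(prefix: str) -> Dict[str, float]:
    even = prefix.count("a") % 2 == 0
    return {"a": 0.5, "b": 0.3, "$": 0.2} if even else {"a": 0.2, "b": 0.3, "$": 0.5}

print(congruent_on_tests(toy_model, "", "aa", alphabet="ab"))  # True: same class
print(congruent_on_tests(toy_model, "", "a", alphabet="ab"))   # False: distinguishable
```

In an active learner of the kind the abstract describes, prefixes found congruent under such a test would be merged into a single state of the hypothesis PDFA, which is roughly what computing the quotient with respect to the congruence amounts to.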
Related papers
- Are LLMs Models of Distributional Semantics? A Case Study on Quantifiers [14.797001158310092]
We argue that distributional semantics models struggle with truth-conditional reasoning and symbolic processing.
Contrary to expectations, we find that LLMs align more closely with human judgements on exact quantifiers versus vague ones.
arXiv Detail & Related papers (2024-10-17T19:28:35Z) - Are language models rational? The case of coherence norms and belief revision [63.78798769882708]
We consider logical coherence norms as well as coherence norms tied to the strength of belief in language models.
We argue that rational norms tied to coherence do apply to some language models, but not to others.
arXiv Detail & Related papers (2024-06-05T16:36:21Z) - From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought [124.40905824051079]
We propose rational meaning construction, a computational framework for language-informed thinking.
We frame linguistic meaning as a context-sensitive mapping from natural language into a probabilistic language of thought.
We show that LLMs can generate context-sensitive translations that capture pragmatically-appropriate linguistic meanings.
We extend our framework to integrate cognitively-motivated symbolic modules.
arXiv Detail & Related papers (2023-06-22T05:14:00Z) - Lexinvariant Language Models [84.2829117441298]
Token embeddings, a mapping from discrete lexical symbols to continuous vectors, are at the heart of any language model (LM).
We study lexinvariant language models that are invariant to lexical symbols and therefore do not need fixed token embeddings in practice.
We show that a lexinvariant LM can attain perplexity comparable to that of a standard language model, given a sufficiently long context.
arXiv Detail & Related papers (2023-05-24T19:10:46Z) - A Measure-Theoretic Characterization of Tight Language Models [105.16477132329416]
In some pathological cases, probability mass can "leak" onto the set of infinite sequences.
This paper offers a measure-theoretic treatment of language modeling.
We prove that many popular language model families are in fact tight, meaning that they will not leak in this sense.
arXiv Detail & Related papers (2022-12-20T18:17:11Z) - Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not well-represent natural language semantics.
arXiv Detail & Related papers (2022-10-14T02:35:19Z) - Montague semantics and modifier consistency measurement in neural language models [13.187353456418204]
This work proposes a novel methodology for measuring compositional behavior in contemporary language embedding models.
Specifically, we focus on adjectival modifier phenomena in adjective-noun phrases.
arXiv Detail & Related papers (2022-10-10T18:43:16Z) - Universality and diversity in word patterns [0.0]
We present an analysis of lexical statistical connections for eleven major languages.
We find that the diverse manners that languages utilize to express word relations give rise to unique pattern distributions.
arXiv Detail & Related papers (2022-08-23T20:03:27Z) - Unnatural Language Inference [48.45003475966808]
We find that state-of-the-art NLI models, such as RoBERTa and BART, are invariant to, and sometimes even perform better on, examples with randomly reordered words.
Our findings call into question the idea that our natural language understanding models, and the tasks used for measuring their progress, genuinely require a human-like understanding of syntax.
arXiv Detail & Related papers (2020-12-30T20:40:48Z) - Modelling Compositionality and Structure Dependence in Natural Language [0.12183405753834563]
Drawing on linguistics and set theory, the first half of this thesis presents a formalisation of these ideas.
We see how cognitive systems that process language need to have certain functional constraints.
Using advances in word embedding techniques, a model of relational learning is simulated.
arXiv Detail & Related papers (2020-11-22T17:28:50Z) - Model-theoretic Characterizations of Existential Rule Languages [9.845164265154832]
Existential rules, a.k.a. dependencies in databases, are a family of important logical languages widely used in computer science and artificial intelligence.
We establish model-theoretic characterizations for a number of existential rule languages such as (disjunctive) embedded dependencies, tuple-generating dependencies (TGDs), (frontier-)guarded TGDs and linear TGDs.
As a natural application of these characterizations, complexity bounds for the rewritability of the above languages are also identified.
arXiv Detail & Related papers (2020-01-23T17:29:18Z)