Towards a Unified System of Representation for Continuity and Discontinuity in Natural Language
- URL: http://arxiv.org/abs/2506.05235v1
- Date: Thu, 05 Jun 2025 16:54:41 GMT
- Title: Towards a Unified System of Representation for Continuity and Discontinuity in Natural Language
- Authors: Ratna Kandala, Prakash Mondal
- Abstract summary: We propose a unified system of representation for both continuity and discontinuity in structures of natural languages. We take into account three formalisms, in particular, Phrase Structure Grammar (PSG) for its notion of constituency, Dependency Grammar (DG) for its head-dependent relations, and Categorial Grammar (CG) for its focus on functor-argument relations.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Syntactic discontinuity is a grammatical phenomenon in which a constituent is split into more than one part by the insertion of an element that is not part of the constituent. It is observed in many languages across the world, such as Turkish, Russian, Japanese, Warlpiri, Navajo, Hopi, Dyirbal, and Yidiny. Different formalisms/frameworks in current linguistic theory approach the problem of discontinuous structures in different ways, and each has widely been viewed as an independent, non-converging system of analysis. In this paper, we propose a unified system of representation for both continuity and discontinuity in the structures of natural languages by taking into account three formalisms: Phrase Structure Grammar (PSG) for its widely used notion of constituency, Dependency Grammar (DG) for its head-dependent relations, and Categorial Grammar (CG) for its focus on functor-argument relations. We attempt to show that discontinuous expressions, as well as continuous structures, can be analysed through a unified mathematical derivation incorporating the representations of linguistic structure in these three grammar formalisms.
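To make the idea concrete, here is a minimal, purely illustrative Python sketch, not the paper's formalism: each word carries the three kinds of information the abstract draws on (a PSG constituent label, a DG head link, and a CG category), and a constituent counts as discontinuous when its yield of string positions has a gap. The extraposition example, field names, and category assignments are our own assumptions.

```python
# Illustrative sketch only: one record per word combines PSG constituency,
# DG head-dependent links, and CG functor-argument types.
from dataclasses import dataclass

@dataclass
class Word:
    index: int        # linear position in the sentence
    form: str
    head: int         # DG: index of the governing word (-1 for the root)
    constituent: str  # PSG: label of the smallest phrase containing the word
    category: str     # CG: syntactic type, e.g. NP, NP\S

# Extraposition example: "A review appeared of the new book."
# The NP "a review ... of the new book" is split by the verb "appeared".
sentence = [
    Word(0, "a",        1,  "NP", "NP/N"),
    Word(1, "review",   2,  "NP", "N"),
    Word(2, "appeared", -1, "S",  "NP\\S"),
    Word(3, "of",       1,  "NP", "(N\\N)/NP"),
    Word(4, "the",      6,  "NP", "NP/N"),
    Word(5, "new",      6,  "NP", "N/N"),
    Word(6, "book",     3,  "NP", "N"),
]

def yield_of(label, words):
    """String positions of the words inside a given constituent label."""
    return [w.index for w in words if w.constituent == label]

def is_discontinuous(positions):
    """A constituent is discontinuous if its yield has a gap."""
    return sorted(positions) != list(range(min(positions), max(positions) + 1))

np_yield = yield_of("NP", sentence)
print(np_yield, "discontinuous:", is_discontinuous(np_yield))
# -> [0, 1, 3, 4, 5, 6] discontinuous: True ("appeared" at 2 interrupts the NP)
```

On this toy encoding, continuity reduces to a constituent's yield forming an unbroken interval of positions; the intervening verb is exactly what breaks the interval in the extraposed NP.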
Related papers
- Counting trees: A treebank-driven exploration of syntactic variation in speech and writing across languages [0.0]
We define syntactic structures as delexicalized dependency (sub)trees and extract them from spoken and written Universal Dependencies treebanks. For each corpus, we analyze the size, diversity, and distribution of syntactic inventories, their overlap across modalities, and the structures most characteristic of speech. Results show that, across both languages, spoken corpora contain fewer and less diverse syntactic structures than their written counterparts.
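A minimal sketch of what delexicalization and inventory counting could look like, under our own assumptions (a toy (deprel, head) token format rather than the actual UD treebank pipeline):

```python
# Illustrative sketch only: delexicalize a toy dependency tree by keeping
# dependency relations and dropping word forms, then count subtree shapes.
from collections import Counter

# One (deprel, head_index) pair per token; index 0 is an artificial root.
# "the dog barked" -> det(dog, the), nsubj(barked, dog), root(barked)
tree = [("det", 2), ("nsubj", 3), ("root", 0)]

def delexicalized_subtrees(tree):
    """Yield one-level (head -> children) subtrees as relation-label tuples."""
    children = {}
    for rel, head in tree:
        children.setdefault(head, []).append(rel)
    for head, rels in children.items():
        if head != 0:                     # skip the artificial root node
            head_rel = tree[head - 1][0]  # relation the head itself bears
            yield (head_rel, tuple(sorted(rels)))

inventory = Counter(delexicalized_subtrees(tree))
print(inventory)  # Counter({('nsubj', ('det',)): 1, ('root', ('nsubj',)): 1})
```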
arXiv Detail & Related papers (2025-05-28T18:43:26Z)
- Composing or Not Composing? Towards Distributional Construction Grammars [47.636049672406145]
Building the meaning of a linguistic utterance is an incremental, step-by-step, compositional process. It is therefore necessary to propose a framework bringing together both approaches. We present an approach based on Construction Grammars, completing this framework in order to account for these different mechanisms.
arXiv Detail & Related papers (2024-12-10T11:17:02Z)
- A Complexity-Based Theory of Compositionality [53.025566128892066]
In AI, compositional representations can enable a powerful form of out-of-distribution generalization. Here, we propose a definition, which we call representational compositionality, that accounts for and extends our intuitions about compositionality. We show how it unifies disparate intuitions from across the literature in both AI and cognitive science.
arXiv Detail & Related papers (2024-10-18T18:37:27Z)
- The Problem of Alignment [1.2277343096128712]
Large Language Models produce sequences learned as statistical patterns from large corpora.
After initial training, models must be aligned with human values, preferring certain continuations over others.
We examine this practice of structuration as a two-way interaction between users and models.
arXiv Detail & Related papers (2023-12-30T11:44:59Z)
- "You Are An Expert Linguistic Annotator": Limits of LLMs as Analyzers of Abstract Meaning Representation [60.863629647985526]
We examine the successes and limitations of the GPT-3, ChatGPT, and GPT-4 models in analysis of sentence meaning structure.
We find that models can reliably reproduce the basic format of AMR, and can often capture core event, argument, and modifier structure.
Overall, our findings indicate that these models can capture aspects of semantic structure out of the box, but key limitations remain in their ability to support fully accurate semantic analyses or parses.
arXiv Detail & Related papers (2023-10-26T21:47:59Z)
- Geometry of Language [0.0]
We present a fresh perspective on language, combining ideas from various sources, but mixed in a new synthesis.
The question is whether we can formulate an elegant formalism, a universal grammar or a mechanism which explains significant aspects of the human faculty of language.
We describe such a mechanism, which differs from existing logical and grammatical approaches by its geometric nature.
arXiv Detail & Related papers (2023-03-09T12:22:28Z)
- Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z)
- Oracle Linguistic Graphs Complement a Pretrained Transformer Language Model: A Cross-formalism Comparison [13.31232311913236]
We examine the extent to which, in principle, linguistic graph representations can complement and improve neural language modeling.
We find that, overall, semantic constituency structures are most useful to language modeling performance.
arXiv Detail & Related papers (2021-12-15T04:29:02Z)
- Plurality and Quantification in Graph Representation of Meaning [4.82512586077023]
Our graph language covers the essentials of natural language semantics using only monadic second-order variables.
We present a unification-based mechanism for constructing semantic graphs at a simple syntax-semantics interface.
The present graph formalism is applied to linguistic issues in distributive predication, cross-categorial conjunction, and scope permutation of quantificational expressions.
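A toy illustration of what a unification-based construction of semantic graphs can look like, under heavy assumptions: the edge-set encoding, variable names, and the "every dog barks" example below are ours, not the paper's graph language.

```python
# Illustrative sketch only: semantic graphs as sets of labeled edges over
# variables, combined by identifying the variables two fragments share at
# the syntax-semantics interface.
def unify(graph_a, graph_b, bindings):
    """Merge two edge sets after renaming variables via `bindings`."""
    rename = lambda v: bindings.get(v, v)
    return {(rename(s), label, rename(t)) for (s, label, t) in graph_a | graph_b}

# "every dog barks": determiner, noun, and verb fragments.
every = {("q", "quant", "every"), ("q", "restr", "x")}
dog   = {("x", "instance", "dog")}
barks = {("e", "instance", "bark"), ("e", "agent", "y")}

# The interface identifies the noun's variable with the verb's agent slot.
sentence = unify(unify(every, dog, {}), barks, {"y": "x"})
print(sorted(sentence))
# [('e', 'agent', 'x'), ('e', 'instance', 'bark'),
#  ('q', 'quant', 'every'), ('q', 'restr', 'x'), ('x', 'instance', 'dog')]
```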
arXiv Detail & Related papers (2021-12-13T07:04:41Z)
- Decomposing lexical and compositional syntax and semantics with deep language models [82.81964713263483]
The activations of language transformers like GPT2 have been shown to linearly map onto brain activity during speech comprehension.
Here, we propose a taxonomy to factorize the high-dimensional activations of language models into four classes: lexical, compositional, syntactic, and semantic representations.
The results highlight two findings. First, compositional representations recruit a more widespread cortical network than lexical ones, and encompass the bilateral temporal, parietal and prefrontal cortices.
arXiv Detail & Related papers (2021-03-02T10:24:05Z)
- Constructing a Family Tree of Ten Indo-European Languages with Delexicalized Cross-linguistic Transfer Patterns [57.86480614673034]
We formalize the delexicalized transfer as interpretable tree-to-string and tree-to-tree patterns.
This allows us to quantitatively probe cross-linguistic transfer and extend inquiries of Second Language Acquisition.
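As a hedged sketch of what an interpretable tree-to-string pattern might look like (the slot format and the SVO-to-SOV example are our simplification, not the paper's formalization):

```python
# Illustrative sketch only: a "tree-to-string" transfer pattern maps a
# delexicalized source-side dependency configuration to a target-side
# linear order.
from dataclasses import dataclass

@dataclass(frozen=True)
class TreeToStringPattern:
    source: tuple        # delexicalized slots, e.g. ("VERB", "nsubj", "obj")
    target_order: tuple  # how those slots are linearized on the target side

    def apply(self, fillers: tuple) -> list:
        """Reorder concrete fillers according to the target-side order."""
        slot_to_word = dict(zip(self.source, fillers))
        return [slot_to_word[slot] for slot in self.target_order]

# An SVO source configuration transferred to an SOV target language.
svo_to_sov = TreeToStringPattern(("VERB", "nsubj", "obj"),
                                 ("nsubj", "obj", "VERB"))
print(svo_to_sov.apply(("ate", "she", "apples")))  # ['she', 'apples', 'ate']
```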
arXiv Detail & Related papers (2020-07-17T15:56:54Z)
- Where New Words Are Born: Distributional Semantic Analysis of Neologisms and Their Semantic Neighborhoods [51.34667808471513]
We investigate the importance of two factors, semantic sparsity and frequency growth rates of semantic neighbors, formalized in the distributional semantics paradigm.
We show that both factors are predictive of word emergence, although we find more support for the latter hypothesis.
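A rough sketch of how the two factors could be operationalized, with hypothetical data (the embedding matrix, frequency counts, and neighborhood size are placeholders; this is not the paper's code):

```python
# Illustrative sketch only: semantic sparsity = mean distance to the nearest
# neighbors in an embedding space; neighbor frequency growth = mean relative
# frequency change of those neighbors over time.
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 50))            # hypothetical word vectors
freq_t0 = rng.integers(1, 1000, size=100)          # neighbor frequencies, year t
freq_t1 = freq_t0 + rng.integers(0, 50, size=100)  # frequencies a year later

def neighborhood_stats(point, k=10):
    """Sparsity and frequency growth of the k nearest semantic neighbors."""
    dists = np.linalg.norm(embeddings - point, axis=1)
    nn = np.argsort(dists)[:k]
    sparsity = dists[nn].mean()  # larger = sparser semantic region
    growth = ((freq_t1[nn] - freq_t0[nn]) / freq_t0[nn]).mean()
    return sparsity, growth

print(neighborhood_stats(rng.normal(size=50)))
```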
arXiv Detail & Related papers (2020-01-21T19:09:49Z)