Linear-Time Modeling of Linguistic Structure: An Order-Theoretic
Perspective
- URL: http://arxiv.org/abs/2305.15057v2
- Date: Tue, 12 Dec 2023 08:46:11 GMT
- Title: Linear-Time Modeling of Linguistic Structure: An Order-Theoretic
Perspective
- Authors: Tianyu Liu, Afra Amini, Mrinmaya Sachan, Ryan Cotterell
- Abstract summary: Tasks that model the relation between pairs of tokens in a string are a vital part of understanding natural language.
We show that these exhaustive comparisons can be avoided, and, moreover, the complexity can be reduced to linear by casting the relation between tokens as a partial order over the string.
Our method predicts real numbers for each token in a string in parallel and sorts the tokens accordingly, resulting in total orders of the tokens in the string.
- Score: 97.57162770792182
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tasks that model the relation between pairs of tokens in a string are a vital
part of understanding natural language. Such tasks, in general, require
exhaustive pair-wise comparisons of tokens, thus having a quadratic runtime
complexity in the length of the string. We show that these exhaustive
comparisons can be avoided, and, moreover, the complexity of such tasks can be
reduced to linear by casting the relation between tokens as a partial order
over the string. Our method predicts real numbers for each token in a string in
parallel and sorts the tokens accordingly, resulting in total orders of the
tokens in the string. Each total order implies a set of arcs oriented from
smaller to greater tokens, sorted by their predicted numbers. The intersection
of total orders results in a partial order over the set of tokens in the
string, which is then decoded into a directed graph representing the desired
linguistic structure. Our experiments on dependency parsing and coreference
resolution show that our method achieves state-of-the-art or comparable
performance. Moreover, the linear complexity and parallelism of our method
double the speed of graph-based coreference resolution models, and bring a
10-times speed-up over graph-based dependency parsers.
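As a rough illustration of the construction (not the authors' implementation), the sketch below uses random numbers in place of the model's per-token predictions and intersects K total orders into a partial order. The pairwise loop only spells out the definition of the intersection; the method itself just sorts the predicted numbers, which is what removes the exhaustive comparison step.

```python
# Minimal sketch: K real numbers per token induce K total orders;
# their intersection is a partial order over the tokens.
# Random numbers below stand in for the model's parallel predictions.
import numpy as np

def partial_order_arcs(scores: np.ndarray) -> set:
    """scores: (n_tokens, K). Token i precedes token j in the partial
    order iff i precedes j in every one of the K total orders, i.e.
    scores[i, k] < scores[j, k] for all k. The quadratic loop here is
    only for exposition; the paper obtains the orders by sorting."""
    n, _ = scores.shape
    return {(i, j) for i in range(n) for j in range(n)
            if i != j and np.all(scores[i] < scores[j])}

rng = np.random.default_rng(0)
tokens = ["The", "cat", "sat"]
scores = rng.normal(size=(len(tokens), 2))   # K = 2 total orders
print(partial_order_arcs(scores))            # arcs of the partial order
```

The resulting arc set is then decoded into the target directed graph (e.g., a dependency tree), per the abstract.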
Related papers
- Tokenization as Finite-State Transduction [24.19959327497118]
We introduce a finite-state framework which can efficiently encode all possible tokenizations of a regular language.
We show that Byte-Pair Encoding (BPE) and MaxMatch (WordPiece) fit within this framework.
An application of this is to guided generation, where the outputs of a language model are constrained to match some pattern.
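As a toy illustration of what such a framework encodes (the vocabulary below is hypothetical), the sketch enumerates every segmentation of a string into vocabulary pieces; a finite-state transducer represents this same set of tokenizations compactly instead of by explicit enumeration.

```python
# Enumerate all tokenizations of a string over a toy vocabulary.
# An FST encodes exactly this set of segmentations, but compactly.
def all_tokenizations(s: str, vocab: set) -> list:
    if not s:
        return [[]]
    results = []
    for end in range(1, len(s) + 1):
        piece = s[:end]
        if piece in vocab:                      # a valid next piece
            for rest in all_tokenizations(s[end:], vocab):
                results.append([piece] + rest)
    return results

vocab = {"un", "do", "undo", "u", "n", "d", "o"}
for tok in all_tokenizations("undo", vocab):
    print(tok)
```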
arXiv Detail & Related papers (2024-10-21T07:10:07Z)
- An Expression Tree Decoding Strategy for Mathematical Equation Generation [24.131972875875952]
Existing approaches can be broadly categorized into token-level and expression-level generation.
Expression-level methods generate each expression one by one.
Each expression represents a solving step, and there naturally exist parallel or dependent relations between these steps.
We integrate tree structure into the expression-level generation and advocate an expression tree decoding strategy.
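A minimal sketch of that layering idea, with an invented three-step equation: steps that do not consume each other's results become ready together and can be decoded in parallel.

```python
# Solving steps form a DAG; independent steps share a layer and can be
# decoded in parallel. The step names and dependencies are invented.
from graphlib import TopologicalSorter

steps = {
    "e1": set(),          # e.g. e1 = price * quantity
    "e2": set(),          # e.g. e2 = discount * quantity
    "e3": {"e1", "e2"},   # e.g. e3 = e1 - e2 (depends on both)
}
ts = TopologicalSorter(steps)
ts.prepare()
while ts.is_active():
    layer = list(ts.get_ready())   # mutually independent steps
    print("decode in parallel:", layer)
    ts.done(*layer)
```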
arXiv Detail & Related papers (2023-10-14T17:00:28Z)
- Tree Cross Attention [59.8891512435847]
Tree Cross Attention (TCA) is a module based on Cross Attention that retrieves information from only a logarithmic $\mathcal{O}(\log N)$ number of tokens when performing inference.
We show that TCA performs comparably to Cross Attention across various classification and uncertainty regression tasks while being significantly more token-efficient.
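The sketch below illustrates the retrieval pattern only, not the paper's architecture: tokens sit at the leaves of a balanced binary tree, each internal node carries a summary of its subtree (here simply a mean, as a placeholder), and a query walks root-to-leaf by picking the better-scoring child, so inference touches O(log N) nodes rather than all N tokens.

```python
# Toy O(log N) retrieval over a binary tree of token vectors.
import numpy as np

def build_tree(leaves):
    # nodes are ("leaf", index, vector) or ("inner", (left, right), summary)
    nodes = [("leaf", i, v) for i, v in enumerate(leaves)]
    while len(nodes) > 1:
        paired = [("inner", (a, b), (a[2] + b[2]) / 2)
                  for a, b in zip(nodes[::2], nodes[1::2])]
        if len(nodes) % 2:
            paired.append(nodes[-1])    # odd node carried up a level
        nodes = paired
    return nodes[0]

def retrieve(query, node):
    while node[0] == "inner":
        left, right = node[1]
        # descend toward the child whose summary scores higher
        node = left if query @ left[2] >= query @ right[2] else right
    return node[1]                      # index of the selected token

rng = np.random.default_rng(0)
tokens = list(rng.normal(size=(8, 4)))
print(retrieve(rng.normal(size=4), build_tree(tokens)))
```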
arXiv Detail & Related papers (2023-09-29T16:50:23Z)
- Hexatagging: Projective Dependency Parsing as Tagging [63.5392760743851]
We introduce a novel dependency parser, the hexatagger, that constructs dependency trees by tagging the words in a sentence with elements from a finite set of possible tags.
Our approach is fully parallelizable at training time, i.e., the structure-building actions needed to build a dependency parse can be predicted in parallel to each other.
We achieve state-of-the-art performance of 96.4 LAS and 97.4 UAS on the Penn Treebank test set.
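A toy sketch of why tagging-based parsing parallelizes (the tag inventory size and tensor shapes below are placeholders, not the actual hexatag set): given encoder states, every word's tag scores are computed independently, so all structure-building actions are predicted at once.

```python
# Parallel tag prediction: one independent argmax per word.
import numpy as np

NUM_TAGS = 6                        # placeholder finite tag inventory
rng = np.random.default_rng(0)
hidden = rng.normal(size=(5, 16))   # encoder states for a 5-word sentence
proj = rng.normal(size=(16, NUM_TAGS))

logits = hidden @ proj              # (5, NUM_TAGS): scores for every word
tags = logits.argmax(axis=-1)       # all tags predicted in parallel
print(tags)                         # a linear pass then decodes the tag
                                    # sequence into a dependency tree
```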
arXiv Detail & Related papers (2023-06-08T18:02:07Z)
- Compositional Generalization without Trees using Multiset Tagging and Latent Permutations [121.37328648951993]
We phrase semantic parsing as a two-step process: we first tag each input token with a multiset of output tokens.
Then we arrange the tokens into an output sequence using a new way of parameterizing and predicting permutations.
Our model outperforms pretrained seq2seq models and prior work on realistic semantic parsing tasks.
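The toy example below walks through the two steps, with a hand-picked tagging and permutation standing in for the model's predictions.

```python
# Step 1: tag each input token with a (possibly empty) multiset of
# output tokens. Step 2: permute the collected tokens into the output.
inputs = ["to", "NYC", "book", "a", "flight"]
multisets = [[], ["'NYC'"], ["book("], [], ["flight", ",", ")"]]
flat = [tok for ms in multisets for tok in ms]   # collected output tokens
perm = [1, 2, 3, 0, 4]                           # hand-picked permutation
print("".join(flat[i] for i in perm))            # book(flight,'NYC')
```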
arXiv Detail & Related papers (2023-05-26T14:09:35Z)
- Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation [61.50286000143233]
ChainCoder is a program synthesis language model that generates Python code progressively.
A tailored transformer architecture is leveraged to jointly encode the natural language descriptions and syntactically aligned I/O data samples.
arXiv Detail & Related papers (2023-04-28T01:47:09Z)
- A Conditional Splitting Framework for Efficient Constituency Parsing [14.548146390081778]
We introduce a generic seq2seq parsing framework that casts constituency parsing problems (syntactic and discourse parsing) into a series of conditional splitting decisions.
Our parsing model estimates the conditional probability distribution of possible splitting points in a given text span and supports efficient top-down decoding.
For discourse analysis we show that in our formulation, discourse segmentation can be framed as a special case of parsing.
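A compact sketch of top-down splitting, with random scores standing in for the model's conditional distribution over split points: each span recursively commits to its best split until only single tokens remain.

```python
# Top-down decoding by conditional splitting over spans.
import numpy as np

rng = np.random.default_rng(0)
words = ["She", "enjoys", "reading", "books"]
n = len(words)
score = rng.normal(size=(n + 1, n + 1, n + 1))   # score[i, j, k]: split
                                                 # span [i, j) at point k

def split(i: int, j: int):
    """Parse the half-open span [i, j) into a binary bracketing."""
    if j - i == 1:
        return words[i]
    k = max(range(i + 1, j), key=lambda k: score[i, j, k])
    return (split(i, k), split(k, j))

print(split(0, n))
```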
arXiv Detail & Related papers (2021-06-30T00:36:34Z)
- An Iterative Contextualization Algorithm with Second-Order Attention [0.40611352512781856]
We show how to combine the representations of words that make up a sentence into a cohesive whole.
Our algorithm starts with a presumably erroneous value of the context, and adjusts this value with respect to the tokens at hand.
Our models report strong results in several well-known text classification tasks.
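A bare-bones sketch of the iterative scheme, with illustrative dimensions and a plain softmax weighting standing in for the paper's second-order attention: the context starts as a crude average and is repeatedly adjusted against the tokens.

```python
# Iteratively refine a sentence context vector against its tokens.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 8))   # token representations
context = tokens.mean(axis=0)      # initial, "presumably erroneous" value
for _ in range(10):
    weights = softmax(tokens @ context)   # attend over the tokens
    context = weights @ tokens            # adjusted context vector
print(context.round(2))
```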
arXiv Detail & Related papers (2021-03-03T05:34:50Z)
- A Simple Global Neural Discourse Parser [61.728994693410954]
We propose a simple chart-based neural discourse parser that does not require any manually-crafted features and is based on learned span representations only.
We empirically demonstrate that our model achieves the best performance among global parsers, and performance comparable to state-of-the-art greedy parsers.
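The sketch below shows what global chart decoding means in this setting: a CKY-style dynamic program (random span scores stand in for learned span representations) finds the highest-scoring binary tree overall, instead of committing greedily to one split at a time.

```python
# CKY-style global search for the best-scoring binary tree.
import numpy as np

rng = np.random.default_rng(0)
n = 4                                    # number of elementary units
span_score = rng.normal(size=(n + 1, n + 1))

best = {(i, i + 1): (span_score[i, i + 1], None) for i in range(n)}
for length in range(2, n + 1):
    for i in range(n - length + 1):
        j = i + length
        k = max(range(i + 1, j),
                key=lambda k: best[(i, k)][0] + best[(k, j)][0])
        best[(i, j)] = (span_score[i, j]
                        + best[(i, k)][0] + best[(k, j)][0], k)
print(best[(0, n)])                      # best total score and top split
```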
arXiv Detail & Related papers (2020-09-02T19:28:40Z)