SPINDLE: Spinning Raw Text into Lambda Terms with Graph Attention
- URL: http://arxiv.org/abs/2302.12050v1
- Date: Thu, 23 Feb 2023 14:22:45 GMT
- Title: SPINDLE: Spinning Raw Text into Lambda Terms with Graph Attention
- Authors: Konstantinos Kogkalidis, Michael Moortgat, Richard Moot
- Abstract summary: The module transforms raw text input to programs for meaning composition, expressed as lambda terms.
Its output consists of hi-res derivations of a multimodal type-logical grammar.
- Score: 0.8379286663107844
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper describes SPINDLE - an open source Python module implementing an
efficient and accurate parser for written Dutch that transforms raw text input
to programs for meaning composition, expressed as λ terms. The parser
integrates a number of breakthrough advances made in recent years. Its output
consists of hi-res derivations of a multimodal type-logical grammar, capturing
two orthogonal axes of syntax, namely deep function-argument structures and
dependency relations. These are produced by three interdependent systems: a
static type-checker asserting the well-formedness of grammatical analyses, a
state-of-the-art, structurally-aware supertagger based on heterogeneous graph
convolutions, and a massively parallel proof search component based on Sinkhorn
iterations. Packed in the software are also handy utilities and extras for
proof visualization and inference, intended to facilitate end-user utilization.
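The abstract names Sinkhorn iterations as the basis of the proof search component. As a rough illustration of that general technique only, the sketch below alternately normalizes the rows and columns of a score matrix until it is approximately doubly stochastic, so that each row can be read as a soft assignment to a column. This is not SPINDLE's implementation: the function names (`logsumexp`, `sinkhorn`), the toy score matrix, and the iteration count are all assumptions made for the example.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def sinkhorn(scores, n_iters=50):
    """Hypothetical helper: Sinkhorn normalization of a square score matrix.

    Works in log space and alternates row and column normalization.
    Returns a matrix whose rows and columns each sum to roughly 1.
    """
    log_p = [row[:] for row in scores]
    n_cols = len(log_p[0])
    for _ in range(n_iters):
        # Row step: make every row sum to 1 (in probability space).
        normalized = []
        for row in log_p:
            row_lse = logsumexp(row)
            normalized.append([x - row_lse for x in row])
        log_p = normalized
        # Column step: make every column sum to 1.
        col_lse = [logsumexp([row[j] for row in log_p]) for j in range(n_cols)]
        log_p = [[x - col_lse[j] for j, x in enumerate(row)] for row in log_p]
    return [[math.exp(x) for x in row] for row in log_p]


# Toy scores (made up): entry [i][j] is how strongly item i prefers slot j.
toy_scores = [
    [2.0, 0.1, 0.0],
    [0.0, 3.0, 0.2],
    [0.1, 0.0, 2.5],
]
soft_matching = sinkhorn(toy_scores)
# Read off a hard assignment by taking the best column for each row.
assignment = [max(range(len(row)), key=row.__getitem__) for row in soft_matching]
```

Because every step is a simple normalization over a whole matrix, the procedure maps naturally onto batched tensor operations, which is one plausible reading of the "massively parallel" wording in the abstract.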
Related papers
- Integrating Supertag Features into Neural Discontinuous Constituent Parsing [0.0]
Traditional views of constituency demand that constituents consist of adjacent words, common in languages like German.
Transition-based parsing produces trees given raw text input using supervised learning on large annotated corpora.
arXiv Detail & Related papers (2024-10-11T12:28:26Z)
- PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer [51.260384040953326]
Handwritten Mathematical Expression Recognition (HMER) has wide applications in human-machine interaction scenarios.
We propose a position forest transformer (PosFormer) for HMER, which jointly optimizes two tasks: expression recognition and position recognition.
PosFormer consistently outperforms state-of-the-art methods, with gains of 2.03%/1.22%/2, 1.83%, and 4.62% on datasets.
arXiv Detail & Related papers (2024-07-10T15:42:58Z)
- OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition [79.852642726105]
We propose a unified paradigm for parsing visually-situated text across diverse scenarios.
Specifically, we devise a universal model, called Omni, which can simultaneously handle three typical visually-situated text parsing tasks.
In Omni, all tasks share the unified encoder-decoder architecture, the unified objective point-conditioned text generation, and the unified input representation.
arXiv Detail & Related papers (2024-03-28T03:51:14Z)
- An Autoregressive Text-to-Graph Framework for Joint Entity and Relation Extraction [4.194768796374315]
We propose a novel method for joint entity and relation extraction from unstructured text by framing it as a conditional sequence generation problem.
It generates a linearized graph where nodes represent text spans and edges represent relation triplets.
Our method employs a transformer encoder-decoder architecture with pointing mechanism on a dynamic vocabulary of spans and relation types.
arXiv Detail & Related papers (2024-01-02T18:32:14Z)
- Incorporating Constituent Syntax for Coreference Resolution [50.71868417008133]
We propose a graph-based method to incorporate constituent syntactic structures.
We also explore utilising higher-order neighbourhood information to encode rich structures in constituent trees.
Experiments on the English and Chinese portions of OntoNotes 5.0 benchmark show that our proposed model either beats a strong baseline or achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-02-22T07:40:42Z)
- BASS: Boosting Abstractive Summarization with Unified Semantic Graph [49.48925904426591]
BASS is a framework for Boosting Abstractive Summarization based on a unified Semantic graph.
A graph-based encoder-decoder model is proposed to improve both the document representation and summary generation process.
Empirical results show that the proposed architecture brings substantial improvements for both long-document and multi-document summarization tasks.
arXiv Detail & Related papers (2021-05-25T16:20:48Z)
- Syntactic representation learning for neural network based TTS with syntactic parse tree traversal [49.05471750563229]
We propose a syntactic representation learning method based on syntactic parse tree to automatically utilize the syntactic structure information.
Experimental results demonstrate the effectiveness of our proposed approach.
For sentences with multiple syntactic parse trees, prosodic differences can be clearly perceived from the synthesized speeches.
arXiv Detail & Related papers (2020-12-13T05:52:07Z)
- Interactive Text Graph Mining with a Prolog-based Dialog Engine [8.663755202726795]
We design a Prolog-based dialog engine that explores interactively a ranked fact database extracted from a text document.
We take advantage of the implicit semantic information that dependency links and WordNet bring in the form of subject-verb-object, is-a and part-of relations.
arXiv Detail & Related papers (2020-07-31T03:29:49Z)
- pyBART: Evidence-based Syntactic Transformations for IE [52.93947844555369]
We present pyBART, an easy-to-use open-source Python library for converting English UD trees to Enhanced UD graphs or to our representation.
When evaluated in a pattern-based relation extraction scenario, our representation results in higher extraction scores than Enhanced UD, while requiring fewer patterns.
arXiv Detail & Related papers (2020-05-04T07:38:34Z)
- Selective Attention Encoders by Syntactic Graph Convolutional Networks for Document Summarization [21.351111598564987]
We propose a graph to connect the parsing trees from the sentences in a document and utilize the stacked graph convolutional networks (GCNs) to learn the syntactic representation for a document.
The proposed GCN-based selective attention approach outperforms the baselines and achieves state-of-the-art performance on the dataset.
arXiv Detail & Related papers (2020-03-18T01:30:02Z)
- ÆTHEL: Automatically Extracted Typelogical Derivations for Dutch [0.8379286663107844]
AETHEL is a semantic compositionality dataset for written Dutch.
AETHEL's types and derivations are obtained by means of an extraction algorithm applied to the syntactic analyses of LASSY Small.
arXiv Detail & Related papers (2019-12-29T11:31:11Z)