Empirical Sufficiency Lower Bounds for Language Modeling with
Locally-Bootstrapped Semantic Structures
- URL: http://arxiv.org/abs/2305.18915v1
- Date: Tue, 30 May 2023 10:09:48 GMT
- Title: Empirical Sufficiency Lower Bounds for Language Modeling with
Locally-Bootstrapped Semantic Structures
- Authors: Jakob Prange and Emmanuele Chersoni
- Abstract summary: We design a concise binary vector representation of semantic structure at the lexical level.
We evaluate in-depth how good an incremental tagger needs to be in order to achieve better-than-baseline performance.
- Score: 4.29295838853865
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this work we build upon negative results from an attempt at language
modeling with predicted semantic structure, in order to establish empirical
lower bounds on what could have made the attempt successful. More specifically,
we design a concise binary vector representation of semantic structure at the
lexical level and evaluate in-depth how good an incremental tagger needs to be
in order to achieve better-than-baseline performance with an end-to-end
semantic-bootstrapping language model. We envision such a system as consisting
of a (pretrained) sequential-neural component and a hierarchical-symbolic
component working together to generate text with low surprisal and high
linguistic interpretability. We find that (a) dimensionality of the semantic
vector representation can be dramatically reduced without losing its main
advantages and (b) lower bounds on prediction quality cannot be established via
a single score alone, but need to take the distributions of signal and noise
into account.
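The abstract's "concise binary vector representation of semantic structure at the lexical level" can be illustrated with a toy sketch (the feature names here are hypothetical, not the paper's actual tag set): each word receives a fixed-width bit vector of coarse semantic properties, and bit-level distance between gold and predicted vectors gives one simple measure of tagger noise.

```python
# Toy sketch of a binary vector representation of lexical semantic
# structure. The FEATURES below are invented for illustration only.
FEATURES = ["is_predicate", "is_argument", "is_modifier",
            "opens_scope", "closes_scope", "is_quantifier"]

def encode(tags):
    """Map a set of active semantic tags to a fixed-width binary vector."""
    return [1 if f in tags else 0 for f in FEATURES]

def hamming(u, v):
    """Bit-level distance: a simple noise measure for predicted vectors."""
    return sum(a != b for a, b in zip(u, v))

gold = encode({"is_predicate", "opens_scope"})
pred = encode({"is_predicate"})  # tagger missed one feature
print(gold)                  # [1, 0, 0, 1, 0, 0]
print(hamming(gold, pred))   # 1
```

Point (b) of the abstract suggests that a single aggregate of such distances is not enough; one would need to look at how the errors are distributed.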
Related papers
- On Eliciting Syntax from Language Models via Hashing [19.872554909401316]
Unsupervised parsing aims to infer syntactic structure from raw text.
In this paper, we explore the possibility of leveraging hashing to deduce parsing trees from raw text.
We show that our method is effective and efficient enough to acquire high-quality parsing trees from pre-trained language models at a low cost.
arXiv Detail & Related papers (2024-10-05T08:06:19Z)
- Semantic Loss Functions for Neuro-Symbolic Structured Prediction [74.18322585177832]
We discuss the semantic loss, which injects knowledge about such structure, defined symbolically, into training.
It is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby.
It can be combined with both discriminative and generative neural models.
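As a minimal illustration of the semantic-loss idea described above (a generic sketch, not this paper's implementation), consider the constraint "exactly one output variable is true": the loss is the negative log-probability that independent Bernoulli outputs satisfy the constraint, which depends only on what the constraint means, not on how it is written.

```python
import math

def semantic_loss_exactly_one(probs):
    """Semantic loss for the 'exactly one variable is true' constraint:
    -log P(constraint holds) under independent Bernoulli outputs."""
    sat = sum(p * math.prod(1 - q for j, q in enumerate(probs) if j != i)
              for i, p in enumerate(probs))
    return -math.log(sat)

# A confident, constraint-satisfying prediction is penalized less
# than a diffuse one:
print(semantic_loss_exactly_one([0.9, 0.05, 0.05]))  # small loss
print(semantic_loss_exactly_one([0.5, 0.5, 0.5]))    # larger loss
```

Because the loss is computed from the model's output probabilities, it can be added to either a discriminative or a generative training objective, as the summary notes.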
arXiv Detail & Related papers (2024-05-12T22:18:25Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning in which the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- Constructing Word-Context-Coupled Space Aligned with Associative Knowledge Relations for Interpretable Language Modeling [0.0]
The black-box structure of the deep neural network in pre-trained language models seriously limits the interpretability of the language modeling process.
A Word-Context-Coupled Space (W2CSpace) is proposed, which aligns uninterpretable neural representations with interpretable statistical logic.
Our language model achieves better performance and highly credible interpretability compared to related state-of-the-art methods.
arXiv Detail & Related papers (2023-05-19T09:26:02Z)
- On Robustness of Prompt-based Semantic Parsing with Large Pre-trained Language Model: An Empirical Study on Codex [48.588772371355816]
This paper presents the first empirical study on the adversarial robustness of a large prompt-based language model of code, codex.
Our results demonstrate that the state-of-the-art (SOTA) code-language models are vulnerable to carefully crafted adversarial examples.
arXiv Detail & Related papers (2023-01-30T13:21:00Z)
- Sentence Representation Learning with Generative Objective rather than Contrastive Objective [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative approach achieves substantial performance improvements, outperforming current state-of-the-art contrastive methods.
arXiv Detail & Related papers (2022-10-16T07:47:46Z)
- Transferring Semantic Knowledge Into Language Encoders [6.85316573653194]
We introduce semantic form mid-tuning, an approach for transferring semantic knowledge from semantic meaning representations into language encoders.
We show that this alignment can be learned implicitly via classification or directly via triplet loss.
Our method yields language encoders that demonstrate improved predictive performance across inference, reading comprehension, textual similarity, and other semantic tasks.
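The triplet loss mentioned above can be sketched generically (this is the standard triplet margin formulation over plain-list embeddings, not necessarily this paper's exact setup): the anchor is pulled toward a semantically matching positive and pushed away from a negative by at least a margin.

```python
def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss over plain-list embeddings
    (a generic sketch; the paper's setup may differ)."""
    dist = lambda u, v: sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)

# Negative already far beyond the margin: no penalty.
print(triplet_loss([0.0, 0.0], [0.1, 0.0], [2.0, 0.0]))  # 0.0
# Negative nearly as close as the positive: positive penalty.
print(triplet_loss([0.0, 0.0], [0.1, 0.0], [0.2, 0.0]))
```

Minimizing this over (sentence, meaning-representation) triples would align the encoder's space with the semantic representations implicitly, as the summary describes.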
arXiv Detail & Related papers (2021-10-14T14:11:12Z)
- Hierarchical Conditional End-to-End ASR with CTC and Multi-Granular Subword Units [19.668440671541546]
In end-to-end automatic speech recognition, a model is expected to implicitly learn representations suitable for recognizing a word-level sequence.
We propose a hierarchical conditional model based on connectionist temporal classification (CTC).
Experimental results on LibriSpeech-100h, 960h and TEDLIUM2 demonstrate that the proposed model improves over a standard CTC-based model.
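As a quick refresher on the CTC mechanism behind this model (a generic textbook sketch, not this paper's architecture): CTC decoding collapses a frame-level alignment by merging consecutive repeats and then removing blank symbols.

```python
def ctc_collapse(path, blank="-"):
    """Collapse a CTC alignment string: merge consecutive repeats,
    then drop blank symbols."""
    out = []
    prev = None
    for s in path:
        if s != prev and s != blank:
            out.append(s)
        prev = s
    return "".join(out)

print(ctc_collapse("hh-e--ll-lo"))  # → hello
```

Note that a blank between two identical symbols keeps them distinct, which is how CTC represents doubled letters like the "ll" above.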
arXiv Detail & Related papers (2021-10-08T13:15:58Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
- SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
- Discontinuous Constituent Parsing with Pointer Networks [0.34376560669160383]
Discontinuous constituent trees are crucial for representing all grammatical phenomena of languages such as German.
Recent advances in dependency parsing have shown that Pointer Networks excel in efficiently parsing syntactic relations between words in a sentence.
We propose a novel neural network architecture that is able to generate the most accurate discontinuous constituent representations.
arXiv Detail & Related papers (2020-02-05T15:12:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.