Scalable Learning of Latent Language Structure With Logical Offline
Cycle Consistency
- URL: http://arxiv.org/abs/2305.20018v1
- Date: Wed, 31 May 2023 16:47:20 GMT
- Title: Scalable Learning of Latent Language Structure With Logical Offline
Cycle Consistency
- Authors: Maxwell Crouse, Ramon Astudillo, Tahira Naseem, Subhajit Chaudhury,
Pavan Kapanipathi, Salim Roukos, Alexander Gray
- Abstract summary: Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
- Score: 71.42261918225773
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Logical Offline Cycle Consistency Optimization (LOCCO), a
scalable, semi-supervised method for training a neural semantic parser.
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic
parser being trained is used to generate annotations for unlabeled text that
are then used as new supervision. To increase the quality of annotations, our
method utilizes a count-based prior over valid formal meaning representations
and a cycle-consistency score produced by a neural text generation model as
additional signals. Both the prior and semantic parser are updated in an
alternating fashion from full passes over the training data, which can be seen as
approximating the marginalization of latent structures through stochastic
variational inference. The use of a count-based prior, frozen text generation
model, and offline annotation process yields an approach with negligible
complexity and latency increases as compared to conventional self-learning. As
an added bonus, the annotations produced by LOCCO can be trivially repurposed
to train a neural text generation model. We demonstrate the utility of LOCCO on
the well-known WebNLG benchmark where we obtain an improvement of 2 points
against a self-learning parser under equivalent conditions, an improvement of
1.3 points against the previous state-of-the-art parser, and competitive text
generation performance in terms of BLEU score.
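To make the training recipe above concrete, the following is a minimal sketch of one LOCCO-style offline annotation pass, written only from the description in this abstract. The callables parser_sample, generator_logprob, and canonical, and the smoothing constant alpha, are hypothetical stand-ins rather than the paper's actual implementation.

```python
import math
from collections import Counter

def locco_offline_pass(unlabeled_texts, parser_sample, generator_logprob,
                       canonical, prior_counts, num_samples=5, alpha=1.0):
    """One offline pass: score parser samples for each unlabeled sentence and
    keep the best-scoring parse as a pseudo-annotation.

    parser_sample(text, k)         -> list of (parse, parse_logprob) candidates
    generator_logprob(parse, text) -> log p(text | parse) under a frozen generator
    canonical(parse)               -> hashable form used by the count-based prior
    prior_counts                   -> Counter over canonical forms from the last pass
    """
    # Add-alpha smoothed normalizer for the count-based prior.
    total = sum(prior_counts.values()) + alpha * max(len(prior_counts), 1)
    annotations, new_counts = [], Counter()

    for text in unlabeled_texts:
        best_parse, best_score = None, -math.inf
        for parse, parse_lp in parser_sample(text, num_samples):
            # Count-based prior over valid formal meaning representations.
            prior_lp = math.log((prior_counts[canonical(parse)] + alpha) / total)
            # Cycle-consistency: how well the frozen generator reconstructs the text.
            cycle_lp = generator_logprob(parse, text)
            score = parse_lp + prior_lp + cycle_lp
            if score > best_score:
                best_parse, best_score = parse, score
        if best_parse is not None:
            annotations.append((text, best_parse))
            new_counts[canonical(best_parse)] += 1

    return annotations, new_counts
```

In a full run, the selected annotations would be used as supervision to retrain the parser, the returned counts would become the prior for the next pass, and the same (text, parse) pairs could be reused in the reverse direction to train a neural text generation model, as the abstract notes.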
Related papers
- On Eliciting Syntax from Language Models via Hashing [19.872554909401316]
Unsupervised parsing aims to infer syntactic structure from raw text.
In this paper, we explore the possibility of leveraging pre-trained language models to deduce parsing trees from raw text.
We show that our method is effective and efficient enough to acquire high-quality parsing trees from pre-trained language models at a low cost.
arXiv Detail & Related papers (2024-10-05T08:06:19Z)
- Reconsidering Degeneration of Token Embeddings with Definitions for Encoder-based Pre-trained Language Models [20.107727903240065]
We propose DefinitionEMB to re-construct isotropically distributed and semantics-related token embeddings for encoder-based language models.
Our experiments demonstrate the effectiveness of leveraging definitions from Wiktionary to re-construct such embeddings.
arXiv Detail & Related papers (2024-08-02T15:00:05Z)
- Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights to the distantly supervised labels based on the training dynamics of the classifiers.
By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data (a generic sketch of this weighting idea appears after this list).
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
arXiv Detail & Related papers (2024-06-20T18:35:47Z)
- Contrastive and Consistency Learning for Neural Noisy-Channel Model in Spoken Language Understanding [1.07288078404291]
We propose a natural language understanding approach based on Automatic Speech Recognition (ASR), improving a noisy-channel model to handle transcription inconsistencies caused by ASR errors.
Experiments on four benchmark datasets show that Contrastive and Consistency Learning (CCL) outperforms existing methods.
arXiv Detail & Related papers (2024-05-23T23:10:23Z)
- Bit Cipher -- A Simple yet Powerful Word Representation System that Integrates Efficiently with Language Models [4.807347156077897]
Bit-cipher is a word representation system that eliminates the need for backpropagation and hyper-efficient dimensionality reduction techniques.
We perform probing experiments on part-of-speech (POS) tagging and named entity recognition (NER) to assess bit-cipher's competitiveness with classic embeddings.
By replacing embedding layers with cipher embeddings, our experiments illustrate the notable efficiency of cipher in accelerating the training process and attaining better optima.
arXiv Detail & Related papers (2023-11-18T08:47:35Z)
- Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction [61.16125290912494]
$\text{EVL}_\text{Gen}$ is a framework designed for the pre-training of visually conditioned language generation models.
We show that our approach accelerates the training of vision-language models by a factor of 5 without a noticeable impact on overall performance.
arXiv Detail & Related papers (2023-10-05T03:40:06Z)
- Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training [102.14558233502514]
Masked prediction pre-training has seen remarkable progress in self-supervised learning (SSL) for speech recognition.
We propose two supervision-guided codebook generation approaches to improve automatic speech recognition (ASR) performance.
arXiv Detail & Related papers (2022-06-21T06:08:30Z)
- COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining [59.169836983883656]
COCO-LM is a new self-supervised learning framework that pretrains Language Models by COrrecting challenging errors and COntrasting text sequences.
COCO-LM employs an auxiliary language model to mask-and-predict tokens in original text sequences.
Our analyses reveal that COCO-LM's advantages come from its challenging training signals, more contextualized token representations, and regularized sequence representations.
arXiv Detail & Related papers (2021-02-16T22:24:29Z)
- Phoneme Based Neural Transducer for Large Vocabulary Speech Recognition [41.92991390542083]
We present a simple, novel and competitive approach for phoneme-based neural transducer modeling.
A phonetic context size of one is shown to be sufficient for the best performance.
The overall performance of our best model is comparable to state-of-the-art (SOTA) results for the TED-LIUM Release 2 and Switchboard corpora.
arXiv Detail & Related papers (2020-10-30T16:53:29Z)
- SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
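The co-training entry above contrasts soft importance weighting with hard confidence-threshold filtering of distantly supervised labels. Below is a generic sketch of that contrast; it is not the cited paper's exact formulation, and the weight used here (an example's mean predicted confidence over recent epochs) is only a crude stand-in for the classifiers' training dynamics.

```python
# A generic sketch contrasting hard confidence filtering with soft importance
# weighting of distantly supervised (pseudo-labeled) examples. NOT the cited
# paper's exact method; the weight is just mean recent confidence per example.

def threshold_filter(examples, confidence_history, threshold=0.9):
    """Baseline: drop every example whose latest confidence misses a fixed threshold."""
    return [ex for ex, hist in zip(examples, confidence_history) if hist[-1] >= threshold]

def dynamics_weights(confidence_history):
    """Soft alternative: a weight in [0, 1] per example from its confidence trajectory."""
    return [sum(hist) / len(hist) for hist in confidence_history]

def weighted_loss(per_example_losses, weights):
    """Importance-weighted objective: no example is discarded, noisy labels just count less."""
    total = sum(weights) or 1.0
    return sum(w * l for w, l in zip(weights, per_example_losses)) / total
```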