Transfer of Structural Knowledge from Synthetic Languages
- URL: http://arxiv.org/abs/2505.15769v1
- Date: Wed, 21 May 2025 17:18:51 GMT
- Title: Transfer of Structural Knowledge from Synthetic Languages
- Authors: Mikhail Budnikov, Ivan Yamshchikov
- Abstract summary: This work explores transfer learning from several synthetic languages to English. We introduce a new synthetic language that leads to better transfer to English than the languages used in previous research. We use the Tiny-Cloze Benchmark to evaluate fine-tuned models in several domains, demonstrating that fine-tuning on a new synthetic language allows for better performance on a variety of tasks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work explores transfer learning from several synthetic languages to English. We investigate the structure of the embeddings in the fine-tuned models, the information they contain, and the capabilities of the fine-tuned models on simple linguistic tasks. We also introduce a new synthetic language that leads to better transfer to English than the languages used in previous research. Finally, we introduce the Tiny-Cloze Benchmark, a new synthetic benchmark for natural language understanding that is more informative for less powerful models. We use the Tiny-Cloze Benchmark to evaluate fine-tuned models in several domains, demonstrating that fine-tuning on a new synthetic language allows for better performance on a variety of tasks.
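As a rough illustration of the evaluation setup described in the abstract, the minimal sketch below scores cloze-style items by comparing the log-likelihood a causal language model assigns to each candidate completion. The model name, helper functions, and example item are illustrative assumptions, not the paper's actual benchmark or code.

```python
# A minimal sketch, assuming a Hugging Face causal LM: score cloze items by
# comparing candidate completions' summed log-probabilities. Names and the
# example item are illustrative assumptions, not the paper's implementation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; the paper targets much smaller models
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()


def completion_logprob(prompt: str, completion: str) -> float:
    """Sum of token log-probabilities of `completion` given `prompt`."""
    # Assumes prompt+completion tokenization starts with the same tokens as
    # the prompt alone (true for typical BPE with a leading-space completion).
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + completion, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Row i of log_probs is the distribution over token i+1 given tokens 0..i.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, prompt_len:]  # completion tokens only
    rows = log_probs[prompt_len - 1 : prompt_len - 1 + len(targets)]
    return rows.gather(1, targets.unsqueeze(1)).sum().item()


def answer_cloze(prompt: str, candidates: list[str]) -> str:
    """Pick the candidate completion the model finds most likely."""
    return max(candidates, key=lambda c: completion_logprob(prompt, c))


# Hypothetical cloze item in the spirit of a cloze-style benchmark.
print(answer_cloze("The capital of France is", [" Paris", " London", " Berlin"]))
```

A model fine-tuned on a synthetic language could be dropped into `answer_cloze` unchanged, so per-domain evaluation reduces to counting how often the highest-likelihood candidate is the correct one.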
Related papers
- Parameterized Synthetic Text Generation with SimpleStories [0.9503060186757711]
We present a large synthetic story dataset in simple language, consisting of 2 million samples each in English and Japanese. Through parameterizing prompts at multiple levels of abstraction, we achieve control over story characteristics at scale. We open-source all constituent parts of model creation, hoping to enable novel ways to study the end-to-end training process.
arXiv Detail & Related papers (2025-04-12T11:44:47Z) - Infusing Prompts with Syntax and Semantics [0.0]
We analyze the effect of directly infusing various kinds of syntactic and semantic information into large language models. We show that linguistic analysis can significantly boost language models, to the point that we have surpassed previous best systems.
arXiv Detail & Related papers (2024-12-08T23:49:38Z) - Learning Phonotactics from Linguistic Informants [54.086544221761486]
Our model iteratively selects or synthesizes a data-point according to one of a range of information-theoretic policies.
We find that the information-theoretic policies that our model uses to select items to query the informant achieve sample efficiency comparable to, or greater than, fully supervised approaches.
arXiv Detail & Related papers (2024-05-08T00:18:56Z) - BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and
Semantic Parsing [55.058258437125524]
We introduce BenchCLAMP, a Benchmark to evaluate Constrained LAnguage Model Parsing.
We benchmark eight language models, including two GPT-3 variants available only through an API.
Our experiments show that encoder-decoder pretrained language models can achieve similar performance or surpass state-of-the-art methods for syntactic and semantic parsing when the model output is constrained to be valid.
arXiv Detail & Related papers (2022-06-21T18:34:11Z) - An Application of Pseudo-Log-Likelihoods to Natural Language Scoring [5.382454613390483]
A language model with relatively few parameters and training steps can outperform a larger model on a recent large data set.
We produce some absolute state-of-the-art results for common sense reasoning in binary choice tasks.
We argue that robustness of the smaller model ought to be understood in terms of compositionality.
arXiv Detail & Related papers (2022-01-23T22:00:54Z) - Continual Learning in Multilingual NMT via Language-Specific Embeddings [92.91823064720232]
The proposed approach consists of replacing the shared vocabulary with a small language-specific vocabulary and fine-tuning only the new embeddings on the new language's parallel data (a minimal sketch of this freeze-and-train pattern appears after this list).
Because the parameters of the original model are not modified, its performance on the initial languages does not degrade.
arXiv Detail & Related papers (2021-10-20T10:38:57Z) - Syntactic Persistence in Language Models: Priming as a Window into
Abstract Language Representations [0.38498574327875945]
We investigate the extent to which modern, neural language models are susceptible to syntactic priming.
We introduce a novel metric and release Prime-LM, a large corpus where we control for various linguistic factors which interact with priming strength.
We report surprisingly strong priming effects when priming with multiple sentences, each with different words and meaning but with identical syntactic structure.
arXiv Detail & Related papers (2021-09-30T10:38:38Z) - Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this distribution from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z) - Constrained Language Models Yield Few-Shot Semantic Parsers [73.50960967598654]
We explore the use of large pretrained language models as few-shot semantic parsers.
The goal in semantic parsing is to generate a structured meaning representation given a natural language input.
We use language models to paraphrase inputs into a controlled sublanguage resembling English that can be automatically mapped to a target meaning representation.
arXiv Detail & Related papers (2021-04-18T08:13:06Z) - Comparison of Interactive Knowledge Base Spelling Correction Models for
Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work presents a comparison of a neural model and character language models with varying amounts of target language data.
Our usage scenario is interactive correction with nearly zero training examples, improving the models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z) - Learning Music Helps You Read: Using Transfer to Study Linguistic
Structure in Language Models [27.91397366776451]
Training LSTMs on latent structure (MIDI music or Java code) improves test performance on natural language.
Experiments on transfer between natural languages controlling for vocabulary overlap show that zero-shot performance on a test language is highly correlated with typological similarity to the training language.
arXiv Detail & Related papers (2020-04-30T06:24:03Z)
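For the language-specific-embedding entry above (Continual Learning in Multilingual NMT via Language-Specific Embeddings), the minimal sketch below illustrates the freeze-and-train pattern that summary describes: all original parameters are frozen and only a newly added embedding table is optimized. The layer sizes, the stand-in encoder, and the training objective are illustrative assumptions, not the cited paper's implementation.

```python
# A minimal sketch, assuming PyTorch: continual learning by freezing every
# original parameter and training only a new language-specific embedding table.
# Sizes, the stand-in encoder, and the loss are illustrative assumptions.
import torch
import torch.nn as nn

d_model, new_vocab = 512, 8000

# Stand-in for the pretrained model (a real NMT system would be encoder-decoder).
pretrained = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True),
    num_layers=2,
)
for p in pretrained.parameters():
    p.requires_grad = False  # original parameters are never updated

# The new language's small vocabulary gets its own trainable embedding table.
new_embeddings = nn.Embedding(new_vocab, d_model)
optimizer = torch.optim.Adam(new_embeddings.parameters(), lr=1e-4)

# One illustrative step on a batch of (random) new-language token ids.
tokens = torch.randint(0, new_vocab, (4, 16))
hidden = pretrained(new_embeddings(tokens))  # frozen layers, trainable embeddings
loss = hidden.pow(2).mean()                  # placeholder objective, not a real NMT loss
loss.backward()
optimizer.step()
```

Because gradients flow only into the new embedding table, the original model's behavior on its initial languages is left untouched, which matches the claim in that entry's summary.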
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.