Mean BERTs make erratic language teachers: the effectiveness of latent
bootstrapping in low-resource settings
- URL: http://arxiv.org/abs/2310.19420v1
- Date: Mon, 30 Oct 2023 10:31:32 GMT
- Title: Mean BERTs make erratic language teachers: the effectiveness of latent
bootstrapping in low-resource settings
- Authors: David Samuel
- Abstract summary: Latent bootstrapping is an alternative self-supervision technique for pretraining language models.
We conduct experiments to assess how effective this approach is for acquiring linguistic knowledge from limited resources.
- Score: 5.121744234312891
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores the use of latent bootstrapping, an alternative
self-supervision technique, for pretraining language models. Unlike the typical
practice of using self-supervision on discrete subwords, latent bootstrapping
leverages contextualized embeddings for a richer supervision signal. We conduct
experiments to assess how effective this approach is for acquiring linguistic
knowledge from limited resources. Specifically, our experiments are based on
the BabyLM shared task, which includes pretraining on two small curated corpora
and an evaluation on four linguistic benchmarks.
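The core idea can be sketched in a few lines: instead of predicting discrete subword targets, a student encoder regresses onto the contextualized embeddings of a "mean" teacher whose weights are an exponential moving average of the student's. The toy encoder, dimensions, masking rate, and learning rate below are all illustrative assumptions for a minimal numpy sketch, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "encoder": a single linear map standing in for a deep transformer.
# All dimensions here are hypothetical.
dim_in, dim_out, seq_len = 8, 4, 6
student_W = rng.normal(scale=0.1, size=(dim_in, dim_out))
teacher_W = student_W.copy()   # teacher starts as a copy of the student
ema_decay = 0.99               # teacher tracks an exponential moving average

def encode(W, x):
    return x @ W               # stand-in for producing contextual embeddings

for step in range(100):
    x = rng.normal(size=(seq_len, dim_in))   # a toy "sentence"
    mask = rng.random(seq_len) < 0.15        # mask ~15% of positions
    x_masked = x.copy()
    x_masked[mask] = 0.0                     # zero out masked inputs

    target = encode(teacher_W, x)            # teacher sees the full input
    pred = encode(student_W, x_masked)       # student sees the masked input

    # Latent bootstrapping: regress student latents onto the teacher's
    # latents at the masked positions only (MSE gradient, plain SGD).
    if mask.any():
        grad = 2 * x_masked[mask].T @ (pred[mask] - target[mask]) / mask.sum()
        student_W -= 0.01 * grad

    # "Mean BERT": teacher weights are an EMA of student weights.
    teacher_W = ema_decay * teacher_W + (1 - ema_decay) * student_W
```

Because the supervision signal is a continuous embedding rather than a token distribution, the teacher's drift directly shapes the targets, which is one intuition for why an EMA teacher can be an "erratic" supervisor in low-resource pretraining.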
Related papers
- The Promises and Pitfalls of Using Language Models to Measure Instruction Quality in Education [3.967610895056427]
This paper presents the first study that leverages Natural Language Processing (NLP) techniques to assess multiple high-inference instructional practices.
We confront two challenges inherent in NLP-based instructional analysis: noisy, long input data and highly skewed distributions of human ratings.
arXiv Detail & Related papers (2024-04-03T04:15:29Z)
- Little Giants: Exploring the Potential of Small LLMs as Evaluation Metrics in Summarization in the Eval4NLP 2023 Shared Task [53.163534619649866]
This paper focuses on assessing the effectiveness of prompt-based techniques to empower Large Language Models to handle the task of quality estimation.
We conducted systematic experiments with various prompting techniques, including standard prompting, prompts informed by annotator instructions, and innovative chain-of-thought prompting.
Our work reveals that combining these approaches using a "small", open source model (orca_mini_v3_7B) yields competitive results.
arXiv Detail & Related papers (2023-11-01T17:44:35Z)
- Unsupervised Alignment of Distributional Word Embeddings [0.0]
Cross-domain alignment plays a key role in tasks ranging from machine translation to transfer learning.
We show that the proposed approach achieves good performance on the bilingual lexicon induction task across several language pairs.
arXiv Detail & Related papers (2022-03-09T16:39:06Z)
- IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages [87.5457337866383]
We introduce the Image-Grounded Language Understanding Evaluation benchmark.
IGLUE brings together visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages.
We find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks.
arXiv Detail & Related papers (2022-01-27T18:53:22Z)
- Skill Induction and Planning with Latent Language [94.55783888325165]
We formulate a generative model of action sequences in which goals generate sequences of high-level subtask descriptions.
We describe how to train this model using primarily unannotated demonstrations by parsing demonstrations into sequences of named high-level subtasks.
In trained models, the space of natural language commands indexes a library of skills; agents can use these skills to plan by generating high-level instruction sequences tailored to novel goals.
arXiv Detail & Related papers (2021-10-04T15:36:32Z)
- Knowledge-Rich BERT Embeddings for Readability Assessment [0.0]
We propose an alternative way of utilizing the information-rich embeddings of BERT models through a joint-learning method.
Results show that the proposed method outperforms classical approaches in readability assessment using English and Filipino datasets.
arXiv Detail & Related papers (2021-06-15T07:37:48Z)
- SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
- Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection [80.68446022994492]
In this work, we explore the unsupervised learning paradigm which can potentially work with unlabeled text corpora.
Our model builds upon the recent work on Noisy Student Training, a semi-supervised learning approach that extends the idea of self-training.
arXiv Detail & Related papers (2020-10-29T05:29:26Z)
- Building Low-Resource NER Models Using Non-Speaker Annotation [58.78968578460793]
Cross-lingual methods have had notable success in addressing these concerns.
We propose a complementary approach to building low-resource Named Entity Recognition (NER) models using "non-speaker" (NS) annotations.
We show that use of NS annotators produces results that are consistently on par or better than cross-lingual methods built on modern contextual representations.
arXiv Detail & Related papers (2020-06-17T03:24:38Z)
- Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT [29.04485839262945]
We propose a parameter-free probing technique for analyzing pre-trained language models (e.g., BERT).
Our method does not require direct supervision from the probing tasks, nor do we introduce additional parameters to the probing process.
Our experiments on BERT show that syntactic trees recovered from BERT using our method are significantly better than linguistically-uninformed baselines.
arXiv Detail & Related papers (2020-04-30T14:02:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.