Mean BERTs make erratic language teachers: the effectiveness of latent
bootstrapping in low-resource settings
- URL: http://arxiv.org/abs/2310.19420v1
- Date: Mon, 30 Oct 2023 10:31:32 GMT
- Title: Mean BERTs make erratic language teachers: the effectiveness of latent
bootstrapping in low-resource settings
- Authors: David Samuel
- Abstract summary: Latent bootstrapping is an alternative self-supervision technique for pretraining language models.
We conduct experiments to assess how effective this approach is for acquiring linguistic knowledge from limited resources.
- Score: 5.121744234312891
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores the use of latent bootstrapping, an alternative
self-supervision technique, for pretraining language models. Unlike the typical
practice of using self-supervision on discrete subwords, latent bootstrapping
leverages contextualized embeddings for a richer supervision signal. We conduct
experiments to assess how effective this approach is for acquiring linguistic
knowledge from limited resources. Specifically, our experiments are based on
the BabyLM shared task, which includes pretraining on two small curated corpora
and an evaluation on four linguistic benchmarks.
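The core idea can be sketched in a few lines: instead of predicting discrete subword targets, a student encoder regresses onto the contextualized embeddings of a "mean" teacher whose weights are an exponential moving average of the student's. The toy encoder, dimensions, masking rate, and learning rate below are all illustrative assumptions for a minimal numpy sketch, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "encoder": a single linear map standing in for a deep transformer.
# All dimensions here are hypothetical.
dim_in, dim_out, seq_len = 8, 4, 6
student_W = rng.normal(scale=0.1, size=(dim_in, dim_out))
teacher_W = student_W.copy()   # teacher starts as a copy of the student
ema_decay = 0.99               # teacher tracks an exponential moving average

def encode(W, x):
    return x @ W               # stand-in for producing contextual embeddings

for step in range(100):
    x = rng.normal(size=(seq_len, dim_in))   # a toy "sentence"
    mask = rng.random(seq_len) < 0.15        # mask ~15% of positions
    x_masked = x.copy()
    x_masked[mask] = 0.0                     # zero out masked inputs

    target = encode(teacher_W, x)            # teacher sees the full input
    pred = encode(student_W, x_masked)       # student sees the masked input

    # Latent bootstrapping: regress student latents onto the teacher's
    # latents at the masked positions only (MSE gradient, plain SGD).
    if mask.any():
        grad = 2 * x_masked[mask].T @ (pred[mask] - target[mask]) / mask.sum()
        student_W -= 0.01 * grad

    # "Mean BERT": teacher weights are an EMA of student weights.
    teacher_W = ema_decay * teacher_W + (1 - ema_decay) * student_W
```

Because the supervision signal is a continuous embedding rather than a token distribution, the teacher's drift directly shapes the targets, which is one intuition for why an EMA teacher can be an "erratic" supervisor in low-resource pretraining.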
Related papers
- The Promises and Pitfalls of Using Language Models to Measure Instruction Quality in Education [3.967610895056427]
This paper presents the first study that leverages Natural Language Processing (NLP) techniques to assess multiple high-inference instructional practices.
We confront two challenges inherent in NLP-based instructional analysis: noisy, long input data and highly skewed distributions of human ratings.
arXiv Detail & Related papers (2024-04-03T04:15:29Z)
- Little Giants: Exploring the Potential of Small LLMs as Evaluation Metrics in Summarization in the Eval4NLP 2023 Shared Task [53.163534619649866]
This paper focuses on assessing the effectiveness of prompt-based techniques to empower Large Language Models to handle the task of quality estimation.
We conducted systematic experiments with various prompting techniques, including standard prompting, prompts informed by annotator instructions, and innovative chain-of-thought prompting.
Our work reveals that combining these approaches using a "small", open source model (orca_mini_v3_7B) yields competitive results.
arXiv Detail & Related papers (2023-11-01T17:44:35Z)
- Unsupervised Alignment of Distributional Word Embeddings [0.0]
Cross-domain alignment plays a key role in tasks ranging from machine translation to transfer learning.
We show that the proposed approach achieves good performance on the bilingual lexicon induction task across several language pairs.
arXiv Detail & Related papers (2022-03-09T16:39:06Z)
- IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages [87.5457337866383]
We introduce the Image-Grounded Language Understanding Evaluation benchmark.
IGLUE brings together visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages.
We find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks.
arXiv Detail & Related papers (2022-01-27T18:53:22Z)
- Skill Induction and Planning with Latent Language [94.55783888325165]
We formulate a generative model of action sequences in which goals generate sequences of high-level subtask descriptions.
We describe how to train this model using primarily unannotated demonstrations by parsing demonstrations into sequences of named high-level subtasks.
In trained models, the space of natural language commands indexes a library of skills; agents can use these skills to plan by generating high-level instruction sequences tailored to novel goals.
arXiv Detail & Related papers (2021-10-04T15:36:32Z)
- Knowledge-Rich BERT Embeddings for Readability Assessment [0.0]
We propose an alternative way of utilizing the information-rich embeddings of BERT models through a joint-learning method.
Results show that the proposed method outperforms classical approaches in readability assessment using English and Filipino datasets.
arXiv Detail & Related papers (2021-06-15T07:37:48Z)
- SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
- Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection [80.68446022994492]
In this work, we explore the unsupervised learning paradigm which can potentially work with unlabeled text corpora.
Our model builds upon the recent work on Noisy Student Training, a semi-supervised learning approach that extends the idea of self-training.
arXiv Detail & Related papers (2020-10-29T05:29:26Z)
- Building Low-Resource NER Models Using Non-Speaker Annotation [58.78968578460793]
Cross-lingual methods have had notable success in addressing these concerns.
We propose a complementary approach to building low-resource Named Entity Recognition (NER) models using "non-speaker" (NS) annotations.
We show that use of NS annotators produces results that are consistently on par or better than cross-lingual methods built on modern contextual representations.
arXiv Detail & Related papers (2020-06-17T03:24:38Z)
- Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT [29.04485839262945]
We propose a parameter-free probing technique for analyzing pre-trained language models (e.g., BERT).
Our method does not require direct supervision from the probing tasks, nor do we introduce additional parameters to the probing process.
Our experiments on BERT show that syntactic trees recovered from BERT using our method are significantly better than linguistically-uninformed baselines.
arXiv Detail & Related papers (2020-04-30T14:02:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.