Does the Objective Matter? Comparing Training Objectives for Pronoun
Resolution
- URL: http://arxiv.org/abs/2010.02570v1
- Date: Tue, 6 Oct 2020 09:29:51 GMT
- Title: Does the Objective Matter? Comparing Training Objectives for Pronoun
Resolution
- Authors: Yordan Yordanov, Oana-Maria Camburu, Vid Kocijan, Thomas Lukasiewicz
- Abstract summary: We compare the performance and seed-wise stability of four models, each representing one of four categories of training objectives.
Our experiments show that the objective of sequence ranking performs the best in-domain, while the objective of semantic similarity between candidates and pronoun performs the best out-of-domain.
- Score: 52.94024891473669
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hard cases of pronoun resolution have been used as a long-standing benchmark
for commonsense reasoning. In the recent literature, pre-trained language
models have been used to obtain state-of-the-art results on pronoun resolution.
Overall, four categories of training and evaluation objectives have been
introduced. The variety of training datasets and pre-trained language models
used in these works makes it unclear whether the choice of training objective
is critical. In this work, we make a fair comparison of the performance and
seed-wise stability of four models that represent the four categories of
objectives. Our experiments show that the objective of sequence ranking
performs the best in-domain, while the objective of semantic similarity between
candidates and pronoun performs the best out-of-domain. We also observe a
seed-wise instability of the model using sequence ranking, which is not the
case when the other objectives are used.
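To make the contrast between the two strongest objectives concrete, here is a minimal sketch (assuming PyTorch; the encoder is mocked with random vectors and every name is illustrative rather than taken from the paper) of a margin-based sequence-ranking loss over candidate substitutions versus a semantic-similarity loss between candidate embeddings and the pronoun's contextual embedding:

```python
# Minimal sketch of two of the compared objectives on one Winograd-style
# example. The LM encoder is mocked; this is not the authors' implementation.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim = 768

# "The trophy doesn't fit in the suitcase because it is too big."
candidates = ["the trophy", "the suitcase"]  # substitutions for "it"
correct_idx = 0

# Stand-ins for encoder outputs: one embedding per candidate-substituted
# sentence, plus a contextual embedding of the pronoun itself.
candidate_emb = torch.randn(len(candidates), dim, requires_grad=True)
pronoun_emb = torch.randn(dim)

# 1) Sequence ranking: a scoring head should rank the sentence with the
#    correct substitution above every incorrect one (margin ranking loss).
scores = candidate_emb.mean(dim=1)  # stand-in for a learned scoring head
wrong = torch.cat([scores[:correct_idx], scores[correct_idx + 1:]])
rank_loss = F.margin_ranking_loss(
    scores[correct_idx].expand_as(wrong), wrong,
    target=torch.ones_like(wrong), margin=1.0)

# 2) Candidate-pronoun semantic similarity: push the correct candidate's
#    embedding toward the pronoun's contextual embedding.
sims = F.cosine_similarity(candidate_emb, pronoun_emb.unsqueeze(0))
sim_loss = F.cross_entropy(sims.unsqueeze(0), torch.tensor([correct_idx]))

print(f"ranking loss: {rank_loss.item():.4f}, similarity loss: {sim_loss.item():.4f}")
```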
Related papers
- Integrating Self-supervised Speech Model with Pseudo Word-level Targets from Visually-grounded Speech Model [57.78191634042409]
We propose Pseudo-Word HuBERT (PW-HuBERT), a framework that integrates pseudo word-level targets into the training process.
Our experimental results on four spoken language understanding (SLU) benchmarks suggest the superiority of our model in capturing semantic information.
arXiv Detail & Related papers (2024-02-08T16:55:21Z)
- Forging Multiple Training Objectives for Pre-trained Language Models via Meta-Learning [97.28779163988833]
Multiple pre-training objectives compensate for the limited understanding capability of single-objective language modeling.
We propose MOMETAS, a novel adaptive sampler based on meta-learning, which learns the latent sampling pattern over arbitrary pre-training objectives.
arXiv Detail & Related papers (2022-10-19T04:38:26Z)
- Sentence Representation Learning with Generative Objective rather than Contrastive Objective [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative learning achieves substantial performance improvements and outperforms the current state-of-the-art contrastive methods.
arXiv Detail & Related papers (2022-10-16T07:47:46Z)
- A General Language Assistant as a Laboratory for Alignment [3.3598752405752106]
We study simple baseline techniques and evaluations, such as prompting.
We find that the benefits from modest interventions increase with model size, generalize to a variety of alignment evaluations, and do not compromise the performance of large models.
We study a 'preference model pre-training' stage of training, with the goal of improving sample efficiency when fine-tuning on human preferences.
arXiv Detail & Related papers (2021-12-01T22:24:34Z)
- A Brief Study on the Effects of Training Generative Dialogue Models with a Semantic loss [37.8626106992769]
We study the effects of minimizing an alternate training objective that encourages a model to generate alternate responses and score them on semantic similarity.
We explore this idea on two datasets of different sizes on the task of next-utterance generation in goal-oriented dialogues.
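As a rough illustration of what such a semantic loss can look like (one common formulation, not necessarily the paper's exact one), consider the cosine distance between mean-pooled embeddings of the generated and reference responses:

```python
# Semantic loss sketch: penalize cosine distance between pooled embeddings
# of the generated and reference responses, rather than token overlap.
# Random tensors stand in for real sentence-encoder outputs.
import torch
import torch.nn.functional as F

def semantic_loss(generated_emb: torch.Tensor, reference_emb: torch.Tensor) -> torch.Tensor:
    gen = generated_emb.mean(dim=0)  # (seq_len, dim) -> (dim,)
    ref = reference_emb.mean(dim=0)
    return 1.0 - F.cosine_similarity(gen, ref, dim=0)

loss = semantic_loss(torch.randn(12, 256), torch.randn(9, 256))
```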
arXiv Detail & Related papers (2021-06-20T04:39:29Z)
- Discrete representations in neural models of spoken language [56.29049879393466]
We compare the merits of four commonly used metrics in the context of weakly supervised models of spoken language.
We find that the different evaluation metrics can give inconsistent results.
arXiv Detail & Related papers (2021-05-12T11:02:02Z)
- Evaluating Text Coherence at Sentence and Paragraph Levels [17.99797111176988]
We investigate the adaptation of existing sentence ordering methods to a paragraph ordering task.
We also compare the learnability and robustness of existing models by artificially creating mini datasets and noisy datasets.
We conclude that the recurrent graph neural network-based model is an optimal choice for coherence modeling.
arXiv Detail & Related papers (2020-06-05T03:31:49Z)
- Better Captioning with Sequence-Level Exploration [60.57850194028581]
We show the limitation of the current sequence-level learning objective for captioning tasks.
In theory, we show that the current objective is equivalent to only optimizing the precision side of the caption set.
Empirical results show that models trained with this objective tend to score lower on the recall side.
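To see why optimizing only the precision side can hurt recall, here is a toy set-level computation (exact-string matching is used purely to keep the example small; the paper's analysis is more general):

```python
# Toy set-level precision vs. recall for caption sets; treat as an analogy.
def caption_set_metrics(generated: set, references: set) -> tuple:
    hits = generated & references
    precision = len(hits) / len(generated)  # are the generated captions correct?
    recall = len(hits) / len(references)    # are all reference captions covered?
    return precision, recall

# One safe caption yields perfect precision but poor recall:
p, r = caption_set_metrics({"a dog runs"},
                           {"a dog runs", "a dog sprints", "a running dog"})
print(p, r)  # 1.0 0.333...
```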
arXiv Detail & Related papers (2020-03-08T09:08:03Z)
- VSEC-LDA: Boosting Topic Modeling with Embedded Vocabulary Selection [20.921010767231923]
We propose a new approach to topic modeling, termed Vocabulary-Selection-Embedded Correspondence-LDA (VSEC-LDA).
VSEC-LDA learns the latent model while simultaneously selecting the most relevant words.
The selection of words is driven by an entropy-based metric that measures the relative contribution of the words to the underlying model.
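The exact metric is not spelled out in this summary, so the sketch below is only one plausible reading: score each word by how concentrated its topic distribution is, since a low-entropy (topic-specific) word contributes more to the model than a near-uniform one.

```python
# Hypothetical entropy-based word-relevance score for vocabulary selection.
# This illustrates the idea behind VSEC-LDA's metric, not its exact form.
import numpy as np

def word_relevance(topic_word: np.ndarray) -> np.ndarray:
    """topic_word: (num_topics, vocab_size) matrix of word weights per topic."""
    p = topic_word / topic_word.sum(axis=0, keepdims=True)  # p(topic | word)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=0)          # per-word entropy
    return 1.0 - entropy / np.log(topic_word.shape[0])      # 1 = topic-specific

relevance = word_relevance(np.random.dirichlet(np.ones(50), size=10))  # 10 topics
```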
arXiv Detail & Related papers (2020-01-15T22:16:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.