Rethinking Self-Supervision Objectives for Generalizable Coherence
Modeling
- URL: http://arxiv.org/abs/2110.07198v1
- Date: Thu, 14 Oct 2021 07:44:14 GMT
- Title: Rethinking Self-Supervision Objectives for Generalizable Coherence
Modeling
- Authors: Prathyusha Jwalapuram, Shafiq Joty and Xiang Lin
- Abstract summary: Coherence evaluation of machine generated text is one of the principal applications of coherence models that needs to be investigated.
We explore training data and self-supervision objectives that result in a model that generalizes well across tasks.
We show empirically that increasing the density of negative samples improves the basic model, and using a global negative queue further improves and stabilizes the model while training with hard negative samples.
- Score: 8.329870357145927
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although large-scale pre-trained neural models have shown impressive
performances in a variety of tasks, their ability to generate coherent text
that appropriately models discourse phenomena is harder to evaluate and less
understood. Given the claims of improved text generation quality across various
systems, we consider the coherence evaluation of machine generated text to be
one of the principal applications of coherence models that needs to be
investigated. We explore training data and self-supervision objectives that
result in a model that generalizes well across tasks and can be used
off-the-shelf to perform such evaluations. Prior work in neural coherence
modeling has primarily focused on devising new architectures, and has trained the
model to distinguish coherent and incoherent text through pairwise
self-supervision on the permuted documents task. We instead use a basic model
architecture and show significant improvements over state of the art within the
same training regime. We then design a harder self-supervision objective by
increasing the ratio of negative samples within a contrastive learning setup,
and enhance the model further through automatic hard negative mining coupled
with a large global negative queue encoded by a momentum encoder. We show
empirically that increasing the density of negative samples improves the basic
model, and using a global negative queue further improves and stabilizes the
model while training with hard negative samples. We evaluate the coherence
model on task-independent test sets that resemble real-world use cases and show
significant improvements in coherence evaluations of downstream applications.
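The training setup described in the abstract (permuted-document negatives scored against the original document, with more negatives per positive making the objective harder) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names `permuted_negatives` and `contrastive_loss` are hypothetical, the softmax-style loss is a common contrastive formulation rather than the exact loss from the paper, and the scores are plain numbers standing in for the output of a coherence model. Hard negative mining and the momentum-encoded global queue are not shown.

```python
import math
import random


def permuted_negatives(sentences, num_negatives, seed=0):
    """Build incoherent variants of a document by shuffling its sentences.

    Returns up to `num_negatives` distinct orderings, none identical to the
    original order (hypothetical helper, not taken from the paper).
    """
    rng = random.Random(seed)
    original = tuple(sentences)
    negatives = set()
    attempts = 0
    max_attempts = 100 * num_negatives  # guard against very short documents
    while len(negatives) < num_negatives and attempts < max_attempts:
        attempts += 1
        candidate = list(sentences)
        rng.shuffle(candidate)
        candidate = tuple(candidate)
        if candidate != original:
            negatives.add(candidate)
    # Sort so the result does not depend on set iteration order.
    return [list(n) for n in sorted(negatives)]


def contrastive_loss(pos_score, neg_scores, temperature=1.0):
    """Softmax-style contrastive loss over one positive and many negatives.

    Computes -log(exp(s+) / (exp(s+) + sum_i exp(s_i-))) in a numerically
    stable way. This form is an assumption, not the paper's definition.
    """
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[0]


if __name__ == "__main__":
    doc = ["A storm hit the coast.", "Roads were flooded.",
           "Crews arrived overnight.", "Power was restored by morning."]
    negs = permuted_negatives(doc, num_negatives=5)
    # Placeholder scores: a real coherence model would produce these.
    print(len(negs), contrastive_loss(2.0, [0.5] * len(negs)))
```

With the positive score held fixed, adding more negatives at the same score raises the loss, which is one way to read the abstract's claim that a higher ratio of negative samples gives a harder objective.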
Related papers
- Corpus Considerations for Annotator Modeling and Scaling [9.263562546969695]
We show that the commonly used user token model consistently outperforms more complex models.
Our findings shed light on the relationship between corpus statistics and annotator modeling performance.
arXiv Detail & Related papers (2024-04-02T22:27:24Z)
- Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study [61.65123150513683]
Multimodal foundation models, such as CLIP, produce state-of-the-art zero-shot results.
It is reported that these models close the robustness gap by matching the performance of supervised models trained on ImageNet.
We show that CLIP leads to a significant robustness drop compared to supervised ImageNet models on our benchmark.
arXiv Detail & Related papers (2024-03-15T17:33:49Z)
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15% points relative.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Benchmarking the Robustness of Instance Segmentation Models [3.1287804585804073]
This paper presents a comprehensive evaluation of instance segmentation models with respect to real-world image corruptions and out-of-domain image collections.
The out-of-domain image evaluation shows the generalization capability of models, an essential aspect of real-world applications.
Specifically, this benchmark study includes state-of-the-art network architectures, network backbones, normalization layers, models trained starting from scratch or ImageNet pretrained networks.
arXiv Detail & Related papers (2021-09-02T17:50:07Z)
- Mean Embeddings with Test-Time Data Augmentation for Ensembling of Representations [8.336315962271396]
We look at the ensembling of representations and propose mean embeddings with test-time augmentation (MeTTA).
MeTTA significantly boosts the quality of linear evaluation on ImageNet for both supervised and self-supervised models.
We believe that spreading the success of ensembles to inference higher-quality representations is the important step that will open many new applications of ensembling.
arXiv Detail & Related papers (2021-06-15T10:49:46Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way to achieve this is through artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- On the model-based stochastic value gradient for continuous reinforcement learning [50.085645237597056]
We show that simple model-based agents can outperform state-of-the-art model-free agents in terms of both sample-efficiency and final reward.
Our findings suggest that model-based policy evaluation deserves closer attention.
arXiv Detail & Related papers (2020-08-28T17:58:29Z)
- Rethinking Coherence Modeling: Synthetic vs. Downstream Tasks [15.044192886215887]
Coherence models are typically evaluated only on synthetic tasks, which may not be representative of their performance in downstream applications.
We conduct experiments on benchmarking well-known traditional and neural coherence models on synthetic sentence ordering tasks.
Our results demonstrate a weak correlation between the model performances in the synthetic tasks and the downstream applications.
arXiv Detail & Related papers (2020-04-30T08:00:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.