Training BERT Models to Carry Over a Coding System Developed on One Corpus to Another
- URL: http://arxiv.org/abs/2308.03742v2
- Date: Tue, 26 Mar 2024 16:03:57 GMT
- Title: Training BERT Models to Carry Over a Coding System Developed on One Corpus to Another
- Authors: Dalma Galambos, Pál Zsámboki
- Abstract summary: This paper describes how we train BERT models to carry over a coding system developed on the paragraphs of a Hungarian literary journal to another.
The aim of the coding system is to track trends in the perception of literary translation around the political transformation in 1989 in Hungary.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper describes how we train BERT models to carry over a coding system developed on the paragraphs of one Hungarian literary journal to another. The aim of the coding system is to track trends in the perception of literary translation around the political transformation of 1989 in Hungary. To evaluate not only task performance but also the consistency of the annotation, and to get better predictions from an ensemble, we use 10-fold cross-validation. Extensive hyperparameter tuning is used to obtain the best possible results and fair comparisons. To handle label imbalance, we use loss functions and metrics that are robust to it. We evaluate the effect of domain shift by sampling a test set from the target domain, establishing the sample size by estimating the bootstrapped confidence interval via simulations. This way, we show that our models can carry over one annotation system to the target domain. Comparisons provide insights such as that learning multilabel correlations and a confidence penalty improve resistance to domain shift, and that domain adaptation on OCR-ed text from another domain improves performance almost to the same extent as adaptation on the corpus under study. See our code at https://codeberg.org/zsamboki/bert-annotator-ensemble.
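The abstract names two robustness ingredients without spelling them out: an imbalance-robust loss and a confidence penalty. As a minimal sketch only (the paper's exact loss lives in the linked repository and is not reproduced here), a multilabel focal loss combined with an entropy-based confidence penalty in PyTorch could look like this; all function and parameter names are hypothetical:

```python
# Illustrative sketch: focal loss is one common imbalance-robust choice,
# and the entropy bonus is the confidence penalty of Pereyra et al. (2017).
# Not taken from the authors' repository.
import torch
import torch.nn.functional as F

def focal_loss_with_confidence_penalty(
    logits: torch.Tensor,    # (batch, num_labels) raw model outputs
    targets: torch.Tensor,   # (batch, num_labels) multi-hot labels in {0, 1}
    gamma: float = 2.0,      # focusing parameter: down-weights easy examples
    beta: float = 0.1,       # weight of the confidence penalty
) -> torch.Tensor:
    targets = targets.float()
    probs = torch.sigmoid(logits)
    # Standard binary cross-entropy, kept per label.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    # Focal modulation: p_t is the probability assigned to the true class.
    p_t = probs * targets + (1.0 - probs) * (1.0 - targets)
    focal = (1.0 - p_t) ** gamma * bce
    # Confidence penalty: subtract the per-label Bernoulli entropy, so the
    # model is rewarded for not being overconfident.
    entropy = -(probs * torch.log(probs.clamp_min(1e-8))
                + (1.0 - probs) * torch.log((1.0 - probs).clamp_min(1e-8)))
    return (focal - beta * entropy).mean()
```

Focal loss down-weights easy, frequent labels, while the entropy bonus discourages the overconfident predictions that tend to transfer poorly across domains.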
Related papers
- DaMSTF: Domain Adversarial Learning Enhanced Meta Self-Training for Domain Adaptation (arXiv 2023-08-05)
We propose a new self-training framework for domain adaptation, namely the Domain adversarial learning enhanced Meta Self-Training Framework (DaMSTF).
DaMSTF involves meta-learning to estimate the importance of each pseudo instance, so as to simultaneously reduce label noise and preserve hard examples.
DaMSTF improves the performance of BERT by nearly 4% on average.
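As an illustrative aside, the DaMSTF entry above centres on weighting pseudo-instances during self-training. Below is a minimal sketch of the surrounding loop, with teacher confidence standing in for DaMSTF's meta-learned importance weights, so this is a generic simplification rather than the paper's method:

```python
# Generic self-training step; names are hypothetical. DaMSTF would estimate
# the per-instance weights via meta-learning instead of raw confidence.
import torch
import torch.nn.functional as F

def self_training_step(student, teacher, unlabeled_batch, optimizer):
    with torch.no_grad():
        teacher_logits = teacher(unlabeled_batch)
        probs = F.softmax(teacher_logits, dim=-1)
        confidence, pseudo_labels = probs.max(dim=-1)
    student_logits = student(unlabeled_batch)
    # Weight each pseudo-instance's loss by how much we trust its label.
    per_example = F.cross_entropy(student_logits, pseudo_labels,
                                  reduction="none")
    loss = (confidence * per_example).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```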
- Bridging the Domain Gaps in Context Representations for k-Nearest Neighbor Neural Machine Translation (arXiv 2023-05-26)
$k$-Nearest neighbor machine translation ($k$NN-MT) has attracted increasing attention due to its ability to non-parametrically adapt to new translation domains.
We propose a novel approach to boost the datastore retrieval of $k$NN-MT by reconstructing the original datastore.
Our method can effectively boost the datastore retrieval and translation quality of $k$NN-MT.
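For context, the entry above builds on the standard $k$NN-MT mechanism: a datastore maps decoder hidden states to target tokens, and retrieved neighbours are interpolated with the model's own distribution. A minimal NumPy sketch of that base mechanism follows; it does not reproduce the paper's datastore reconstruction, and all names are hypothetical:

```python
# Base kNN-MT lookup (Khandelwal et al., 2021 style), for illustration only.
import numpy as np

def knn_distribution(query, keys, values, vocab_size, k=8, temperature=10.0):
    # keys: (n, d) stored decoder hidden states; values: (n,) target token ids.
    dists = np.sum((keys - query) ** 2, axis=1)
    nearest = np.argsort(dists)[:k]
    weights = np.exp(-dists[nearest] / temperature)
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for idx, w in zip(nearest, weights):
        p_knn[values[idx]] += w
    return p_knn

def interpolate(p_model, p_knn, lam=0.5):
    # Final next-token distribution: (1 - lam) * model + lam * kNN.
    return (1.0 - lam) * p_model + lam * p_knn
```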
- Domain-knowledge Inspired Pseudo Supervision (DIPS) for Unsupervised Image-to-Image Translation Models to Support Cross-Domain Classification (arXiv 2023-03-18)
This paper introduces a new method called Domain-knowledge Inspired Pseudo Supervision (DIPS).
DIPS uses domain-informed Gaussian Mixture Models to generate pseudo annotations to enable the use of traditional supervised metrics.
It proves its effectiveness by outperforming various GAN evaluation metrics, including FID, when selecting the optimal saved checkpoint model.
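A hedged sketch of the pseudo-annotation idea the DIPS entry describes, assuming pre-extracted image features and scikit-learn; the paper's domain-informed construction of the mixture is not reproduced here:

```python
# Fit a Gaussian Mixture Model on source-domain features, then pseudo-label
# translated target-domain features with the most likely component so that
# supervised metrics can be computed. Names are hypothetical.
from sklearn.mixture import GaussianMixture

def pseudo_annotate(source_features, translated_features, n_classes):
    # source_features, translated_features: (n, d) arrays, e.g. CNN embeddings.
    gmm = GaussianMixture(n_components=n_classes, random_state=0)
    gmm.fit(source_features)
    # Pseudo labels: most probable mixture component per translated image.
    return gmm.predict(translated_features)
```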
- Consistent Diffusion Models: Mitigating Sampling Drift by Learning to be Consistent (arXiv 2023-02-17)
We propose to enforce a consistency property, which states that predictions of the model on its own generated data are consistent across time.
We show that our novel training objective yields state-of-the-art results for conditional and unconditional generation on CIFAR-10, and baseline improvements on AFHQ and FFHQ.
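Reading the stated property literally, predictions on the model's own generated data should agree across time steps. A loose sketch under that reading, with `sampler_step` an assumed callable; the paper's actual objective and sampler details may differ:

```python
# Cross-time consistency regularizer for a denoiser, illustration only.
import torch

def consistency_loss(denoiser, x_t, t, s, sampler_step):
    # x_s is produced by the model's OWN sampler from x_t (t -> s, s < t),
    # so the target below is a prediction on model-generated data.
    with torch.no_grad():
        x_s = sampler_step(denoiser, x_t, t, s)
        x0_from_s = denoiser(x_s, s)
    x0_from_t = denoiser(x_t, t)  # prediction at the noisier time step
    return torch.mean((x0_from_t - x0_from_s) ** 2)
```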
- Okapi: Generalising Better by Making Statistical Matches Match (arXiv 2022-11-07)
Okapi is a simple, efficient, and general method for robust semi-supervised learning based on online statistical matching.
Our method uses a nearest-neighbours-based matching procedure to generate cross-domain views for a consistency loss.
We show that it is in fact possible to leverage additional unlabelled data to improve upon empirical risk minimisation.
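A rough sketch of the nearest-neighbour matching plus consistency loss that the Okapi entry describes, with the specifics (distance, loss, architecture) chosen here purely for illustration:

```python
# Match each example to its nearest neighbour from another domain in feature
# space and penalise disagreement between the two predictions.
import torch
import torch.nn.functional as F

def matching_consistency_loss(head, features, other_domain_features):
    # head: classifier over features; features: (n, d); other domain: (m, d).
    d = torch.cdist(features, other_domain_features)  # (n, m) pairwise dists
    nn_idx = d.argmin(dim=1)
    log_p = F.log_softmax(head(features), dim=-1)
    with torch.no_grad():
        q = F.softmax(head(other_domain_features[nn_idx]), dim=-1)
    # Consistency: predictions on matched cross-domain views should agree.
    return F.kl_div(log_p, q, reduction="batchmean")
```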
- Adapting the Mean Teacher for keypoint-based lung registration under geometric domain shifts (arXiv 2022-07-01)
Deep neural networks generally require plenty of labeled training data and are vulnerable to domain shifts between training and test data.
We present a novel approach to geometric domain adaptation for image registration, adapting a model from a labeled source to an unlabeled target domain.
Our method consistently improves on the baseline model by 50%/47% while even matching the accuracy of models trained on target data.
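The Mean Teacher backbone referenced above has a compact core that is safe to sketch: the teacher is an exponential moving average (EMA) of the student's weights, and the student is trained to match teacher predictions on perturbed inputs. The registration-specific machinery of the paper is omitted:

```python
# EMA teacher update (Tarvainen & Valpola, 2017 style).
import torch

@torch.no_grad()
def update_teacher(teacher, student, alpha=0.999):
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        # teacher <- alpha * teacher + (1 - alpha) * student
        t_param.mul_(alpha).add_(s_param, alpha=1.0 - alpha)
```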
- Non-Parametric Domain Adaptation for End-to-End Speech Translation (arXiv 2022-05-23)
End-to-End Speech Translation (E2E-ST) has received increasing attention due to its potential for less error propagation, lower latency, and fewer parameters.
We propose a novel non-parametric method that leverages a domain-specific text translation corpus to achieve domain adaptation for the E2E-ST system.
- Low-confidence Samples Matter for Domain Adaptation (arXiv 2022-02-06)
Domain adaptation (DA) aims to transfer knowledge from a label-rich source domain to a related but label-scarce target domain.
We propose a novel contrastive learning method that works by processing low-confidence samples.
We evaluate the proposed method in both unsupervised and semi-supervised DA settings.
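A small sketch of the selection step implied by the entry above: picking the low-confidence (high-entropy) target samples that the paper's contrastive objective then operates on. The objective itself is not reproduced, and the quantile threshold is an illustrative assumption:

```python
# Entropy-based selection of the most uncertain samples in a batch.
import torch
import torch.nn.functional as F

def low_confidence_mask(logits, quantile=0.8):
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1)
    # Keep the most uncertain fraction of the batch (highest entropy).
    threshold = torch.quantile(entropy, quantile)
    return entropy >= threshold
```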
- Multiple-Source Domain Adaptation via Coordinated Domain Encoders and Paired Classifiers (arXiv 2022-01-28)
We present a novel model for text classification under domain shift.
It exploits updated representations to dynamically integrate domain encoders.
It also employs a probabilistic model to infer the error rate in the target domain.
- Autoencoding Variational Autoencoder (arXiv 2020-12-07)
We study the implications of this behaviour for the learned representations, and the consequences of fixing it by introducing a notion of self-consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
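A toy rendition of the self-consistency idea in the entry above: re-encode the decoder's own output and ask for the same latent code. A real VAE encoder outputs a distribution rather than a point, so this is a deliberate simplification, and the paper's formal notion is richer:

```python
# Self-consistency penalty: encoding the model's own generated data should
# recover the latent code it was generated from.
import torch

def self_consistency_penalty(encoder, decoder, x):
    z = encoder(x)        # latent code for the input
    x_hat = decoder(z)    # the model's own generated data
    z_hat = encoder(x_hat)  # encoding of that generated data
    return torch.mean((z_hat - z) ** 2)
```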
- Document Ranking with a Pretrained Sequence-to-Sequence Model (arXiv 2020-03-14)
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
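The last entry describes a monoT5-style setup in which relevance labels are literal target words. A sketch of inference under common assumptions: the prompt format and the use of the probability of "true" as the score are illustrative, and an off-the-shelf t5-base checkpoint stands in for a fine-tuned ranker:

```python
# Score a query-document pair by the probability of the word "true" as the
# first generated token of a seq2seq model.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def relevance_score(query: str, document: str) -> float:
    text = f"Query: {query} Document: {document} Relevant:"
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    true_id = tokenizer("true", add_special_tokens=False).input_ids[0]
    false_id = tokenizer("false", add_special_tokens=False).input_ids[0]
    # One decoder step from the start token; compare "true" vs "false".
    decoder_input = torch.full((1, 1), model.config.decoder_start_token_id)
    logits = model(**inputs, decoder_input_ids=decoder_input).logits[0, -1]
    pair = torch.softmax(logits[[true_id, false_id]], dim=0)
    return pair[0].item()
```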
This list is automatically generated from the titles and abstracts of the papers on this site.