Aligning a medium-size GPT model in English to a small closed domain in Spanish
- URL: http://arxiv.org/abs/2303.17649v3
- Date: Tue, 30 May 2023 21:48:49 GMT
- Title: Aligning a medium-size GPT model in English to a small closed domain in Spanish
- Authors: Oscar R. Navarrete-Parra, Victor Uc-Cetina, Jorge Reyes-Magana
- Abstract summary: We propose a methodology to align a medium-sized GPT model, originally trained in English for an open domain, to a small closed domain in Spanish.
We also needed to train and implement another neural network that could score and determine whether an answer is appropriate for a given question.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a methodology to align a medium-sized GPT model,
originally trained in English for an open domain, to a small closed domain in Spanish.
The application for which the model is fine-tuned is the question answering task.
To achieve this, we also needed to train and implement another neural network (which
we called the reward model) that could score and determine whether an answer is
appropriate for a given question. This component served to improve the decoding and
generation of the system's answers. Numerical metrics such as BLEU and perplexity were
used to evaluate the model, and human judgment was also used to compare the decoding
technique with others. Finally, the results favored the proposed method, and it was
determined that it is feasible to use a reward model to align the generation of
responses.
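The paper itself does not include code here, but the core mechanism described in the abstract, sampling several candidate answers and letting a learned reward model pick the most appropriate one, can be sketched roughly as follows. The checkpoint names, scoring head, and sampling settings below are illustrative assumptions, not the authors' implementation.

```python
# Rough sketch (assumptions): rerank sampled answers with a learned reward model.
# The checkpoint names are placeholders, not the models used in the paper.
import torch
from transformers import (AutoModelForCausalLM,
                          AutoModelForSequenceClassification, AutoTokenizer)

GEN_NAME = "your-org/gpt-spanish-domain-qa"   # hypothetical fine-tuned GPT checkpoint
RM_NAME = "your-org/answer-reward-model"      # hypothetical reward-model checkpoint

gen_tok = AutoTokenizer.from_pretrained(GEN_NAME)
generator = AutoModelForCausalLM.from_pretrained(GEN_NAME)
rm_tok = AutoTokenizer.from_pretrained(RM_NAME)
reward_model = AutoModelForSequenceClassification.from_pretrained(RM_NAME, num_labels=1)

def answer(question: str, n_candidates: int = 8, max_new_tokens: int = 64) -> str:
    """Sample several answers and return the one the reward model scores highest."""
    inputs = gen_tok(question, return_tensors="pt")
    with torch.no_grad():
        outputs = generator.generate(
            **inputs,
            do_sample=True,
            top_p=0.95,
            num_return_sequences=n_candidates,
            max_new_tokens=max_new_tokens,
            pad_token_id=gen_tok.eos_token_id,
        )
    prompt_len = inputs["input_ids"].shape[1]
    candidates = [gen_tok.decode(o[prompt_len:], skip_special_tokens=True) for o in outputs]

    # Score each (question, answer) pair; a higher score means a more appropriate answer.
    scores = []
    for cand in candidates:
        enc = rm_tok(question, cand, return_tensors="pt", truncation=True)
        with torch.no_grad():
            scores.append(reward_model(**enc).logits.squeeze().item())
    return candidates[max(range(len(candidates)), key=lambda i: scores[i])]
```

Reranking is only one way to use such a scorer; the same reward model could also drive reinforcement-learning fine-tuning of the generator.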
Related papers
- Optimal Design for Reward Modeling in RLHF [83.3614658277817]
We formalize reward model training in Reinforcement Learning from Human Feedback.
We frame the selection of an effective dataset as a simple regret minimization task.
We derive bounds on the simple regret under appropriate assumptions.
arXiv Detail & Related papers (2024-10-22T14:36:44Z)
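As background for the entry above (and for the reward model used in the main paper), reward models in RLHF are typically trained on preference pairs with a pairwise logistic (Bradley-Terry) loss. A minimal sketch of that standard loss, not taken from either paper:

```python
# Minimal sketch of the standard pairwise (Bradley-Terry) reward-model loss used in RLHF.
import torch
import torch.nn.functional as F

def pairwise_reward_loss(chosen_scores: torch.Tensor,
                         rejected_scores: torch.Tensor) -> torch.Tensor:
    """-log sigmoid(r(x, y_chosen) - r(x, y_rejected)), averaged over the batch."""
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()

# Example with scalar scores produced by a reward model for two preference pairs.
loss = pairwise_reward_loss(torch.tensor([1.2, 0.3]), torch.tensor([0.1, 0.5]))
```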
- CANTONMT: Investigating Back-Translation and Model-Switch Mechanisms for Cantonese-English Neural Machine Translation [9.244878233604819]
This paper investigates the development and evaluation of machine translation models from Cantonese to English.
A new parallel corpus has been created by combining different available corpora online with preprocessing and cleaning.
A monolingual Cantonese dataset has been created through web scraping to aid the synthetic parallel corpus generation.
arXiv Detail & Related papers (2024-05-13T20:37:04Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- A Simple Baseline for Beam Search Reranking [42.416019490068614]
We examine a simple approach for training rerankers to predict translation candidates' BLEU scores without introducing additional data or parameters.
Our approach can be used as a clean baseline, decoupled from external factors, for future research in this area.
arXiv Detail & Related papers (2022-12-17T18:22:20Z)
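A rough illustration of the idea in the entry above: score each beam candidate with a model trained to predict its sentence-level BLEU, then pick the highest-scoring candidate. The paper reuses the translation model itself rather than adding data or parameters; the generic `scorer` callable below is an assumption for illustration only.

```python
# Rough sketch (assumptions): rerank beam candidates with a predictor of sentence-level BLEU.
import torch
from sacrebleu.metrics import BLEU

bleu = BLEU(effective_order=True)

def bleu_targets(candidates: list[str], reference: str) -> torch.Tensor:
    """Sentence-level BLEU of each candidate against the reference, as regression targets."""
    return torch.tensor([bleu.sentence_score(c, [reference]).score / 100.0 for c in candidates])

def rerank(candidates: list[str], scorer) -> str:
    """Pick the candidate whose predicted BLEU is highest; `scorer` is the trained reranker."""
    with torch.no_grad():
        predicted = torch.tensor([scorer(c) for c in candidates])
    return candidates[int(predicted.argmax())]
```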
- Exploring validation metrics for offline model-based optimisation with diffusion models [50.404829846182764]
In model-based optimisation (MBO) we are interested in using machine learning to design candidates that maximise some measure of reward with respect to a black box function called the (ground truth) oracle.
While an approximation to the ground truth oracle can be trained and used in place of it during model validation to measure the mean reward over generated candidates, the evaluation is approximate and vulnerable to adversarial examples.
This is encapsulated under our proposed evaluation framework which is also designed to measure extrapolation.
arXiv Detail & Related papers (2022-11-19T16:57:37Z)
- Adapting the Mean Teacher for keypoint-based lung registration under geometric domain shifts [75.51482952586773]
Deep neural networks generally require large amounts of labeled training data and are vulnerable to domain shifts between training and test data.
We present a novel approach to geometric domain adaptation for image registration, adapting a model from a labeled source to an unlabeled target domain.
Our method consistently improves on the baseline model by 50%/47% and even matches the accuracy of models trained on target data.
arXiv Detail & Related papers (2022-07-01T12:16:42Z)
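For context on the Mean Teacher mechanism referenced above: the teacher network is an exponential moving average (EMA) of the student's weights and provides pseudo-targets on the unlabeled target domain. A generic sketch of the EMA update, not the paper's registration-specific pipeline:

```python
# Generic Mean Teacher weight update: the teacher tracks an EMA of the student (sketch).
import torch

@torch.no_grad()
def update_teacher(teacher: torch.nn.Module, student: torch.nn.Module, alpha: float = 0.999):
    """Move each teacher parameter toward the corresponding student parameter."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(alpha).add_(s_param, alpha=1.0 - alpha)
```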
- Fast Text-Only Domain Adaptation of RNN-Transducer Prediction Network [0.0]
We show that RNN-transducer models can be effectively adapted to new domains using only small amounts of textual data.
We show with multiple ASR evaluation tasks how this method can provide relative gains of 10-45% in target task WER.
arXiv Detail & Related papers (2021-04-22T15:21:41Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
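Dynamic Blocking, named in the entry above, is reported to discourage verbatim copying of the source during decoding: when the decoder has just emitted a token that also appears in the source, the token that follows it in the source is blocked at the next step. The helper below is a heavily simplified reading of that idea; the blocking probability and how the ban is applied to the logits are assumptions, not the authors' exact algorithm.

```python
# Simplified sketch (assumptions) of Dynamic Blocking: ban the source token that would
# continue a copied span, so the decoder is pushed toward paraphrasing.
import random

def blocked_next_tokens(source_ids: list[int], last_generated_id: int,
                        block_prob: float = 0.5) -> set[int]:
    """Token ids to ban at the next decoding step, given the token just produced."""
    banned: set[int] = set()
    for i, tok in enumerate(source_ids[:-1]):
        if tok == last_generated_id and random.random() < block_prob:
            banned.add(source_ids[i + 1])  # block the continuation of this source span
    return banned
```

During sampling, the logits of the banned token ids would be set to negative infinity before drawing the next token.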
- Model family selection for classification using Neural Decision Trees [4.286327408435937]
In this paper we propose a method to reduce the scope of exploration needed for the task.
The idea is to quantify how far one would need to depart from trained instances of a given family, called reference models (RMs), which carry 'rigid' decision boundaries.
arXiv Detail & Related papers (2020-06-20T01:27:01Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.