Single Model Ensemble for Subword Regularized Models in Low-Resource Machine Translation
- URL: http://arxiv.org/abs/2203.13528v1
- Date: Fri, 25 Mar 2022 09:25:47 GMT
- Title: Single Model Ensemble for Subword Regularized Models in Low-Resource Machine Translation
- Authors: Sho Takase, Tatsuya Hiraoka, Naoaki Okazaki
- Abstract summary: Subword regularization uses multiple subword segmentations during training to improve the robustness of neural machine translation models.
We propose an inference strategy that addresses the discrepancy between training with multiple segmentations and inference with a single one.
Experimental results show that the proposed strategy improves the performance of models trained with subword regularization.
- Score: 25.04086897886412
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Subword regularization uses multiple subword segmentations during
training to improve the robustness of neural machine translation models.
However, previous subword regularization methods use multiple segmentations
only during training and rely on a single segmentation at inference time. In
this study, we propose an inference strategy to address this discrepancy. The
proposed strategy approximates the marginalized likelihood using multiple
segmentations: the most plausible segmentation together with several sampled
segmentations. Because the proposed strategy aggregates predictions from
several segmentations, we can regard it as a single-model ensemble that
requires no additional training cost. Experimental results show that the
proposed strategy improves the performance of models trained with subword
regularization in low-resource machine translation tasks.
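The strategy fits in a few lines of code. The snippet below is a minimal sketch, not the authors' implementation: it assumes a trained SentencePiece model file ("spm.model") and a hypothetical `score(src_ids, tgt_ids)` function that returns the NMT model's log-likelihood of a candidate translation. It averages the probabilities assigned under the 1-best segmentation and several sampled segmentations, which is the single-model ensemble described above.

```python
import math
import sentencepiece as spm

# A minimal sketch, assuming a trained SentencePiece model and a
# user-supplied `score` function (hypothetical) that returns the NMT
# model's log-likelihood for a (source segmentation, target) pair.
sp = spm.SentencePieceProcessor(model_file="spm.model")

def segmentations(sentence, k, alpha=0.2):
    """Return the most plausible segmentation plus k sampled ones."""
    segs = [sp.encode(sentence)]  # 1-best (Viterbi) segmentation
    for _ in range(k):
        # subword-regularization-style sampling, as in SentencePiece
        segs.append(sp.encode(sentence, enable_sampling=True,
                              alpha=alpha, nbest_size=-1))
    return segs

def ensemble_log_prob(score, src_sentence, tgt_ids, k=4):
    """Approximate the marginalized likelihood by averaging probabilities
    over several source segmentations (a single-model ensemble)."""
    logps = [score(src, tgt_ids) for src in segmentations(src_sentence, k)]
    m = max(logps)  # log-sum-exp trick for numerical stability
    return m + math.log(sum(math.exp(lp - m) for lp in logps) / len(logps))
```

Because only decoding changes, this adds no training cost; the number of sampled segmentations `k` trades translation quality against inference time.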
Related papers
- Universal Segmentation at Arbitrary Granularity with Language Instruction [59.76130089644841]
We present UniLSeg, a universal segmentation model that can perform segmentation at any semantic level with the guidance of language instructions.
For training UniLSeg, we reorganize a group of tasks from originally diverse distributions into a unified data format, where images paired with texts describing the segmentation targets are taken as input and the corresponding masks are output.
arXiv Detail & Related papers (2023-12-04T04:47:48Z)
- SelfSeg: A Self-supervised Sub-word Segmentation Method for Neural Machine Translation [51.881877192924414]
Sub-word segmentation is an essential pre-processing step for Neural Machine Translation (NMT).
This paper introduces SelfSeg, a self-supervised neural sub-word segmentation method.
SelfSeg is much faster to train/decode and requires only monolingual dictionaries instead of parallel corpora.
arXiv Detail & Related papers (2023-07-31T04:38:47Z)
- Diffusion Models for Open-Vocabulary Segmentation [79.02153797465324]
OVDiff is a novel method that leverages generative text-to-image diffusion models for unsupervised open-vocabulary segmentation.
It relies solely on pre-trained components and outputs the synthesised segmenter directly, without training.
arXiv Detail & Related papers (2023-06-15T17:51:28Z)
- Learning Context-aware Classifier for Semantic Segmentation [88.88198210948426]
In this paper, contextual hints are exploited by learning a context-aware classifier.
Our method is model-agnostic and can be easily applied to generic segmentation models.
With only negligible additional parameters and +2% inference time, it achieves decent performance gains on both small and large models.
arXiv Detail & Related papers (2023-03-21T07:00:35Z)
- Meta-learning Pathologies from Radiology Reports using Variance Aware Prototypical Networks [3.464871689508835]
We propose a simple extension of the Prototypical Networks for few-shot text classification.
Our main idea is to replace the class prototypes with Gaussians and to introduce a regularization term that encourages the examples to cluster near the appropriate class centroids.
arXiv Detail & Related papers (2022-10-22T05:22:29Z)
- Subword Segmental Language Modelling for Nguni Languages [7.252933737829635]
The subword segmental language model (SSLM) learns how to segment words while being trained for autoregressive language modelling.
We train our model on the 4 Nguni languages of South Africa.
Our results show that learning subword segmentation is an effective alternative to existing subword segmenters.
arXiv Detail & Related papers (2022-10-12T18:41:00Z)
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z)
- The Effectiveness of Morphology-aware Segmentation in Low-Resource Neural Machine Translation [0.6091702876917281]
This paper evaluates the performance of several modern subword segmentation methods in a low-resource neural machine translation setting.
We compare segmentations produced by applying BPE at the token or sentence level with morphologically-based segmentations from LMVR and MORSEL.
arXiv Detail & Related papers (2021-03-20T14:39:25Z)
- Multi-view Subword Regularization [111.04350390045705]
Multi-view Subword Regularization (MVR) enforces consistency between the predictions made on inputs tokenized by the standard segmentation and by probabilistic (sampled) segmentations.
Results on the XTREME multilingual benchmark show that MVR brings consistent improvements of up to 2.5 points over using standard segmentation algorithms.
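As a rough illustration of this consistency objective, the PyTorch sketch below combines a standard cross-entropy term with a symmetric KL term that pulls the two views' predictions together; the tensor names and the weighting factor `lam` are illustrative assumptions, not MVR's exact formulation.

```python
import torch.nn.functional as F

def mvr_style_loss(logits_std, logits_samp, labels, lam=0.5):
    """Cross-entropy on the standard-tokenization view plus a symmetric-KL
    consistency term between the two tokenization views (a sketch)."""
    ce = F.cross_entropy(logits_std, labels)
    p = F.log_softmax(logits_std, dim=-1)   # standard segmentation view
    q = F.log_softmax(logits_samp, dim=-1)  # sampled segmentation view
    consistency = 0.5 * (
        F.kl_div(q, p, log_target=True, reduction="batchmean")
        + F.kl_div(p, q, log_target=True, reduction="batchmean"))
    return ce + lam * consistency
```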
arXiv Detail & Related papers (2021-03-15T16:07:42Z)
- Adversarial Subword Regularization for Robust Neural Machine Translation [23.968624881678913]
Exposing diverse subword segmentations to neural machine translation (NMT) models often improves the robustness of machine translation.
We present adversarial subword regularization (ADVSR) to study whether gradient signals during training can be a substitute criterion for exposing diverse subword segmentations.
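A hedged sketch of the underlying idea: among several sampled segmentations, expose the one the model currently handles worst. Here the per-candidate loss stands in for ADVSR's gradient-based criterion, and `sample_fn`/`loss_fn` are illustrative placeholders rather than the paper's API.

```python
def adversarial_segmentation(sentence, sample_fn, loss_fn, k=8):
    """Pick, among k sampled segmentations, the one with the highest
    training loss (a loss-based proxy for the gradient criterion)."""
    candidates = [sample_fn(sentence) for _ in range(k)]
    return max(candidates, key=loss_fn)
```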
arXiv Detail & Related papers (2020-04-29T12:06:42Z)