Modeling Voting for System Combination in Machine Translation
- URL: http://arxiv.org/abs/2007.06943v1
- Date: Tue, 14 Jul 2020 09:59:38 GMT
- Title: Modeling Voting for System Combination in Machine Translation
- Authors: Xuancheng Huang, Jiacheng Zhang, Zhixing Tan, Derek F. Wong, Huanbo
Luan, Jingfang Xu, Maosong Sun, Yang Liu
- Abstract summary: We propose an approach to modeling voting for system combination in machine translation.
Our approach combines the advantages of statistical and neural methods since it can not only analyze the relations between hypotheses but also allow for end-to-end training.
- Score: 92.09572642019145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: System combination is an important technique for combining the hypotheses of
different machine translation systems to improve translation performance.
Although early statistical approaches to system combination have been proven
effective in analyzing the consensus between hypotheses, they suffer from the
error propagation problem due to the use of pipelines. While this problem has
been alleviated by end-to-end training of multi-source sequence-to-sequence
models recently, these neural models do not explicitly analyze the relations
between hypotheses and fail to capture their agreement because the attention to
a word in a hypothesis is calculated independently, ignoring the fact that the
word might occur in multiple hypotheses. In this work, we propose an approach
to modeling voting for system combination in machine translation. The basic
idea is to enable words in hypotheses from different systems to vote on words
that are representative and should get involved in the generation process. This
can be done by quantifying the influence of each voter and its preference for
each candidate. Our approach combines the advantages of statistical and neural
methods since it can not only analyze the relations between hypotheses but also
allow for end-to-end training. Experiments show that our approach is capable of
better taking advantage of the consensus between hypotheses and achieves
significant improvements over state-of-the-art baselines on Chinese-English and
English-German machine translation tasks.
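To make the voting idea concrete, here is a minimal sketch, assuming simple dot-product similarities, of how words from several hypotheses might vote on candidate words before attention is applied. This is not the paper's exact formulation; the `voting_attention` function, the `w_influence` projection, and all shapes are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def voting_attention(query, hypotheses, w_influence):
    """Illustrative voting-based attention over multiple hypotheses.

    query:        (d,) decoder state attending to the hypotheses
    hypotheses:   list of (len_k, d) word-embedding matrices, one per system
    w_influence:  (d,) assumed projection scoring each voter's influence
    """
    voters = np.concatenate(hypotheses, axis=0)       # (V, d): every word is a voter
    influence = softmax(voters @ w_influence)         # (V,): how much each vote counts
    preference = softmax(voters @ voters.T, axis=-1)  # (V, V): voter -> candidate
    # Accumulate votes: a word occurring in several hypotheses gathers
    # support from all of its occurrences instead of being scored alone.
    votes = influence @ preference                    # (V,)
    attn = softmax(query @ voters.T) * votes          # bias attention by the votes
    attn = attn / attn.sum()
    return attn @ voters                              # (d,) context vector for generation
```

By contrast, a plain multi-source attention would use `softmax(query @ voters.T)` alone, scoring each occurrence independently; the `votes` term is what lets repeated words across hypotheses reinforce each other.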
Related papers
- Word-wise intonation model for cross-language TTS systems [0.0]
The proposed model is suitable for automatic data markup and can be extended to text-to-speech systems.
The key idea is a partial elimination of the variability connected with different placements of a stressed syllable in a word.
The proposed model could be used as a tool for intonation research or as a backbone for prosody description in text-to-speech systems.
arXiv Detail & Related papers (2024-09-30T15:09:42Z)
- Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary [65.268245109828]
Non-human-like behaviour of contemporary pre-trained language models (PLMs) is a major factor undermining their trustworthiness.
A striking phenomenon is the generation of inconsistent predictions, which produces contradictory results.
We propose a practical approach that alleviates the inconsistent behaviour issue by improving PLM awareness.
arXiv Detail & Related papers (2023-10-24T06:15:15Z)
- Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important for forecasting nonstationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proven that this structured model can efficiently interpolate the underlying tessellation and approximate the multiple-hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions [9.909170013118775]
This work presents a linear decomposition of final hidden states from autoregressive language models based on each initial input token.
Using the change in next-word probability as a measure of importance, this work first examines which context words make the biggest contribution to language model predictions.
arXiv Detail & Related papers (2023-05-17T23:55:32Z)
- An Interpretable Neuro-Symbolic Reasoning Framework for Task-Oriented Dialogue Generation [21.106357884651363]
We introduce a neuro-symbolic framework to perform explicit reasoning that justifies model decisions with reasoning chains.
We propose a two-phase approach that consists of a hypothesis generator and a reasoner.
The whole system is trained by exploiting raw textual dialogues without using any reasoning chain annotations.
arXiv Detail & Related papers (2022-03-11T10:44:08Z)
- Interactive Model with Structural Loss for Language-based Abductive Reasoning [36.02450824915494]
The abductive natural language inference task ($\alpha$NLI) is proposed to infer the most plausible explanation connecting the cause and the event.
We name this new model for $\alpha$NLI the Interactive Model with Structural Loss (IMSL).
Our IMSL achieves the highest performance with the RoBERTa-large pretrained model, improving ACC and AUC by about 1% and 5%, respectively.
arXiv Detail & Related papers (2021-12-01T05:21:07Z)
- On Sampling-Based Training Criteria for Neural Language Modeling [97.35284042981675]
We consider Monte Carlo sampling, importance sampling, a novel method we call compensated partial summation, and noise contrastive estimation.
We show that all these sampling methods can perform equally well, as long as we correct for the intended class posterior probabilities.
Experimental results in language modeling and automatic speech recognition on Switchboard and LibriSpeech support our claim.
arXiv Detail & Related papers (2021-04-21T12:55:52Z)
- Do all Roads Lead to Rome? Understanding the Role of Initialization in Iterative Back-Translation [48.26374127723598]
Back-translation is an approach to exploiting monolingual corpora in Neural Machine Translation (NMT).
In this paper, we analyze the role that pre-training plays in iterative back-translation.
We show that, although the quality of the initial system does affect final performance, its effect is relatively small.
arXiv Detail & Related papers (2020-02-28T17:05:55Z)