Fugu-MT Paper Translation (Abstract): On the Pareto Front of Multilingual Neural Machine Translation

Paper overview: On the Pareto Front of Multilingual Neural Machine Translation

arxiv url: http://arxiv.org/abs/2304.03216v3
Date: Tue, 31 Oct 2023 15:58:53 GMT
Status: Translation complete
Last updated in system: 2023-11-02 03:18:48.689739
Title: On the Pareto Front of Multilingual Neural Machine Translation
Title (reference translation): On the Pareto Front of Multilingual Neural Machine Translation
Authors: Liang Chen and Shuming Ma and Dongdong Zhang and Furu Wei and Baobao Chang
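Abstract summary: We study how the performance of a given direction changes with its sampling ratio in Multilingual Neural Machine Translation (MNMT). We propose the Double Power Law to predict the unique performance trade-off front in MNMT. In our experiments, the approach outperforms temperature searching and gradient manipulation methods while using only 1/5 to 1/2 of the total training budget.
Reference score (independently computed attention score): 123.94355117635293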
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this work, we study how the performance of a given direction changes with its sampling ratio in Multilingual Neural Machine Translation (MNMT). By training over 200 multilingual models with various model sizes, data sizes, and language directions, we find that the performance of a given translation direction does not always improve as its weight in the multi-task optimization objective increases. Accordingly, the scalarization method leads to a multitask trade-off front that deviates from the traditional Pareto front when there is data imbalance in the training corpus, which makes it considerably harder to improve the overall performance of all directions. Based on our observations, we propose the Double Power Law to predict the unique performance trade-off front in MNMT; the law is robust across languages, levels of data adequacy, and numbers of tasks. Finally, we formulate the sample ratio selection problem in MNMT as an optimization problem based on the Double Power Law. In our experiments, it achieves better performance than temperature searching and gradient manipulation methods with only 1/5 to 1/2 of the total training budget. We release the code at https://github.com/pkunlp-icler/ParetoMNMT for reproduction.
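Abstract (reference translation): This work examines how the performance of a given direction changes with its sampling ratio in Multilingual Neural Machine Translation (MNMT). Training more than 200 multilingual models across various model sizes, data sizes, and language directions shows that the performance of a particular translation direction does not always improve when its weight in the multi-task optimization objective is increased. As a result, when the training corpus is imbalanced, the scalarization method produces a multitask trade-off front that deviates from the traditional Pareto front, which makes improving the overall performance of all directions a major challenge. Based on these observations, the authors propose the Double Power Law to predict the unique performance trade-off front in MNMT; it is robust across languages, data adequacy, and the number of tasks. Finally, the sample ratio selection problem in MNMT is formulated as an optimization problem based on the Double Power Law. In the experiments, it achieves better performance than temperature searching and gradient manipulation with only about 1/5 to 1/2 of the total training budget. The code is available at https://github.com/pkunlp-icler/paretomnmt.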
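For readers unfamiliar with the "sampling ratio" and the temperature-searching baseline mentioned in the abstract, the following is a minimal sketch of the widely used temperature-based sampling heuristic, in which the ratio for each direction is proportional to its share of the data raised to the power 1/T. It is an illustration only and is not taken from the paper or its repository: the function name, the language pairs, and the corpus sizes are all hypothetical, and the Double Power Law itself is not reproduced here.

```python
def temperature_sampling_ratios(sizes, temperature):
    """Return sampling ratios p_i proportional to (n_i / N) ** (1 / T).

    sizes: dict mapping a translation direction to its number of training pairs.
    temperature: T > 0. T = 1 samples proportionally to data size;
    larger T pushes the ratios toward uniform.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    total = sum(sizes.values())
    # Unnormalized weights: each direction's data share, flattened by 1/T.
    weights = {d: (n / total) ** (1.0 / temperature) for d, n in sizes.items()}
    norm = sum(weights.values())
    return {d: w / norm for d, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical corpus sizes (sentence pairs), chosen only for illustration.
    sizes = {"en-fr": 1_000_000, "en-de": 100_000, "en-tr": 10_000}
    for t in (1, 5, 100):
        ratios = temperature_sampling_ratios(sizes, t)
        print(t, {d: round(p, 3) for d, p in ratios.items()})
```

Temperature searching, in this sketch, amounts to training one model per candidate T and keeping the best. That repeated training is the cost the abstract compares against when it reports using 1/5 to 1/2 of the total training budget.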
