The NiuTrans System for WNGT 2020 Efficiency Task
- URL: http://arxiv.org/abs/2109.08008v1
- Date: Thu, 16 Sep 2021 14:32:01 GMT
- Title: The NiuTrans System for WNGT 2020 Efficiency Task
- Authors: Chi Hu, Bei Li, Ye Lin, Yinqiao Li, Yanyang Li, Chenglong Wang, Tong
Xiao, Jingbo Zhu
- Abstract summary: This paper describes the submissions of the NiuTrans Team to the WNGT 2020 Efficiency Shared Task.
We focus on the efficient implementation of deep Transformer models using NiuTensor, a flexible toolkit for NLP tasks.
- Score: 32.88733142090084
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes the submissions of the NiuTrans Team to the WNGT 2020
Efficiency Shared Task. We focus on the efficient implementation of deep
Transformer models (Wang et al., 2019; Li et al., 2019) using
NiuTensor (https://github.com/NiuTrans/NiuTensor), a flexible toolkit for NLP
tasks. We explored the combination of deep encoder and shallow decoder in
Transformer models via model compression and knowledge distillation. The neural
machine translation decoding also benefits from FP16 inference, attention
caching, dynamic batching, and batch pruning. Our systems achieve promising
results in both translation quality and efficiency, e.g., our fastest system
can translate more than 40,000 tokens per second with an RTX 2080 Ti while
maintaining 42.9 BLEU on newstest2018. The code, models, and Docker
images are available at NiuTrans.NMT
(https://github.com/NiuTrans/NiuTrans.NMT).
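The techniques named above can be pictured with a short, hedged sketch. The PyTorch code below pairs a deep encoder with a one-layer decoder and runs FP16 greedy decoding; it is an illustration under assumed layer counts and vocabulary size, not the NiuTensor implementation, and a comment marks where a real system would add attention (key/value) caching.

```python
import torch
import torch.nn as nn

class ShallowDecoderNMT(nn.Module):
    """Deep encoder, shallow decoder; all sizes here are illustrative assumptions."""
    def __init__(self, vocab=32000, d_model=512, nhead=8,
                 enc_layers=12, dec_layers=1):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        enc = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        dec = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, enc_layers)   # deep stack
        self.decoder = nn.TransformerDecoder(dec, dec_layers)   # shallow stack
        self.out = nn.Linear(d_model, vocab)

    @torch.no_grad()
    def greedy_decode(self, src_ids, max_len=64, bos=1, eos=2):
        memory = self.encoder(self.embed(src_ids))               # encode once
        ys = torch.full((src_ids.size(0), 1), bos,
                        dtype=torch.long, device=src_ids.device)
        for _ in range(max_len):
            # A production decoder caches per-layer keys/values ("attention
            # caching") instead of re-running over the whole prefix as here.
            h = self.decoder(self.embed(ys), memory)
            next_tok = self.out(h[:, -1]).argmax(-1, keepdim=True)
            ys = torch.cat([ys, next_tok], dim=1)
            if (next_tok == eos).all():
                break
        return ys

device = "cuda" if torch.cuda.is_available() else "cpu"
model = ShallowDecoderNMT().to(device).eval()
if device == "cuda":
    model = model.half()                                         # FP16 inference
src = torch.randint(3, 32000, (8, 20), device=device)            # one batch
print(model.greedy_decode(src).shape)
```

The asymmetry pays off because the encoder runs once per sentence while the decoder runs once per generated token, so shrinking the decoder dominates the latency savings.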
Related papers
- Shallow Cross-Encoders for Low-Latency Retrieval [69.06104373460597]
Cross-Encoders based on large transformer models (such as BERT or T5) are computationally expensive and allow for scoring only a small number of documents within a reasonably small latency window.
We show that weaker shallow transformer models (i.e., transformers with a limited number of layers) actually perform better than full-scale models when constrained to these practical low-latency settings.
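To make the shallow cross-encoder idea concrete, here is a hedged PyTorch sketch: query and document tokens are concatenated and scored jointly by a transformer with only two layers. The vocabulary size, mean pooling, and linear scoring head are illustrative assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class ShallowCrossEncoder(nn.Module):
    def __init__(self, vocab=30522, d_model=256, layers=2, nhead=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        block = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(block, layers)   # shallow: 2 layers
        self.score = nn.Linear(d_model, 1)

    def forward(self, query_ids, doc_ids):
        pair = torch.cat([query_ids, doc_ids], dim=1)          # joint encoding
        h = self.encoder(self.embed(pair))
        return self.score(h.mean(dim=1)).squeeze(-1)           # mean-pooled score

ranker = ShallowCrossEncoder().eval()
q = torch.randint(0, 30522, (4, 8))     # e.g. one query repeated per candidate
d = torch.randint(0, 30522, (4, 64))    # 4 candidate documents
print(ranker(q, d))                      # one relevance score per pair
```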
arXiv Detail & Related papers (2024-03-29T15:07:21Z) - GTrans: Grouping and Fusing Transformer Layers for Neural Machine
Translation [107.2752114891855]
The Transformer architecture, built by stacking encoder and decoder layers, has driven significant progress in neural machine translation.
We propose the Group-Transformer model (GTrans) that flexibly divides multi-layer representations of both encoder and decoder into different groups and then fuses these group features to generate target words.
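A hedged sketch of the grouping-and-fusing idea: per-layer representations are split into groups, each group is pooled, and a learned softmax weighting fuses the group features. The group size and fusion rule below are assumptions for illustration, not GTrans's exact formulation.

```python
import torch
import torch.nn as nn

class GroupFusion(nn.Module):
    def __init__(self, num_layers=12, num_groups=3):
        super().__init__()
        assert num_layers % num_groups == 0
        self.group_size = num_layers // num_groups
        self.gate = nn.Parameter(torch.zeros(num_groups))        # fusion weights

    def forward(self, layer_outputs):
        # layer_outputs: list of (batch, seq, d_model) tensors, one per layer
        stacked = torch.stack(layer_outputs, dim=0)              # (L, B, T, D)
        groups = stacked.split(self.group_size, dim=0)           # L/G per group
        pooled = torch.stack([g.mean(dim=0) for g in groups])    # (G, B, T, D)
        weights = torch.softmax(self.gate, dim=0).view(-1, 1, 1, 1)
        return (weights * pooled).sum(dim=0)                     # fused (B, T, D)

fusion = GroupFusion()
outs = [torch.randn(2, 10, 512) for _ in range(12)]              # fake layer states
print(fusion(outs).shape)                                        # (2, 10, 512)
```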
arXiv Detail & Related papers (2022-07-29T04:10:36Z) - The YiTrans End-to-End Speech Translation System for IWSLT 2022 Offline
Shared Task [92.5087402621697]
This paper describes the submission of our end-to-end YiTrans speech translation system for the IWSLT 2022 offline task.
The YiTrans system is built on large-scale pre-trained encoder-decoder models.
Our final submissions rank first among end-to-end systems on English-German and English-Chinese in terms of the automatic evaluation metric.
arXiv Detail & Related papers (2022-06-12T16:13:01Z) - The NiuTrans Machine Translation Systems for WMT21 [23.121382706331403]
This paper describes NiuTrans neural machine translation systems of the WMT 2021 news translation tasks.
We made submissions to 9 language directions, including English↔{Chinese, Japanese, Russian, Icelandic} and English→Hausa tasks.
arXiv Detail & Related papers (2021-09-22T02:00:24Z) - The NiuTrans System for the WMT21 Efficiency Task [26.065244284992147]
This paper describes the NiuTrans system for the WMT21 translation efficiency task.
Our system can translate 247,000 words per second on an NVIDIA A100, being 3× faster than last year's system.
arXiv Detail & Related papers (2021-09-16T14:21:52Z) - The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline
Task [23.008938777422767]
This paper describes the submission of the NiuTrans end-to-end speech translation system for the IWSLT 2021 offline task.
We use a Transformer-based model architecture and enhance it with Conformer blocks, relative position encoding, and stacked acoustic and textual encoding.
We achieve 33.84 BLEU points on the MuST-C En-De test set, which shows the enormous potential of the end-to-end model.
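Of the techniques listed, relative position encoding is easy to sketch: a learned bias indexed by the (clipped) distance between positions is added to the attention logits. The clipping distance and the exact way the bias enters attention below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RelativePositionBias(nn.Module):
    def __init__(self, num_heads=8, max_distance=16):
        super().__init__()
        self.max_distance = max_distance
        self.bias = nn.Embedding(2 * max_distance + 1, num_heads)

    def forward(self, seq_len):
        pos = torch.arange(seq_len)
        rel = pos[None, :] - pos[:, None]                         # (T, T) offsets
        rel = rel.clamp(-self.max_distance, self.max_distance) + self.max_distance
        return self.bias(rel).permute(2, 0, 1)                    # (heads, T, T)

scores = torch.randn(8, 20, 20)                 # per-head attention logits
scores = scores + RelativePositionBias()(20)    # add relative-position bias
attn = scores.softmax(dim=-1)
print(attn.shape)                                # (8, 20, 20)
```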
arXiv Detail & Related papers (2021-07-06T07:45:23Z) - TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation [9.266588373318688]
We study the problem of improving efficiency in modeling global contexts without losing localization ability for low-level details.
We propose TransFuse, a novel two-branch architecture that combines Transformers and CNNs in a parallel style.
With TransFuse, both global dependency and low-level spatial details can be efficiently captured in a much shallower manner.
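A hedged PyTorch sketch of the two-branch idea: a small CNN branch keeps low-level spatial detail while a transformer branch over coarse patches captures global context, and a 1x1 convolution fuses the two feature maps. Channel counts, patch size, and the fusion operator are assumptions, not the paper's exact fusion module.

```python
import torch
import torch.nn as nn

class TwoBranchFusion(nn.Module):
    def __init__(self, channels=64, d_model=64, nhead=4):
        super().__init__()
        self.cnn = nn.Sequential(                                 # local-detail branch
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        block = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.transformer = nn.TransformerEncoder(block, num_layers=2)
        self.patch = nn.Conv2d(3, d_model, kernel_size=8, stride=8)  # patchify
        self.fuse = nn.Conv2d(channels + d_model, channels, 1)       # 1x1 fusion

    def forward(self, img):                        # img: (B, 3, H, W)
        local = self.cnn(img)                      # (B, C, H, W)
        tokens = self.patch(img)                   # (B, D, H/8, W/8)
        b, d, h, w = tokens.shape
        glob = self.transformer(tokens.flatten(2).transpose(1, 2))   # (B, hw, D)
        glob = glob.transpose(1, 2).view(b, d, h, w)
        glob = nn.functional.interpolate(glob, size=local.shape[-2:])
        return self.fuse(torch.cat([local, glob], dim=1))

net = TwoBranchFusion()
print(net(torch.randn(1, 3, 64, 64)).shape)        # (1, 64, 64, 64)
```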
arXiv Detail & Related papers (2021-02-16T08:09:45Z) - Glancing Transformer for Non-Autoregressive Neural Machine Translation [58.87258329683682]
We propose the Glancing Transformer (GLAT), which learns word interdependency for single-pass parallel generation models.
With only single-pass parallel decoding, GLAT generates high-quality translations with an 8-15 times speedup.
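For contrast with autoregressive search, a hedged sketch of single-pass parallel decoding follows: the decoder reads length-many position queries and predicts every target token in one forward pass. The length handling and decoder inputs are simplifications and do not include GLAT's glancing-sampling training.

```python
import torch
import torch.nn as nn

class ParallelDecoder(nn.Module):
    def __init__(self, vocab=32000, d_model=256, nhead=4, layers=4):
        super().__init__()
        self.src_embed = nn.Embedding(vocab, d_model)
        self.pos_embed = nn.Embedding(256, d_model)               # decoder queries
        enc = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        dec = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, layers)
        self.decoder = nn.TransformerDecoder(dec, layers)
        self.out = nn.Linear(d_model, vocab)

    @torch.no_grad()
    def translate(self, src_ids, tgt_len=20):
        memory = self.encoder(self.src_embed(src_ids))
        positions = torch.arange(tgt_len, device=src_ids.device)
        queries = self.pos_embed(positions)[None].expand(src_ids.size(0), -1, -1)
        hidden = self.decoder(queries, memory)                    # no causal mask
        return self.out(hidden).argmax(-1)                        # all tokens at once

model = ParallelDecoder().eval()
src = torch.randint(0, 32000, (2, 15))
print(model.translate(src).shape)                                 # (2, 20)
```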
arXiv Detail & Related papers (2020-08-18T13:04:03Z) - Very Deep Transformers for Neural Machine Translation [100.51465892354234]
We show that it is feasible to build standard Transformer-based models with up to 60 encoder layers and 12 decoder layers.
These deep models outperform their baseline 6-layer counterparts by as much as 2.5 BLEU.
arXiv Detail & Related papers (2020-08-18T07:14:54Z)
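To give a feel for the scale described above, the snippet below instantiates a stock PyTorch nn.Transformer with 60 encoder and 12 decoder layers and reports the parameter count; the hidden sizes are assumptions, and the initialization needed to train such depths stably is not shown.

```python
import torch.nn as nn

# Illustrative only: a 60-encoder-layer / 12-decoder-layer Transformer shell.
deep_model = nn.Transformer(
    d_model=512,
    nhead=8,
    num_encoder_layers=60,   # deep encoder
    num_decoder_layers=12,   # decoder kept much shallower
    dim_feedforward=2048,
)
params = sum(p.numel() for p in deep_model.parameters())
print(f"{params / 1e6:.1f}M parameters")
```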