ZEN 2.0: Continue Training and Adaption for N-gram Enhanced Text
Encoders
- URL: http://arxiv.org/abs/2105.01279v1
- Date: Tue, 4 May 2021 04:08:58 GMT
- Title: ZEN 2.0: Continue Training and Adaption for N-gram Enhanced Text
Encoders
- Authors: Yan Song, Tong Zhang, Yonggang Wang, Kai-Fu Lee
- Abstract summary: We propose to pre-train n-gram-enhanced encoders with a large volume of data and advanced techniques for training.
New state-of-the-art performance is observed from a long list of NLP tasks across languages and domains.
- Score: 32.53471313532653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained text encoders have drawn sustaining attention in natural language
processing (NLP) and shown their capability in obtaining promising results in
different tasks. Recent studies illustrated that external self-supervised
signals (or knowledge extracted by unsupervised learning, such as n-grams) are
beneficial to provide useful semantic evidence for understanding languages such
as Chinese, so as to improve the performance on various downstream tasks
accordingly. To further enhance the encoders, in this paper, we propose to
pre-train n-gram-enhanced encoders with a large volume of data and advanced
techniques for training. Moreover, we extend the encoder to different
languages as well as different domains, confirming that the same
architecture is applicable to these varying circumstances, and we observe
new state-of-the-art performance on a long list of NLP tasks across
languages and domains.
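The core idea behind n-gram-enhanced encoders such as ZEN is to match the input against a lexicon of salient n-grams so the encoder can attend to these multi-token units alongside individual tokens. The following is a minimal sketch of that matching step only; the lexicon, the example sentence, and the function name are illustrative assumptions, not details taken from the paper.

```python
def match_ngrams(tokens, lexicon, max_n=3):
    """Return (start, end, ngram) spans for every lexicon n-gram in tokens.

    These spans are what an n-gram-enhanced encoder would feed into an
    auxiliary n-gram encoding stream alongside the token sequence.
    """
    spans = []
    for start in range(len(tokens)):
        for n in range(2, max_n + 1):  # consider n-grams of length >= 2
            end = start + n
            if end > len(tokens):
                break
            ngram = " ".join(tokens[start:end])
            if ngram in lexicon:
                spans.append((start, end, ngram))
    return spans

# Toy example: overlapping n-grams are both kept, so the encoder can
# weigh competing segmentations itself.
lexicon = {"natural language", "language processing"}
tokens = "advances in natural language processing".split()
print(match_ngrams(tokens, lexicon))
# → [(2, 4, 'natural language'), (3, 5, 'language processing')]
```

In ZEN-style models the matched spans are not merged into the token sequence; they are encoded in a separate stream whose representations are added to the overlapping token positions, which is why overlapping matches are deliberately retained here.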
Related papers
- TG-LLaVA: Text Guided LLaVA via Learnable Latent Embeddings [61.9257731511557]
We propose Text Guided LLaVA (TG-LLaVA) to optimize vision-language models (VLMs).
We use learnable latent embeddings as a bridge to analyze textual instruction and add the analysis results to the vision encoder as guidance.
With the guidance of text, the vision encoder can extract text-related features, similar to how humans focus on the most relevant parts of an image when considering a question.
arXiv Detail & Related papers (2024-09-15T00:38:34Z)
- Improving Multilingual Neural Machine Translation by Utilizing Semantic and Linguistic Features [18.76505158652759]
We propose to exploit both semantic and linguistic features between multiple languages to enhance multilingual translation.
On the encoder side, we introduce a disentangling learning task that aligns encoder representations by disentangling semantic and linguistic features.
On the decoder side, we leverage a linguistic encoder to integrate low-level linguistic features to assist in the target language generation.
arXiv Detail & Related papers (2024-08-02T17:10:12Z)
- T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text [59.57676466961787]
We propose a novel dynamic vector quantization (DVA-VAE) model that can adjust the encoding length based on the information density in sign language.
Experiments conducted on the PHOENIX14T dataset demonstrate the effectiveness of our proposed method.
We propose a new large German sign language dataset, PHOENIX-News, which contains 486 hours of sign language videos, audio, and transcription texts.
arXiv Detail & Related papers (2024-06-11T10:06:53Z)
- XDBERT: Distilling Visual Information to BERT from Cross-Modal Systems to Improve Language Understanding [73.24847320536813]
This study explores distilling visual information from pretrained multimodal transformers to pretrained language encoders.
Our framework is inspired by cross-modal encoders' success in visual-language tasks while we alter the learning objective to cater to the language-heavy characteristics of NLU.
arXiv Detail & Related papers (2022-04-15T03:44:00Z)
- Explore More Guidance: A Task-aware Instruction Network for Sign Language Translation Enhanced with Data Augmentation [20.125265661134964]
Sign language recognition and translation systems first use a recognition module to generate glosses from sign language videos.
In this work, we propose a task-aware instruction network, namely TIN-SLT, for sign language translation.
arXiv Detail & Related papers (2022-04-12T17:09:44Z)
- Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chinese Question Matching [0.0]
Two DA programs produce augmented texts via five simple text-editing operations.
One is enhanced with an n-gram language model so that it incorporates extra linguistic knowledge.
Models trained on both types of augmented training sets were outperformed by those trained directly on the corresponding un-augmented training sets.
arXiv Detail & Related papers (2021-11-29T17:07:49Z)
- Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation [127.54315184545796]
Speech translation (ST) aims to learn transformations from speech in the source language to the text in the target language.
We propose to improve the multitask ST model by utilizing word embedding as the intermediate.
arXiv Detail & Related papers (2020-05-21T14:22:35Z)
- Bi-Decoder Augmented Network for Neural Machine Translation [108.3931242633331]
We propose a novel Bi-Decoder Augmented Network (BiDAN) for the neural machine translation task.
Since each decoder transforms the representations of the input text into its corresponding language, joint training with two target ends gives the shared encoder the potential to produce a language-independent semantic space.
arXiv Detail & Related papers (2020-01-14T02:05:14Z)
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer [64.22926988297685]
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).
In this paper, we explore the landscape of introducing transfer learning techniques for NLP by a unified framework that converts all text-based language problems into a text-to-text format.
arXiv Detail & Related papers (2019-10-23T17:37:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.