INarIG: Iterative Non-autoregressive Instruct Generation Model For
Word-Level Auto Completion
- URL: http://arxiv.org/abs/2311.18200v1
- Date: Thu, 30 Nov 2023 02:39:38 GMT
- Title: INarIG: Iterative Non-autoregressive Instruct Generation Model For
Word-Level Auto Completion
- Authors: Hengchao Shang, Zongyao Li, Daimeng Wei, Jiaxin Guo, Minghan Wang,
Xiaoyu Chen, Lizhi Lei, Hao Yang
- Abstract summary: Word-Level Auto Completion (WLAC) predicts a target word given a source sentence, translation context, and a human-typed character sequence.
We propose the INarIG (Iterative Non-autoregressive Instruct Generation) model, which constructs the human-typed sequence into an Instruction Unit.
Our model is more competent in dealing with low-frequency words and achieves state-of-the-art results on the WMT22 benchmark datasets, with a maximum increase of over 10% in prediction accuracy.
- Score: 11.72797729874854
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Computer-aided translation (CAT) aims to enhance human translation efficiency
and is still important in scenarios where machine translation cannot meet
quality requirements. One fundamental task within this field is Word-Level Auto
Completion (WLAC). WLAC predicts a target word given a source sentence,
translation context, and a human-typed character sequence. Previous works
either employ word classification models to exploit contextual information from
both sides of the target word or directly disregard the dependencies on the
right-side context. Furthermore, the key information, i.e., the human-typed
sequence, is only used as a prefix constraint in the decoding module. In this
paper, we propose the INarIG (Iterative Non-autoregressive Instruct Generation)
model, which constructs the human-typed sequence into an Instruction Unit and
employs iterative decoding with subwords to fully utilize the input information
given in the task. Our model is more competent in dealing with low-frequency
words (the core scenario of this task) and achieves state-of-the-art results on
the WMT22 benchmark datasets, with a maximum increase of over 10% in
prediction accuracy.
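Since the abstract compresses the method into two clauses (packing the human-typed sequence into an Instruction Unit, and iterative decoding with subwords), a minimal sketch may help make the flow concrete. Everything below is an assumption-laden illustration: the special tokens, the Instruction Unit layout, and the `fill_masks` callable standing in for the model are hypothetical, and the paper's actual input format and refinement schedule may differ.

```python
# Hedged sketch of WLAC input construction plus iterative
# non-autoregressive (mask-predict style) decoding.
MASK, SEP, INSTR, INSTR_END = "[MASK]", "<sep>", "<instr>", "</instr>"


def build_instruction_unit(typed_chars: str, max_subwords: int = 4) -> list[str]:
    """Pack the human-typed characters and masked subword slots into one unit."""
    # The typed characters act as an explicit instruction ("the target word
    # starts with these characters"); the masks are the subword slots to fill.
    return [INSTR, *typed_chars, SEP, *([MASK] * max_subwords), INSTR_END]


def build_input(source: str, left_ctx: str, right_ctx: str, typed_chars: str) -> list[str]:
    """Concatenate source sentence, left context, Instruction Unit, and right context."""
    return (source.split() + [SEP] + left_ctx.split()
            + build_instruction_unit(typed_chars) + right_ctx.split())


def iterative_decode(fill_masks, tokens: list[str], iterations: int = 3) -> list[str]:
    """Mask-predict style refinement over the masked subword slots."""
    for step in range(iterations):
        slots = [i for i, t in enumerate(tokens) if t == MASK]
        if not slots:
            break
        # Non-autoregressive step: the model fills every masked slot in parallel.
        preds = fill_masks(tokens, slots)  # assumed to return {pos: (subword, confidence)}
        for pos, (subword, _) in preds.items():
            tokens[pos] = subword
        if step < iterations - 1:
            # Re-mask the least confident half of the slots and refine them next pass.
            worst = sorted(preds, key=lambda p: preds[p][1])[: max(1, len(slots) // 2)]
            for pos in worst:
                tokens[pos] = MASK
    return tokens
```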
Related papers
- An Energy-based Model for Word-level AutoCompletion in Computer-aided Translation [97.3797716862478]
Word-level AutoCompletion (WLAC) is a rewarding yet challenging task in Computer-aided Translation.
Existing work addresses this task through a classification model based on a neural network that maps the hidden vector of the input context into its corresponding label.
This work proposes an energy-based model for WLAC, which enables the context hidden vector to capture crucial information from the source sentence (a hedged sketch contrasting this with plain classification follows the list below).
arXiv Detail & Related papers (2024-07-29T15:07:19Z) - Enhancing Medical Specialty Assignment to Patients using NLP Techniques [0.0]
We propose an alternative approach that achieves superior performance while being computationally efficient.
Specifically, we utilize keywords to train a deep learning architecture that outperforms a language model pretrained on a large corpus of text.
Our results demonstrate that utilizing keywords for text classification significantly improves classification performance.
arXiv Detail & Related papers (2023-12-09T14:13:45Z) - Instruction Position Matters in Sequence Generation with Large Language
Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences.
arXiv Detail & Related papers (2023-08-23T12:36:57Z) - CompoundPiece: Evaluating and Improving Decompounding Performance of
Language Models [77.45934004406283]
We systematically study decompounding, the task of splitting compound words into their constituents.
We introduce a dataset of 255k compound and non-compound words across 56 diverse languages obtained from Wiktionary.
We introduce a novel methodology to train dedicated models for decompounding.
arXiv Detail & Related papers (2023-05-23T16:32:27Z) - Towards Computationally Verifiable Semantic Grounding for Language
Models [18.887697890538455]
The paper conceptualizes the LM as a conditional model generating text given a desired semantic message formalized as a set of entity-relationship triples.
It embeds the LM in an auto-encoder by feeding its output to a semantic parser whose output is in the same representation domain as the input message.
We show that our proposed approaches significantly improve on the greedy search baseline.
arXiv Detail & Related papers (2022-11-16T17:35:52Z) - How does a Pre-Trained Transformer Integrate Contextual Keywords?
Application to Humanitarian Computing [0.0]
This paper describes how to improve a humanitarian classification task by adding the crisis event type to each tweet to be classified.
It shows how the proposed neural network approach is partially over-fitting the particularities of the Crisis Benchmark.
arXiv Detail & Related papers (2021-11-07T11:24:08Z) - Dict-BERT: Enhancing Language Model Pre-training with Dictionary [42.0998323292348]
Pre-trained language models (PLMs) aim to learn universal language representations by conducting self-supervised training tasks on large-scale corpora.
In this work, we focus on enhancing language model pre-training by leveraging definitions of rare words in dictionaries.
We propose two novel self-supervised pre-training tasks on word and sentence-level alignment between input text sequence and rare word definitions.
arXiv Detail & Related papers (2021-10-13T04:29:14Z) - Template Controllable keywords-to-text Generation [16.255080737147384]
The model takes as input a set of unordered keywords and part-of-speech (POS) based template instructions.
The framework is based on the encode-attend-decode paradigm, where keywords and templates are encoded first, and the decoder judiciously attends over the contexts derived from the encoded keywords and templates to generate the sentences.
arXiv Detail & Related papers (2020-11-07T08:05:58Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z) - Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
arXiv Detail & Related papers (2020-10-12T19:42:25Z) - Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed later, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z)
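To connect the first related paper above (the energy-based WLAC model) with the standard classification formulation it contrasts with, here is a small sketch. The interfaces are assumptions for illustration only (a pre-computed context vector, a given candidate list, and a simple MLP energy function), not the cited paper's architecture: the classifier maps the context vector straight to a vocabulary label, while the energy-based variant scores each candidate word jointly with the context and returns the lowest-energy one.

```python
import torch

# Hedged sketch: classification head vs. energy-based scoring for WLAC.
# output_proj, word_embed, and energy_mlp are illustrative modules supplied
# by the caller; their shapes and training are not specified here.

def classify_word(context_vec: torch.Tensor, output_proj: torch.nn.Linear) -> int:
    """Baseline formulation: map the context hidden vector straight to a word label."""
    logits = output_proj(context_vec)  # shape [vocab_size]
    return int(logits.argmax())


def energy_rerank(context_vec: torch.Tensor, candidate_ids: list[int],
                  word_embed: torch.nn.Embedding, energy_mlp: torch.nn.Module) -> int:
    """Energy-based formulation: score each candidate jointly with the context
    and return the candidate with the lowest energy E(context, word)."""
    cand_vecs = word_embed(torch.tensor(candidate_ids))                      # [K, d]
    ctx = context_vec.unsqueeze(0).expand(len(candidate_ids), -1)            # [K, d]
    energies = energy_mlp(torch.cat([ctx, cand_vecs], dim=-1)).squeeze(-1)   # [K]
    return candidate_ids[int(energies.argmin())]
```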