Lexicon-constrained Copying Network for Chinese Abstractive
Summarization
- URL: http://arxiv.org/abs/2010.08197v2
- Date: Tue, 21 Dec 2021 16:13:22 GMT
- Title: Lexicon-constrained Copying Network for Chinese Abstractive
Summarization
- Authors: Boyan Wan, Mishal Sohail
- Abstract summary: Copy mechanism allows sequence-to-sequence models to choose words from the input and put them directly into the output.
Most existing models for Chinese abstractive summarization can only perform character copy.
We propose a lexicon-constrained copying network that models multi-granularity in both memory and decoder.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Copy mechanism allows sequence-to-sequence models to choose words from the
input and put them directly into the output, which is finding increasing use in
abstractive summarization. However, since there is no explicit delimiter in
Chinese sentences, most existing models for Chinese abstractive summarization
can only perform character copy, which is inefficient. To solve this
problem, we propose a lexicon-constrained copying network that models
multi-granularity in both encoder and decoder. On the source side, words and
characters are aggregated into the same input memory using a Transformer-based
encoder. On the target side, the decoder can copy either a character or a
multi-character word at each time step, and the decoding process is guided by a
word-enhanced search algorithm that facilitates the parallel computation and
encourages the model to copy more words. Moreover, we adopt a word selector to
integrate keyword information. Experimental results on a Chinese social media
dataset show that our model can work standalone or with the word selector. Both
forms outperform previous character-based models and achieve competitive
performance.
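
As a rough illustration of the copy mechanism described in the abstract, the sketch below mixes a generation distribution with a copy distribution computed over a memory that holds both characters and multi-character words, so a whole word can be emitted in one step. This is a minimal, framework-free sketch of the general pointer/copy idea, not the paper's implementation; the `copy_step` function, the `copy_gate` scalar, and the toy inputs are illustrative assumptions.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def copy_step(attn_scores, memory_units, vocab_probs, copy_gate):
    """One hypothetical decoding step that mixes generation and copying.

    attn_scores  : attention logits, one per source unit (characters AND words)
    memory_units : the source-side units, e.g. ["我", "喜欢", "机", "器", "机器"]
    vocab_probs  : token -> probability from the output-vocabulary softmax
    copy_gate    : scalar in [0, 1]; probability mass assigned to generation
    """
    copy_probs = softmax(attn_scores)
    mixed = {tok: copy_gate * p for tok, p in vocab_probs.items()}
    for unit, p in zip(memory_units, copy_probs):
        # A unit may be a single character or a multi-character word;
        # both compete inside the same output distribution.
        mixed[unit] = mixed.get(unit, 0.0) + (1.0 - copy_gate) * p
    best = max(mixed, key=mixed.get)
    return best, mixed

best, dist = copy_step(
    attn_scores=[0.2, 1.5, 0.1, 0.1, 2.0],
    memory_units=["我", "喜欢", "机", "器", "机器"],
    vocab_probs={"的": 0.4, "摘要": 0.6},
    copy_gate=0.3,
)
print(best)  # with these toy numbers the word "机器" is copied as one unit
```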
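
The word-enhanced search is only characterized at a high level in the abstract (it facilitates parallel computation and encourages word copies). As a hypothetical approximation, the greedy loop below gives multi-character candidates a small score bonus so the decoder prefers emitting a whole word over spelling it out character by character; `word_enhanced_greedy`, `word_bonus`, and the toy scorer are assumptions for illustration, not the paper's algorithm.

```python
def word_enhanced_greedy(score_fn, source_units, max_steps=20, word_bonus=0.1):
    """Greedily build a summary from source units (characters and words mixed)."""
    output = []
    for _ in range(max_steps):
        scored = []
        for unit in source_units + ["<eos>"]:
            score = score_fn(output, unit)
            if unit != "<eos>" and len(unit) > 1:
                score += word_bonus  # nudge the search toward copying whole words
            scored.append((score, unit))
        best_score, best_unit = max(scored)
        if best_unit == "<eos>":
            break
        output.append(best_unit)  # a word copy advances several characters in one step
    return "".join(output)

# Toy scorer: favour units not emitted yet; allow stopping after two units.
def toy_score(prefix, unit):
    if unit == "<eos>":
        return 1.0 if len(prefix) >= 2 else -1.0
    return 0.5 if unit not in prefix else -0.5

print(word_enhanced_greedy(toy_score, ["我", "喜欢", "机", "器", "机器"]))
```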
Related papers
- Adapting Dual-encoder Vision-language Models for Paraphrased Retrieval [55.90407811819347]
We consider the task of paraphrased text-to-image retrieval where a model aims to return similar results given a pair of paraphrased queries.
We train a dual-encoder model starting from a language model pretrained on a large text corpus.
Compared to public dual-encoder models such as CLIP and OpenCLIP, the model trained with our best adaptation strategy achieves a significantly higher ranking similarity for paraphrased queries.
arXiv Detail & Related papers (2024-05-06T06:30:17Z)
- Object Recognition as Next Token Prediction [99.40793702627396]
We present an approach to pose object recognition as next token prediction.
The idea is to apply a language decoder that auto-regressively predicts the text tokens from image embeddings to form labels.
arXiv Detail & Related papers (2023-12-04T18:58:40Z)
- Copy Is All You Need [66.00852205068327]
We formulate text generation as progressively copying text segments from an existing text collection.
Our approach achieves better generation quality according to both automatic and human evaluations.
Our approach attains additional performance gains by simply scaling up to larger text collections.
arXiv Detail & Related papers (2023-07-13T05:03:26Z)
- Attributable and Scalable Opinion Summarization [79.87892048285819]
We generate abstractive summaries by decoding frequent encodings, and extractive summaries by selecting the sentences assigned to the same frequent encodings.
Our method is attributable, because the model identifies sentences used to generate the summary as part of the summarization process.
It scales easily to many hundreds of input reviews, because aggregation is performed in the latent space rather than over long sequences of tokens.
arXiv Detail & Related papers (2023-05-19T11:30:37Z)
- Inflected Forms Are Redundant in Question Generation Models [27.49894653349779]
We propose an approach to enhance the performance of Question Generation using an encoder-decoder framework.
Firstly, we identify the inflected forms of words in the encoder input and replace them with their root words.
Secondly, we adapt QG as a combination of the following actions in the encoder-decoder framework: generating a question word, copying a word from the source sequence, or generating a word transformation type.
arXiv Detail & Related papers (2023-01-01T13:08:11Z)
- CopyNext: Explicit Span Copying and Alignment in Sequence to Sequence Models [31.832217465573503]
We present a model with an explicit token-level copy operation and extend it to copying entire spans.
Our model provides hard alignments between spans in the input and output, allowing for nontraditional applications of seq2seq, like information extraction.
arXiv Detail & Related papers (2020-10-28T22:45:16Z)
- 2kenize: Tying Subword Sequences for Chinese Script Conversion [54.33749520569979]
We propose a model that can disambiguate between mappings and convert between the two scripts.
Our proposed method outperforms previous Chinese character conversion approaches by 6 points in accuracy.
arXiv Detail & Related papers (2020-05-07T10:53:05Z)
- Neural Syntactic Preordering for Controlled Paraphrase Generation [57.5316011554622]
Our work uses syntactic transformations to softly "reorder" the source sentence and guide our neural paraphrasing model.
First, given an input sentence, we derive a set of feasible syntactic rearrangements using an encoder-decoder model.
Next, we use each proposed rearrangement to produce a sequence of position embeddings, which encourages our final encoder-decoder paraphrase model to attend to the source words in a particular order.
arXiv Detail & Related papers (2020-05-05T09:02:25Z)