Enriching Transformers with Structured Tensor-Product Representations
for Abstractive Summarization
- URL: http://arxiv.org/abs/2106.01317v1
- Date: Wed, 2 Jun 2021 17:32:33 GMT
- Title: Enriching Transformers with Structured Tensor-Product Representations
for Abstractive Summarization
- Authors: Yichen Jiang, Asli Celikyilmaz, Paul Smolensky, Paul Soulos, Sudha
Rao, Hamid Palangi, Roland Fernandez, Caitlin Smith, Mohit Bansal, Jianfeng
Gao
- Abstract summary: We adapt TP-TRANSFORMER, which enriches the original Transformer with the explicitly compositional Tensor-Product Representation (TPR), for the task of abstractive summarization.
A key feature of our model is a structural bias that we introduce by encoding two separate representations for each token.
We show that our TP-TRANSFORMER outperforms the Transformer and the original TP-TRANSFORMER significantly on several abstractive summarization datasets.
- Score: 131.23966358405767
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abstractive summarization, the task of generating a concise summary of input
documents, requires: (1) reasoning over the source document to determine the
salient pieces of information scattered across the long document, and (2)
composing a cohesive text by reconstructing these salient facts into a shorter
summary that faithfully reflects the complex relations connecting these facts.
In this paper, we adapt TP-TRANSFORMER (Schlag et al., 2019), an architecture
that enriches the original Transformer (Vaswani et al., 2017) with the
explicitly compositional Tensor Product Representation (TPR), for the task of
abstractive summarization. The key feature of our model is a structural bias
that we introduce by encoding two separate representations for each token to
represent the syntactic structure (with role vectors) and semantic content
(with filler vectors) separately. The model then binds the role and filler
vectors into the TPR as the layer output. We argue that the structured
intermediate representations enable the model to take better control of the
contents (salient facts) and structures (the syntax that connects the facts)
when generating the summary. Empirically, we show that our TP-TRANSFORMER
outperforms the Transformer and the original TP-TRANSFORMER significantly on
several abstractive summarization datasets based on both automatic and human
evaluations. On several syntactic and semantic probing tasks, we demonstrate
the emergent structural information in the role vectors and improved syntactic
interpretability in the TPR layer outputs. Code and models are available at
https://github.com/jiangycTarheel/TPT-Summ.
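
As a rough sketch of the role-filler binding described above (not the authors' released implementation), the snippet below shows a single attention head that binds an attention-derived filler vector with a learned per-token role vector via an element-wise product, a commonly used diagonal approximation of the full tensor product. All class names, dimensions, and the sigmoid on the role vector are illustrative assumptions.

import torch
import torch.nn as nn

class TPRAttentionHead(nn.Module):
    def __init__(self, d_model: int, d_head: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_head)
        self.k_proj = nn.Linear(d_model, d_head)
        self.v_proj = nn.Linear(d_model, d_head)   # fillers: semantic content
        self.r_proj = nn.Linear(d_model, d_head)   # roles: structural information
        self.scale = d_head ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        filler = attn @ v                      # per-token filler gathered by attention
        role = torch.sigmoid(self.r_proj(x))   # per-token role vector
        return filler * role                   # element-wise (Hadamard) binding

head = TPRAttentionHead(d_model=64, d_head=16)
out = head(torch.randn(2, 10, 64))             # -> shape (2, 10, 16)

In a full model, the bound outputs of all heads would typically be concatenated and projected, as in standard multi-head attention.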
Related papers
- ReSel: N-ary Relation Extraction from Scientific Text and Tables by Learning to Retrieve and Select [53.071352033539526]
We study the problem of extracting N-ary relations from scientific articles.
Our proposed method ReSel decomposes this task into a two-stage procedure.
Our experiments on three scientific information extraction datasets show that ReSel outperforms state-of-the-art baselines significantly.
arXiv Detail & Related papers (2022-10-26T02:28:02Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects.
Our framework assumes a hierarchical latent structure of a document where the top-level captures the long range dependency.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- Source Code Summarization with Structural Relative Position Guided Transformer [19.828300746504148]
Source code summarization aims at generating concise and clear natural language descriptions for programming languages.
Recent efforts focus on incorporating the syntax structure of code into neural networks such as Transformer.
We propose a Structural Relative Position guided Transformer, named SCRIPT.
arXiv Detail & Related papers (2022-02-14T07:34:33Z)
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attentions for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in ROUGE F1.
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
- Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single or multi-masking strategies to either iteratively or auto-regressively replace entities in order to ensure semantic consistency w.r.t. the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
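To make the iterative single-masking strategy concrete, here is a small illustrative sketch (not Span-Fact's actual code): one entity is masked at a time and a QA-style span selector, represented by the hypothetical callable select_span_from_source, chooses a replacement span from the source document.

from typing import Callable, List

def correct_entities(
    summary_tokens: List[str],
    entity_positions: List[int],
    source_text: str,
    select_span_from_source: Callable[[List[str], str], str],
) -> List[str]:
    corrected = list(summary_tokens)
    for pos in entity_positions:
        masked = list(corrected)
        masked[pos] = "[MASK]"                      # single-masking: one entity per step
        replacement = select_span_from_source(masked, source_text)
        corrected[pos] = replacement                # substitute before moving to the next entity
    return corrected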
arXiv Detail & Related papers (2020-10-06T02:51:02Z)
- HittER: Hierarchical Transformers for Knowledge Graph Embeddings [85.93509934018499]
We propose HittER to learn representations of entities and relations in a complex knowledge graph.
Experimental results show that HittER achieves new state-of-the-art results on multiple link prediction datasets.
We additionally propose a simple approach to integrate HittER into BERT and demonstrate its effectiveness on two Freebase factoid question answering datasets.
arXiv Detail & Related papers (2020-08-28T18:58:15Z)
- Do Syntax Trees Help Pre-trained Transformers Extract Information? [8.133145094593502]
We study the utility of incorporating dependency trees into pre-trained transformers on information extraction tasks.
We propose and investigate two distinct strategies for incorporating dependency structure.
We find that their performance gains are highly contingent on the availability of human-annotated dependency parses.
arXiv Detail & Related papers (2020-08-20T17:17:38Z)
- StructSum: Summarization via Structured Representations [27.890477913486787]
Abstractive text summarization aims at compressing the information of a long source document into a condensed summary.
Despite advances in modeling techniques, abstractive summarization models still suffer from several key challenges.
We propose a framework based on document-level structure induction for summarization to address these challenges.
arXiv Detail & Related papers (2020-03-01T20:32:51Z)