Robust (Controlled) Table-to-Text Generation with Structure-Aware
Equivariance Learning
- URL: http://arxiv.org/abs/2205.03972v1
- Date: Sun, 8 May 2022 23:37:27 GMT
- Title: Robust (Controlled) Table-to-Text Generation with Structure-Aware
Equivariance Learning
- Authors: Fei Wang, Zhewei Xu, Pedro Szekely and Muhao Chen
- Abstract summary: Controlled table-to-text generation seeks to generate natural language descriptions for highlighted subparts of a table.
We propose an equivariance learning framework, which encodes tables with a structure-aware self-attention mechanism.
Our technology is free to be plugged into existing table-to-text generation models, and has improved T5-based models to offer better performance on ToTTo and HiTab.
- Score: 24.233552674892906
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Controlled table-to-text generation seeks to generate natural language
descriptions for highlighted subparts of a table. Previous SOTA systems still
employ a sequence-to-sequence generation method, which merely captures the
table as a linear structure and is brittle when table layouts change. We seek
to go beyond this paradigm by (1) effectively expressing the relations of
content pieces in the table, and (2) making our model robust to
content-invariant structural transformations. Accordingly, we propose an
equivariance learning framework, which encodes tables with a structure-aware
self-attention mechanism. This prunes the full self-attention structure into an
order-invariant graph attention that captures the connected graph structure of
cells belonging to the same row or column, and it differentiates between
relevant cells and irrelevant cells from the structural perspective. Our
framework also modifies the positional encoding mechanism to preserve the
relative position of tokens in the same cell but enforce position invariance
among different cells. Our technology is free to be plugged into existing
table-to-text generation models, and has improved T5-based models to offer
better performance on ToTTo and HiTab. Moreover, on a harder version of ToTTo,
we preserve promising performance, while previous SOTA systems, even with
transformation-based data augmentation, have seen significant performance
drops. Our code is available at https://github.com/luka-group/Lattice.
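The two mechanisms described in the abstract can be illustrated with a small sketch. This is not the authors' Lattice implementation (see the repository above for that); it is a minimal pure-Python illustration under stated assumptions. The `Cell` tuple layout, the `CROSS_CELL` marker, and the helper names `token_owners`, `structure_mask`, `relative_positions`, and `by_content` are all hypothetical. The sketch builds an attention mask that only connects tokens whose cells share a row or column, and a relative-position table that keeps token offsets inside a cell while assigning one shared value to every pair of tokens from different cells.

```python
from typing import List, Tuple

# Hypothetical cell representation: (row index, column index, tokens in the cell).
Cell = Tuple[int, int, List[str]]

# Assumed marker meaning "no ordering is defined between different cells".
CROSS_CELL = None


def token_owners(cells: List[Cell]) -> List[Tuple[int, int]]:
    """Flatten the table: one (cell index, offset inside the cell) pair per token."""
    return [(k, t) for k, (_, _, toks) in enumerate(cells) for t in range(len(toks))]


def structure_mask(cells: List[Cell]) -> List[List[bool]]:
    """mask[i][j] is True when token i may attend to token j, i.e. when
    their cells share a row or a column (a cell always sees itself)."""
    owners = token_owners(cells)
    mask = []
    for ci, _ in owners:
        row_i, col_i, _ = cells[ci]
        mask.append([row_i == cells[cj][0] or col_i == cells[cj][1]
                     for cj, _ in owners])
    return mask


def relative_positions(cells: List[Cell]) -> list:
    """rel[i][j] is the signed token offset inside a shared cell, and
    CROSS_CELL for tokens that belong to different cells."""
    owners = token_owners(cells)
    return [[(tj - ti) if ci == cj else CROSS_CELL for cj, tj in owners]
            for ci, ti in owners]


def by_content(cells: List[Cell], matrix: list) -> dict:
    """Re-key a token-by-token matrix by (cell content, offset) so that two
    layouts of the same table can be compared. Assumes cell contents are unique."""
    ident = [(tuple(cells[k][2]), t) for k, t in token_owners(cells)]
    return {(a, b): matrix[i][j]
            for i, a in enumerate(ident) for j, b in enumerate(ident)}


# Toy 2x2 table, linearized row by row.
cells = [(0, 0, ["Year"]), (0, 1, ["Team", "name"]),
         (1, 0, ["2019"]), (1, 1, ["Lakers"])]

# Same content with the two rows swapped and the linearization order reversed.
swapped = [(1 - r, c, toks) for r, c, toks in reversed(cells)]

# Both structures are unchanged by this content-invariant transformation.
assert by_content(cells, structure_mask(cells)) == by_content(swapped, structure_mask(swapped))
assert by_content(cells, relative_positions(cells)) == by_content(swapped, relative_positions(swapped))
```

In a Transformer encoder, a mask like this would typically be applied by blocking the disallowed attention scores before the softmax, and `CROSS_CELL` would map to a single shared relative-position bucket; both of those choices are assumptions of this sketch rather than details stated in the abstract.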
Related papers
- Unifying Structured Data as Graph for Data-to-Text Pre-Training [69.96195162337793]
Data-to-text (D2T) generation aims to transform structured data into natural language text.
Data-to-text pre-training has proved to be powerful in enhancing D2T generation.
We propose a structure-enhanced pre-training method for D2T generation by designing a structure-enhanced Transformer.
arXiv Detail & Related papers (2024-01-02T12:23:49Z)
- SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge).
We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution.
To comprehensively evaluate SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z)
- TRUST: An Accurate and End-to-End Table structure Recognizer Using Splitting-based Transformers [56.56591337457137]
We propose an accurate and end-to-end transformer-based table structure recognition method, referred to as TRUST.
Transformers are suitable for table structure recognition because of their global computations, perfect memory, and parallel computation.
We conduct experiments on several popular benchmarks, including PubTabNet and SynthTable, where our method achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-08-31T08:33:36Z)
- TableFormer: Robust Transformer Modeling for Table-Text Encoding [18.00127368618485]
Existing models for table understanding require linearization of the table structure, where row or column order is encoded as an unwanted bias.
In this work, we propose a robust and structurally aware table-text encoding architecture TableFormer.
arXiv Detail & Related papers (2022-03-01T07:23:06Z)
- Split, embed and merge: An accurate table structure recognizer [42.579215135672094]
We introduce Split, Embed and Merge (SEM) as an accurate table structure recognizer.
SEM can achieve an average F-Measure of 96.9% on the SciTSR dataset.
arXiv Detail & Related papers (2021-07-12T06:26:19Z)
- TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition.
Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z)
- Retrieving Complex Tables with Multi-Granular Graph Representation Learning [20.72341939868327]
The task of natural language table retrieval seeks to retrieve semantically relevant tables based on natural language queries.
Existing learning systems treat tables as plain text based on the assumption that tables are structured as dataframes.
We propose Graph-based Table Retrieval (GTR), a generalizable NLTR framework with multi-granular graph representation learning.
arXiv Detail & Related papers (2021-05-04T20:19:03Z)
- TCN: Table Convolutional Network for Web Table Interpretation [52.32515851633981]
We propose a novel table representation learning approach considering both the intra- and inter-table contextual information.
Our method can outperform competitive baselines by +4.8% of F1 for column type prediction and by +4.1% of F1 for column pairwise relation prediction.
arXiv Detail & Related papers (2021-02-17T02:18:10Z)
- Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints [63.84063384518667]
We propose a novel Transformer-based generation framework to achieve faithful table-to-text generation.
Core techniques in our method to enforce faithfulness include a new table-text optimal-transport matching loss.
To evaluate faithfulness, we propose a new automatic metric specialized to the table-to-text generation problem.
arXiv Detail & Related papers (2020-05-03T02:54:26Z)
- Identifying Table Structure in Documents using Conditional Generative Adversarial Networks [0.0]
In many industries and in academic research, information is primarily transmitted in the form of unstructured documents.
We propose a top-down approach, first using a conditional generative adversarial network to map a table image into a standardised 'skeleton' table form.
We then derive the latent table structure using xy-cut projection and Genetic Algorithm optimisation.
arXiv Detail & Related papers (2020-01-13T20:42:40Z)