TCN: Table Convolutional Network for Web Table Interpretation
- URL: http://arxiv.org/abs/2102.09460v1
- Date: Wed, 17 Feb 2021 02:18:10 GMT
- Authors: Daheng Wang, Prashant Shiralkar, Colin Lockard, Binxuan Huang, Xin
Luna Dong, Meng Jiang
- Abstract summary: We propose a novel table representation learning approach considering both the intra- and inter-table contextual information.
Our method can outperform competitive baselines by +4.8% of F1 for column type prediction and by +4.1% of F1 for column pairwise relation prediction.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Information extraction from semi-structured webpages provides valuable
long-tailed facts for augmenting knowledge graphs. Relational Web tables are a
critical component containing additional entities and attributes of rich and
diverse knowledge. However, extracting knowledge from relational tables is
challenging because of sparse contextual information. Existing work linearizes
table cells and relies heavily on modifying deep language models such as BERT,
which only capture information from related cells within the same table. In this work,
we propose a novel relational table representation learning approach
considering both the intra- and inter-table contextual information. On one
hand, the proposed Table Convolutional Network model employs the attention
mechanism to adaptively focus on the most informative intra-table cells of the
same row or column; and, on the other hand, it aggregates inter-table
contextual information from various types of implicit connections between cells
across different tables. Specifically, we propose three novel aggregation
modules for (i) cells of the same value, (ii) cells of the same schema
position, and (iii) cells linked to the same page topic. We further devise a
supervised multi-task training objective for jointly predicting column type and
pairwise column relation, as well as a table cell recovery objective for
pre-training. Experiments on real Web table datasets demonstrate our method can
outperform competitive baselines by +4.8% of F1 for column type prediction and
by +4.1% of F1 for pairwise column relation prediction.
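The intra-table attention described in the abstract can be sketched roughly as follows. This is an illustrative simplification, not the paper's exact formulation: the function names, the scaled dot-product scoring, and the concatenation of row and column summaries are assumptions made for the sketch.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attend(query, context):
    """Scaled dot-product attention of one cell over its context cells.

    query:   (d,)   embedding of the target cell
    context: (n, d) embeddings of intra-table context (same row or column)
    Returns a (d,) attention-weighted summary of the context.
    """
    d = query.shape[0]
    scores = context @ query / np.sqrt(d)   # (n,) similarity scores
    weights = softmax(scores)               # (n,) attention weights
    return weights @ context                # (d,) weighted context vector

def cell_representation(cell, row_ctx, col_ctx):
    # Combine the cell's own embedding with attention-pooled row and
    # column context, adaptively focusing on informative cells.
    return np.concatenate([cell, attend(cell, row_ctx), attend(cell, col_ctx)])

rng = np.random.default_rng(0)
d = 8
cell = rng.normal(size=d)
row_ctx = rng.normal(size=(4, d))   # other cells in the same row
col_ctx = rng.normal(size=(5, d))   # other cells in the same column
rep = cell_representation(cell, row_ctx, col_ctx)
print(rep.shape)  # (24,)
```

In the full model, the inter-table aggregation modules would pool additional context from cells sharing a value, schema position, or page topic before the multi-task prediction heads.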
Related papers
- Multi-Cell Decoder and Mutual Learning for Table Structure and Character Recognition
We propose a multi-cell content decoder and bidirectional mutual learning mechanism to improve the end-to-end approach.
The effectiveness is demonstrated on two large datasets, and the experimental results show comparable performance to state-of-the-art models.
arXiv Detail & Related papers (2024-04-20T04:30:38Z)
- TRACE: Table Reconstruction Aligned to Corner and Edges
We analyze the natural characteristics of a table, where a table is composed of cells and each cell is made up of borders consisting of edges.
We propose a novel method to reconstruct the table in a bottom-up manner.
A simple design makes the model easier to train and requires less computation than previous two-stage methods.
arXiv Detail & Related papers (2023-05-01T02:26:15Z)
- TRUST: An Accurate and End-to-End Table structure Recognizer Using Splitting-based Transformers
We propose an accurate and end-to-end transformer-based table structure recognition method, referred to as TRUST.
Transformers are suitable for table structure recognition because of their global computations, perfect memory, and parallel computation.
We conduct experiments on several popular benchmarks including PubTabNet and SynthTable; our method achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-08-31T08:33:36Z)
- Table Retrieval May Not Necessitate Table-specific Model Design
We focus on the task of table retrieval, and ask: "is table-specific model design necessary for table retrieval?"
Based on an analysis on a table-based portion of the Natural Questions dataset (NQ-table), we find that structure plays a negligible role in more than 70% of the cases.
We then experiment with three modules to explicitly encode table structures, namely auxiliary row/column embeddings, hard attention masks, and soft relation-based attention biases.
None of these yielded significant improvements, suggesting that table-specific model design may not be necessary for table retrieval.
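The second of the three modules above, a hard attention mask restricting each cell to its own row and column, can be sketched as follows. This is a minimal illustration; the function name and the row-major flattening order are assumptions, not details from the paper.

```python
import numpy as np

def table_attention_mask(n_rows, n_cols):
    """Boolean mask over row-major flattened cells: cell i may attend to
    cell j only if they share a row or a column (a hard attention mask)."""
    n = n_rows * n_cols
    rows = np.arange(n) // n_cols   # row index of each flattened cell
    cols = np.arange(n) % n_cols    # column index of each flattened cell
    same_row = rows[:, None] == rows[None, :]
    same_col = cols[:, None] == cols[None, :]
    return same_row | same_col

mask = table_attention_mask(2, 3)
print(mask.shape)           # (6, 6)
print(mask[0].astype(int))  # cell (0,0) attends to row 0 and column 0: [1 1 1 1 0 0]
```

Such a mask would typically be applied to a transformer's attention logits, setting disallowed positions to negative infinity before the softmax.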
arXiv Detail & Related papers (2022-05-19T20:35:23Z)
- TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition.
Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z)
- TabularNet: A Neural Network Architecture for Understanding Semantic Structures of Tabular Data
We propose a novel neural network architecture, TabularNet, to simultaneously extract spatial and relational information from tables.
For relational information, we design a new graph construction method based on the WordNet tree and adopt a Graph Convolutional Network (GCN) based encoder.
Our neural network architecture can be a unified neural backbone for different understanding tasks and utilized in a multitask scenario.
arXiv Detail & Related papers (2021-06-06T11:48:09Z)
- TABBIE: Pretrained Representations of Tabular Data
We devise a simple pretraining objective that learns exclusively from tabular data.
Unlike competing approaches, our model (TABBIE) provides embeddings of all table substructures.
A qualitative analysis of our model's learned cell, column, and row representations shows that it understands complex table semantics and numerical trends.
arXiv Detail & Related papers (2021-05-06T11:15:16Z)
- A Graph Representation of Semi-structured Data for Web Question Answering
We propose a novel graph representation of Web tables and lists based on a systematic categorization of the components in semi-structured data as well as their relations.
Our method improves F1 score by 3.90 points over the state-of-the-art baselines.
arXiv Detail & Related papers (2020-10-14T04:01:54Z)
- GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.