Sattiy at SemEval-2021 Task 9: An Ensemble Solution for Statement
Verification and Evidence Finding with Tables
- URL: http://arxiv.org/abs/2104.10366v1
- Date: Wed, 21 Apr 2021 06:11:49 GMT
- Title: Sattiy at SemEval-2021 Task 9: An Ensemble Solution for Statement
Verification and Evidence Finding with Tables
- Authors: Xiaoyi Ruan, Meizhi Jin, Jian Ma, Haiqin Yang, Lianxin Jiang, Yang Mo,
Mengyuan Zhou
- Abstract summary: This paper describes the sattiy team's system for SemEval-2021 Task 9: Statement Verification and Evidence Finding with Tables (SEM-TAB-FACT).
The competition aims to verify statements against tables from scientific articles and to find the supporting evidence.
The system ensembles pre-trained language models over tables, TaPas and TaBERT, for Task A, and adjusts the results for Task B using a set of extracted rules.
- Score: 4.691435917434472
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Question answering from semi-structured tables can be seen as a semantic
parsing task and is significant and practical for pushing the boundary of
natural language understanding. Existing research mainly focuses on
understanding contents from unstructured evidence, e.g., news, natural language
sentences, and documents. The task of verification from structured evidence,
such as tables, charts, and databases, is still less explored. This paper
describes the sattiy team's system for SemEval-2021 Task 9: Statement Verification
and Evidence Finding with Tables (SEM-TAB-FACT). This competition aims to
verify statements against tables from scientific articles, to find the
supporting evidence, and to promote proper interpretation of the surrounding
article. We exploit ensembles of pre-trained language models over tables,
TaPas and TaBERT, for Task A, and adjust the results for Task B based on a set
of extracted rules. On the final leaderboard, we attain F1 scores of 0.8496
and 0.7732 in Task A for the 2-way and 3-way evaluations, respectively, and an
F1 score of 0.4856 in Task B.
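The abstract describes the Task A ensemble only at a high level. As a minimal sketch of the general idea, soft-voting over table-entailment models, the snippet below averages class probabilities from two public TAPAS checkpoints fine-tuned on TabFact. The checkpoints, the toy table, and the averaging scheme are illustrative assumptions, not the authors' exact configuration.

```python
import pandas as pd
import torch
from transformers import TapasForSequenceClassification, TapasTokenizer

# Illustrative stand-ins for the paper's ensemble members; both are
# public TAPAS models fine-tuned on the TabFact entailment task.
CHECKPOINTS = [
    "google/tapas-base-finetuned-tabfact",
    "google/tapas-large-finetuned-tabfact",
]

# A toy table; TAPAS expects every cell value as a string.
table = pd.DataFrame({"Year": ["2019", "2020"], "Accuracy": ["71.2", "74.5"]})
statement = "Accuracy improved from 2019 to 2020."

probs = []
for name in CHECKPOINTS:
    tokenizer = TapasTokenizer.from_pretrained(name)
    model = TapasForSequenceClassification.from_pretrained(name)
    inputs = tokenizer(table=table, queries=[statement], return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, 2)
    probs.append(torch.softmax(logits, dim=-1))

# Soft voting: average the class probabilities across ensemble members,
# then take the argmax; the label names come from the model config.
avg_probs = torch.stack(probs).mean(dim=0)
label = model.config.id2label[avg_probs.argmax(dim=-1).item()]
print(label, avg_probs.tolist())
```

Under this scheme each member contributes a full probability distribution over {refuted, entailed}; weighted averaging or hard majority voting are common variants of the same idea.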
Related papers
- TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy [81.76462101465354]
We present a novel large vision-language model, TabPedia, equipped with a concept synergy mechanism.
This unified framework allows TabPedia to seamlessly integrate VTU tasks, such as table detection, table structure recognition, table querying, and table question answering.
To better evaluate the VTU task in real-world scenarios, we establish a new and comprehensive table VQA benchmark, ComTQA.
arXiv Detail & Related papers (2024-06-03T13:54:05Z) - Bridging the Gap: Deciphering Tabular Data Using Large Language Model [4.711941969101732]
This research marks the first application of large language models to table-based question answering tasks.
We design a dedicated module that serializes tables for seamless integration with large language models.
arXiv Detail & Related papers (2023-08-23T03:38:21Z) - QTSumm: Query-Focused Summarization over Tabular Data [58.62152746690958]
People primarily consult tables to conduct data analysis or answer specific questions.
We define a new query-focused table summarization task, where text generation models have to perform human-like reasoning.
We introduce a new benchmark named QTSumm for this task, which contains 7,111 human-annotated query-summary pairs over 2,934 tables.
arXiv Detail & Related papers (2023-05-23T17:43:51Z) - Doc2SoarGraph: Discrete Reasoning over Visually-Rich Table-Text
Documents via Semantic-Oriented Hierarchical Graphs [79.0426838808629]
We address TAT-DQA, i.e., answering questions over a visually-rich table-text document.
Specifically, we propose a novel Doc2SoarGraph framework with enhanced discrete reasoning capability.
We conduct extensive experiments on the TAT-DQA dataset, and the results show that our proposed framework outperforms the best baseline model by 17.73% and 16.91% in terms of Exact Match (EM) and F1 score, respectively, on the test set.
arXiv Detail & Related papers (2023-05-03T07:30:32Z) - SemEval-2022 Task 2: Multilingual Idiomaticity Detection and Sentence
Embedding [12.843166994677286]
This paper presents the shared task on Multilingual Idiomaticity Detection and Sentence Embedding.
It consists of two subtasks: (a) a binary classification one aimed at identifying whether a sentence contains an idiomatic expression, and (b) a task based on semantic text similarity which requires the model to adequately represent potentially idiomatic expressions in context.
The task had close to 100 registered participants organised into twenty-five teams, making over 650 and 150 submissions in the practice and evaluation phases, respectively.
arXiv Detail & Related papers (2022-04-21T12:20:52Z) - Volta at SemEval-2021 Task 9: Statement Verification and Evidence
Finding with Tables using TAPAS and Transfer Learning [19.286478269708592]
We present our systems to solve Task 9 of SemEval-2021: Statement Verification and Evidence Finding with Tables.
The task consists of two subtasks: (A) Given a table and a statement, predicting whether the table supports the statement and (B) Predicting which cells in the table provide evidence for/against the statement.
Our systems achieve F1 scores of 67.34 in subtask A three-way classification, 72.89 in subtask A two-way classification, and 62.95 in subtask B.
arXiv Detail & Related papers (2021-06-01T06:06:29Z) - SemEval-2021 Task 9: Fact Verification and Evidence Finding for Tabular
Data in Scientific Documents (SEM-TAB-FACTS) [0.0]
SEM-TAB-FACTS featured two sub-tasks.
In sub-task A, the goal was to determine if a statement is supported, refuted or unknown in relation to a table.
In sub-task B, the focus was on identifying the specific cells of a table that provide evidence for the statement.
arXiv Detail & Related papers (2021-05-28T17:21:11Z) - BreakingBERT@IITK at SemEval-2021 Task 9 : Statement Verification and
Evidence Finding with Tables [1.78256232654567]
We tackle the problem of fact verification and evidence finding over tabular data.
We compare baseline and state-of-the-art approaches on the SemTabFact dataset.
We also propose a novel approach CellBERT to solve evidence finding as a form of the Natural Language Inference task.
arXiv Detail & Related papers (2021-04-07T11:41:07Z) - A Graph Representation of Semi-structured Data for Web Question
Answering [96.46484690047491]
We propose a novel graph representation of Web tables and lists based on a systematic categorization of the components in semi-structured data as well as their relations.
Our method improves F1 score by 3.90 points over the state-of-the-art baselines.
arXiv Detail & Related papers (2020-10-14T04:01:54Z) - TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data [113.29476656550342]
We present TaBERT, a pretrained LM that jointly learns representations for NL sentences and tables.
TaBERT is trained on a large corpus of 26 million tables and their English contexts.
Implementation of the model will be available at http://fburl.com/TaBERT (see the usage sketch after this list).
arXiv Detail & Related papers (2020-05-17T17:26:40Z) - ToTTo: A Controlled Table-To-Text Generation Dataset [61.83159452483026]
ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples.
We introduce a dataset construction process where annotators directly revise existing candidate sentences from Wikipedia.
While usually fluent, existing methods often hallucinate phrases that are not supported by the table.
arXiv Detail & Related papers (2020-04-29T17:53:45Z)
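For the TaBERT entry above, here is a minimal encoding sketch, assuming the interface shown in the facebookresearch/TaBERT repository README; the checkpoint path is a placeholder and the table contents are illustrative.

```python
# Sketch of jointly encoding an utterance and a table with TaBERT,
# following the project README; the checkpoint path is a placeholder.
from table_bert import TableBertModel, Table, Column

model = TableBertModel.from_pretrained('path/to/tabert_checkpoint/model.bin')

# A small table with typed columns; cells are given as strings.
table = Table(
    id='example',
    header=[
        Column('Nation', 'text', sample_value='United States'),
        Column('GDP', 'real', sample_value='21,439,453'),
    ],
    data=[
        ['United States', '21,439,453'],
        ['China', '27,308,857'],
    ],
).tokenize(model.tokenizer)

context = 'show me countries ranked by GDP'

# Returns contextualized encodings for the utterance tokens and for
# each table column, plus an info dict with auxiliary outputs.
context_encoding, column_encoding, info = model.encode(
    contexts=[model.tokenizer.tokenize(context)],
    tables=[table],
)
```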