IMCI: Integrate Multi-view Contextual Information for Fact Extraction and Verification
- URL: http://arxiv.org/abs/2208.14001v1
- Date: Tue, 30 Aug 2022 05:57:34 GMT
- Title: IMCI: Integrate Multi-view Contextual Information for Fact Extraction and Verification
- Authors: Hao Wang, Yangguang Li, Zhen Huang, Yong Dou
- Abstract summary: We propose to integrate multi-view contextual information (IMCI) for fact extraction and verification.
Our experimental results on FEVER 1.0 shared task show that our IMCI framework makes great progress on both fact extraction and verification.
- Score: 19.764122035213067
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: With the rapid development of automatic fake news detection technology, fact
extraction and verification (FEVER) has been attracting more attention. The
task aims to extract the most relevant fact evidences from millions of
open-domain Wikipedia documents and then verify the credibility of
corresponding claims. Although several strong models have been proposed for the
task and they have made great progress, we argue that they fail to utilize
multi-view contextual information and thus cannot obtain better performance. In
this paper, we propose to integrate multi-view contextual information (IMCI)
for fact extraction and verification. For each evidence sentence, we define two
kinds of context, i.e. intra-document context and inter-document context.
Intra-document context consists of the document title and all the other
sentences from the same document. Inter-document context consists of all other
evidences which may come from different documents. Then we integrate the
multi-view contextual information to encode the evidence sentences to handle
the task. Our experimental results on FEVER 1.0 shared task show that our IMCI
framework makes great progress on both fact extraction and verification, and
achieves state-of-the-art performance with a winning FEVER score of 72.97% and
label accuracy of 75.84% on the online blind test set. We also conduct an ablation
study to examine the impact of multi-view contextual information. Our code will
be released at https://github.com/phoenixsecularbird/IMCI.
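The abstract describes the method only at a high level: each candidate evidence sentence is paired with an intra-document view (the document title plus the other sentences of the same document) and an inter-document view (the remaining candidate evidences, possibly from other documents), and the two views are integrated when encoding the sentence. The sketch below is a minimal, hypothetical illustration of that idea, not the authors' released implementation; the encoder choice (bert-base-uncased), the [SEP]-joined input format, the [CLS] pooling, and the concatenation used to integrate the two views are all assumptions made for the example.

```python
# Minimal sketch (assumptions noted above) of building and encoding the two
# context views for one evidence sentence, in the spirit of the abstract.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")


def build_views(claim, doc_title, doc_sentences, evidence_idx, other_evidences):
    """Pair an evidence sentence with its intra- and inter-document context."""
    evidence = doc_sentences[evidence_idx]
    # Intra-document context: document title + all other sentences of the same document.
    intra = " ".join([doc_title] + [s for i, s in enumerate(doc_sentences) if i != evidence_idx])
    # Inter-document context: all other candidate evidences, possibly from other documents.
    inter = " ".join(other_evidences)
    return (
        f"{claim} [SEP] {evidence} [SEP] {intra}",
        f"{claim} [SEP] {evidence} [SEP] {inter}",
    )


@torch.no_grad()
def encode(text):
    """Encode one view into a single vector via [CLS] pooling (illustrative choice)."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    return encoder(**inputs).last_hidden_state[:, 0]


claim = "The claim to verify."
intra_view, inter_view = build_views(
    claim,
    doc_title="Example_Wikipedia_Page",
    doc_sentences=["Sentence one.", "Sentence two.", "Sentence three."],
    evidence_idx=1,
    other_evidences=["A candidate evidence sentence from another page."],
)
# Integrate the two views into one evidence representation
# (simple concatenation here; the paper's integration is learned).
evidence_repr = torch.cat([encode(intra_view), encode(inter_view)], dim=-1)
print(evidence_repr.shape)  # torch.Size([1, 1536])
```

In the actual framework the integration and the downstream sentence retrieval and claim verification are trained end-to-end; see the repository linked above for the authors' code.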
Related papers
- Unified Multi-Modal Interleaved Document Representation for Information Retrieval [57.65409208879344]
We produce more comprehensive and nuanced document representations by holistically embedding documents interleaved with different modalities.
Specifically, we achieve this by leveraging the capability of recent vision-language models that enable the processing and integration of text, images, and tables into a unified format and representation.
arXiv Detail & Related papers (2024-10-03T17:49:09Z)
- Contextual Document Embeddings [77.22328616983417]
We propose two complementary methods for contextualized document embeddings.
First, an alternative contrastive learning objective that explicitly incorporates the document neighbors into the intra-batch contextual loss.
Second, a new contextual architecture that explicitly encodes neighbor document information into the encoded representation.
arXiv Detail & Related papers (2024-10-03T14:33:34Z)
- MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking [0.283600654802951]
We present a summarization model designed to generate claim-specific summaries useful for fact-checking from multimodal datasets.
We introduce a dynamic perceiver-based model that can handle inputs from multiple modalities of arbitrary lengths.
Our approach outperforms the SOTA approach by 4.6% in the claim verification task on the MOCHEG dataset.
arXiv Detail & Related papers (2024-07-18T01:33:20Z)
- TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents [51.744527199305445]
This paper proposes a unified end-to-end information extraction framework from visually rich documents.
Text reading and information extraction can reinforce each other via a well-designed multi-modal context block.
The framework can be trained in an end-to-end trainable manner, achieving global optimization.
arXiv Detail & Related papers (2022-07-14T08:52:07Z)
- Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration [75.47708732473586]
We propose a layout-aware document-level Information Extraction dataset, LIE, to facilitate the study of extracting both structural and semantic knowledge from visually rich documents.
LIE contains 62k annotations of three extraction tasks from 4,061 pages in product and official documents.
Empirical results show that layout is critical for VRD-based extraction, and system demonstration also verifies that the extracted knowledge can help locate the answers that users care about.
arXiv Detail & Related papers (2022-07-14T07:59:45Z)
- GERE: Generative Evidence Retrieval for Fact Verification [57.78768817972026]
We propose GERE, the first system that retrieves evidences in a generative fashion.
The experimental results on the FEVER dataset show that GERE achieves significant improvements over the state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-12T03:49:35Z)
- Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution [30.438041837029875]
We propose a robust visual information extraction system (VIES) for real-world scenarios.
VIES is a unified end-to-end trainable framework for simultaneous text detection, recognition and information extraction.
We construct a fully-annotated dataset called EPHOIE, which is the first Chinese benchmark for both text spotting and visual information extraction.
arXiv Detail & Related papers (2021-01-24T11:05:24Z)
- TRIE: End-to-End Text Reading and Information Extraction for Document Understanding [56.1416883796342]
We propose a unified end-to-end text reading and information extraction network.
Multimodal visual and textual features of text reading are fused for information extraction.
Our proposed method significantly outperforms the state-of-the-art methods in both efficiency and accuracy.
arXiv Detail & Related papers (2020-05-27T01:47:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences.