THiFLY Research at SemEval-2023 Task 7: A Multi-granularity System for
CTR-based Textual Entailment and Evidence Retrieval
- URL: http://arxiv.org/abs/2306.01245v1
- Date: Fri, 2 Jun 2023 03:09:31 GMT
- Title: THiFLY Research at SemEval-2023 Task 7: A Multi-granularity System for
CTR-based Textual Entailment and Evidence Retrieval
- Authors: Yuxuan Zhou, Ziyu Jin, Meiwei Li, Miao Li, Xien Liu, Xinxin You, Ji Wu
- Abstract summary: The NLI4CT task asks systems to decide whether hypotheses are entailed by Clinical Trial Reports (CTRs) and to retrieve the evidence that supports each decision.
We present a multi-granularity system for CTR-based textual entailment and evidence retrieval.
We enhance the numerical inference capability of the system by leveraging a T5-based model, SciFive, which is pre-trained on a medical corpus.
- Score: 13.30918296659228
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The NLI4CT task asks systems to decide whether hypotheses are entailed
by Clinical Trial Reports (CTRs) and to retrieve the evidence supporting each decision.
This task poses a significant challenge, as verifying hypotheses in the NLI4CT
task requires the integration of multiple pieces of evidence from one or two
CTR(s) and the application of diverse levels of reasoning, including textual
and numerical. To address these problems, we present a multi-granularity system
for CTR-based textual entailment and evidence retrieval in this paper.
Specifically, we construct a Multi-granularity Inference Network (MGNet) that
exploits sentence-level and token-level encoding to handle both textual
entailment and evidence retrieval tasks. Moreover, we enhance the numerical
inference capability of the system by leveraging a T5-based model, SciFive,
which is pre-trained on a medical corpus. Model ensembling and a joint
inference method are further utilized in the system to increase the stability
and consistency of inference. The system achieves F1 scores of 0.856 and 0.853
on the textual entailment and evidence retrieval tasks, respectively, the best
performance on both subtasks. The experimental results corroborate the
effectiveness of our proposed method. Our code is publicly available at
https://github.com/THUMLP/NLI4CT.
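
The abstract mentions model ensembling and reports F1 scores, but it does not say how the ensemble is combined. The following is a minimal sketch, assuming simple probability averaging over several models followed by a 0.5 threshold, with F1 computed on binary entailment labels. The function names `average_ensemble` and `f1_score` and the example probabilities are hypothetical and are not taken from the authors' code.

```python
from typing import List, Sequence


def average_ensemble(prob_lists: Sequence[Sequence[float]],
                     threshold: float = 0.5) -> List[int]:
    """Average per-model entailment probabilities, then threshold.

    prob_lists[m][i] is model m's probability that item i is an entailment.
    Averaging is an assumption for illustration; the paper's actual
    ensembling scheme may differ.
    """
    n_models = len(prob_lists)
    n_items = len(prob_lists[0])
    preds = []
    for i in range(n_items):
        mean_p = sum(model[i] for model in prob_lists) / n_models
        preds.append(1 if mean_p >= threshold else 0)
    return preds


def f1_score(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Binary F1: harmonic mean of precision and recall for label 1."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical probabilities from three models on four statements.
model_probs = [
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.4, 0.3, 0.7],
    [0.7, 0.1, 0.5, 0.6],
]
gold_labels = [1, 0, 1, 1]

ensemble_preds = average_ensemble(model_probs)
ensemble_f1 = f1_score(gold_labels, ensemble_preds)
```

Averaging tends to smooth out disagreements between individual models, which is one common way an ensemble can make predictions more stable.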
Related papers
- Multi-modal Retrieval Augmented Multi-modal Generation: A Benchmark, Evaluate Metrics and Strong Baselines [63.427721165404634]
This paper investigates an intriguing task of Multi-modal Retrieval Augmented Multi-modal Generation (M$^2$RAG).
This task requires foundation models to browse multi-modal web pages, with mixed text and images, and generate multi-modal responses for solving user queries.
We construct a benchmark for the M$^2$RAG task, equipped with a suite of text-modal metrics and multi-modal metrics to analyze the capabilities of existing foundation models.
arXiv Detail & Related papers (2024-11-25T13:20:19Z)
- MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking [0.283600654802951]
We present a summarization model designed to generate claim-specific summaries useful for fact-checking from multimodal datasets.
We introduce a dynamic perceiver-based model that can handle inputs from multiple modalities of arbitrary lengths.
Our approach outperforms the SOTA approach by 4.6% in the claim verification task on the MOCHEG dataset.
arXiv Detail & Related papers (2024-07-18T01:33:20Z)
- Narrative Action Evaluation with Prompt-Guided Multimodal Interaction [60.281405999483]
Narrative action evaluation (NAE) aims to generate professional commentary that evaluates the execution of an action.
NAE is a more challenging task because it requires both narrative flexibility and evaluation rigor.
We propose a prompt-guided multimodal interaction framework to facilitate the interaction between different modalities of information.
arXiv Detail & Related papers (2024-04-22T17:55:07Z)
- UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding [93.92313947913831]
We introduce UniDoc, a novel multimodal model equipped with text detection and recognition capabilities.
To the best of our knowledge, this is the first large multimodal model capable of simultaneous text detection, recognition, spotting, and understanding.
arXiv Detail & Related papers (2023-08-19T17:32:34Z)
- NLI4CT: Multi-Evidence Natural Language Inference for Clinical Trial Reports [3.0468533447146244]
We present a novel resource to advance research on NLI for reasoning on clinical trial reports.
We provide NLI4CT, a corpus of 2400 statements and CTRs, annotated for these tasks.
To the best of our knowledge, we are the first to design a task that covers the interpretation of full CTRs.
arXiv Detail & Related papers (2023-05-05T15:03:01Z)
- Evaluating and Improving Factuality in Multimodal Abstractive Summarization [91.46015013816083]
We propose CLIPBERTScore to leverage the robustness and strong factuality detection performance between image-summary and document-summary.
We show that this simple combination of two metrics in the zero-shot achieves higher correlations than existing factuality metrics for document summarization.
Our analysis demonstrates the robustness and high correlation of CLIPBERTScore and its components on four factuality metric-evaluation benchmarks.
arXiv Detail & Related papers (2022-11-04T16:50:40Z)
- R$^2$F: A General Retrieval, Reading and Fusion Framework for Document-level Natural Language Inference [29.520857954199904]
Document-level natural language inference (DOCNLI) is a new challenging task in natural language processing.
We establish a general solution, named Retrieval, Reading and Fusion (R2F) framework, and a new setting.
Our experimental results show that the R2F framework achieves state-of-the-art performance and is robust to diverse evidence retrieval methods.
arXiv Detail & Related papers (2022-10-22T02:02:35Z)
- GERE: Generative Evidence Retrieval for Fact Verification [57.78768817972026]
We propose GERE, the first system that retrieves evidence in a generative fashion.
The experimental results on the FEVER dataset show that GERE achieves significant improvements over the state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-12T03:49:35Z)
- Knowledge-Enhanced Evidence Retrieval for Counterargument Generation [15.87727402948856]
We build a system that retrieves counterevidence from diverse sources on the Web.
At the core of this system is a natural language inference (NLI) model.
We present a knowledge-enhanced NLI model that aims to handle causality- and example-based inference.
arXiv Detail & Related papers (2021-09-19T04:31:21Z)
- CINS: Comprehensive Instruction for Few-shot Learning in Task-oriented Dialog Systems [56.302581679816775]
This paper proposes Comprehensive Instruction (CINS) that exploits PLMs with task-specific instructions.
We design a schema (definition, constraint, prompt) of instructions and their customized realizations for three important downstream tasks in ToD.
Experiments are conducted on these ToD tasks in realistic few-shot learning scenarios with small validation data.
arXiv Detail & Related papers (2021-09-10T03:23:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.