Team Triple-Check at Factify 2: Parameter-Efficient Large Foundation
Models with Feature Representations for Multi-Modal Fact Verification
- URL: http://arxiv.org/abs/2302.07740v1
- Date: Sun, 12 Feb 2023 18:08:54 GMT
- Title: Team Triple-Check at Factify 2: Parameter-Efficient Large Foundation
Models with Feature Representations for Multi-Modal Fact Verification
- Authors: Wei-Wei Du, Hong-Wei Wu, Wei-Yao Wang, Wen-Chih Peng
- Abstract summary: Multi-modal fact verification has become an important but challenging issue on social media.
In this paper, we propose the Pre-CoFactv2 framework for modeling fine-grained text and input embeddings with lightweight parameters.
We show that Pre-CoFactv2 outperforms Pre-CoFact by a large margin and achieves new state-of-the-art results on the Factify challenge at AAAI 2023.
- Score: 5.552606716659022
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-modal fact verification has become an important but challenging issue on social media because of the mismatch between the text and images in misinformative news content; in recent years it has been addressed by considering cross-modal signals to identify the veracity of the news. In this paper, we propose the Pre-CoFactv2 framework, which combines new parameter-efficient foundation models for modeling fine-grained text and input embeddings with lightweight parameters; multi-modal, multi-type fusion that captures relations not only within and across modalities but also across types (i.e., claim and document); and feature representations that explicitly provide metadata for each sample. In addition, we introduce a unified ensemble method that boosts performance by adjusting the importance of each trained model with both its weight and its power. Extensive experiments show that Pre-CoFactv2 outperforms Pre-CoFact by a large margin and achieves new state-of-the-art results on the Factify challenge at AAAI 2023. We further examine model variations to verify the relative contributions of different components. Our team won first prize (F1-score: 81.82%), and our code is publicly available at https://github.com/wwweiwei/Pre-CoFactv2-AAAI-2023.
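The unified ensemble above adjusts each trained model's importance with both a weight and a power. Below is a minimal sketch of what such a weighted-power combination of per-class probabilities could look like; the function name, the toy weights and powers, and the renormalization step are assumptions for illustration, not the released implementation.

```python
import numpy as np

def weighted_power_ensemble(prob_list, weights, powers):
    """Combine per-model class probabilities with weights and powers.

    Each model's probabilities are raised to a model-specific power and then
    scaled by a model-specific weight before being summed, so the ensemble can
    sharpen (power > 1) or soften (power < 1) a confident model as well as
    re-weight it. Hedged sketch, not the authors' exact code.
    """
    combined = np.zeros_like(prob_list[0])
    for probs, w, p in zip(prob_list, weights, powers):
        combined += w * np.power(probs, p)
    # Renormalize so each row is a valid probability distribution again.
    return combined / combined.sum(axis=1, keepdims=True)

# Hypothetical usage: three trained models, four samples, five Factify classes.
rng = np.random.default_rng(0)
model_probs = [rng.dirichlet(np.ones(5), size=4) for _ in range(3)]
ensembled = weighted_power_ensemble(model_probs,
                                    weights=[0.5, 0.3, 0.2],
                                    powers=[1.0, 2.0, 0.5])
predictions = ensembled.argmax(axis=1)
```

The power is the extra degree of freedom beyond plain weighted averaging: it controls how much each model's confidence, rather than just its vote, influences the ensemble.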
Related papers
- Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging [21.918559935122786]
Model merging is a promising way to combine multiple task-specific models into a single multitask model without extra training.
Traditional model merging methods often show significant performance gaps compared to fine-tuned models.
We show that both shared and exclusive task-specific knowledge are crucial for merging performance, but directly merging exclusive knowledge hinders overall performance.
We propose Twin-Merging, a method that encompasses two principal stages: (1) modularizing knowledge into shared and exclusive components, with compression to reduce redundancy and enhance efficiency; (2) dynamically merging shared and task-specific knowledge based on the input.
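As a rough, hedged illustration of the dynamic-merging stage, the sketch below assumes the shared knowledge lives in one merged parameter vector and the exclusive knowledge in a set of (possibly compressed) task-specific deltas, combined with input-dependent router weights. The router, the variable names, and the additive combination are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def dynamic_merge(shared_params, exclusive_deltas, router_logits):
    """Combine shared and task-exclusive knowledge for one input.

    shared_params:    flat parameter vector holding the merged shared knowledge.
    exclusive_deltas: list of task-specific parameter deltas (these could be
                      low-rank or sparse after compression).
    router_logits:    unnormalized relevance scores for the current input,
                      one per task.
    """
    weights = softmax(router_logits)
    merged = shared_params.copy()
    for w, delta in zip(weights, exclusive_deltas):
        merged += w * delta
    return merged

# Hypothetical usage: three tasks, a 10-dimensional toy parameter vector.
rng = np.random.default_rng(1)
shared = rng.normal(size=10)
deltas = [rng.normal(scale=0.1, size=10) for _ in range(3)]
params_for_input = dynamic_merge(shared, deltas,
                                 router_logits=np.array([2.0, 0.1, -1.0]))
```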
arXiv Detail & Related papers (2024-06-17T02:31:55Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion [54.33764537135906]
VideoQA Transformer models demonstrate competitive performance on standard benchmarks.
Do these models capture the rich multimodal structures and dynamics from video and text jointly?
Are they achieving high scores by exploiting biases and spurious features?
arXiv Detail & Related papers (2023-06-15T06:45:46Z)
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
We further elaborate the robustness metric, under which a model is judged robust only if its performance is consistently accurate across the cliques.
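One hedged reading of that criterion is a per-clique consistency check: the sketch below judges a model robust on a clique only if the spread of its per-example scores stays within a small tolerance, and then reports the fraction of robust cliques. The tolerance and the function names are hypothetical, not the benchmark's exact metric.

```python
def clique_is_robust(scores, tolerance=0.05):
    """Judge a model robust on one knowledge-invariant clique.

    scores: per-example F1 (or accuracy) values for examples that express the
    same underlying knowledge. The model counts as robust on the clique only
    if every example is handled about as well as the best one, i.e. the
    spread within the clique stays below a small tolerance.
    """
    return max(scores) - min(scores) <= tolerance

def robustness_rate(cliques, tolerance=0.05):
    """Fraction of cliques on which the model is judged robust."""
    flags = [clique_is_robust(clique, tolerance) for clique in cliques]
    return sum(flags) / len(flags)

# Hypothetical usage: three cliques of per-example F1 scores.
cliques = [[0.91, 0.90, 0.89], [0.88, 0.52, 0.90], [0.75, 0.74, 0.76]]
print(robustness_rate(cliques))  # two of three cliques are consistent -> 0.66...
```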
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
- An Empirical Study of Multimodal Model Merging [148.48412442848795]
Model merging is a technique that fuses multiple models trained on different tasks to generate a multi-task solution.
We conduct our study for a novel goal where we can merge vision, language, and cross-modal transformers of a modality-specific architecture.
We propose two metrics that assess the distance between weights to be merged and can serve as an indicator of the merging outcomes.
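The summary only says that the two metrics measure the distance between the weights to be merged. As an assumption-laden sketch, the snippet below computes two simple candidates (flattened L2 distance and cosine similarity between two checkpoints) that could serve as such indicators; the metrics actually proposed in the paper may differ.

```python
import numpy as np

def flatten_state(state_dict):
    """Concatenate every tensor of a checkpoint into one flat vector."""
    return np.concatenate([np.ravel(v) for v in state_dict.values()])

def weight_distance_metrics(state_a, state_b):
    """Two simple indicators of how far apart two models' weights are."""
    a, b = flatten_state(state_a), flatten_state(state_b)
    l2 = float(np.linalg.norm(a - b))
    cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return {"l2_distance": l2, "cosine_similarity": cosine}

# Hypothetical usage with two toy "checkpoints" stored as dicts of arrays.
rng = np.random.default_rng(2)
ckpt_a = {"layer.weight": rng.normal(size=(4, 4)), "layer.bias": rng.normal(size=4)}
ckpt_b = {k: v + rng.normal(scale=0.01, size=v.shape) for k, v in ckpt_a.items()}
print(weight_distance_metrics(ckpt_a, ckpt_b))
```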
arXiv Detail & Related papers (2023-04-28T15:43:21Z)
- Model ensemble instead of prompt fusion: a sample-specific knowledge transfer method for few-shot prompt tuning [85.55727213502402]
We focus on improving the few-shot performance of prompt tuning by transferring knowledge from soft prompts of source tasks.
We propose the Sample-specific Ensemble of Source Models (SESoM), which learns to adjust the contribution of each source model for each target sample separately when ensembling the source models' outputs.
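A minimal sketch of the sample-specific ensembling idea: each target sample gets its own softmax weights over the source models' outputs. In SESoM those weights come from a learned attention-style module; here the relevance scores are plain inputs, and all names and shapes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sample_specific_ensemble(source_probs, relevance_scores):
    """Ensemble source-model outputs with per-sample weights.

    source_probs:     (num_sources, num_samples, num_classes) class
                      probabilities from each source model.
    relevance_scores: (num_samples, num_sources) scores for how relevant each
                      source model is to each target sample (learned in SESoM;
                      treated as given here).
    """
    weights = softmax(relevance_scores, axis=-1)            # (samples, sources)
    return np.einsum("snc,ns->nc", source_probs, weights)   # (samples, classes)

# Hypothetical usage: 3 source models, 4 target samples, 2 classes.
rng = np.random.default_rng(3)
probs = rng.dirichlet(np.ones(2), size=(3, 4))       # shape (3, 4, 2)
scores = rng.normal(size=(4, 3))
ensembled = sample_specific_ensemble(probs, scores)  # shape (4, 2)
```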
arXiv Detail & Related papers (2022-10-23T01:33:16Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects.
Our framework assumes a hierarchical latent structure of a document, where the top level captures long-range dependencies.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- Team Yao at Factify 2022: Utilizing Pre-trained Models and Co-attention Networks for Multi-Modal Fact Verification [7.3724108865167945]
We propose a framework, Pre-CoFact, composed of two pre-trained models for extracting features from text and images.
To achieve better performance, we adopt an ensemble that combines Pre-CoFact variants built on different pre-trained models.
Our model achieved competitive performance without using auxiliary tasks or extra information.
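The title points to co-attention between the two pre-trained encoders. Below is a minimal sketch of one co-attention step in which text tokens attend over image patches and vice versa via scaled dot-product attention; omitting the learned projection matrices and picking toy dimensions are simplifications, not the authors' exact architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(text_feats, image_feats):
    """One co-attention step: each modality attends over the other.

    text_feats:  (num_tokens, dim) features from a pre-trained text encoder.
    image_feats: (num_patches, dim) features from a pre-trained image encoder.
    Returns text features enriched with image context and vice versa.
    """
    d = text_feats.shape[-1]
    scores = text_feats @ image_feats.T / np.sqrt(d)          # (tokens, patches)
    text_attended = softmax(scores, axis=-1) @ image_feats    # (tokens, dim)
    image_attended = softmax(scores.T, axis=-1) @ text_feats  # (patches, dim)
    return text_attended, image_attended

# Hypothetical usage with toy feature matrices.
rng = np.random.default_rng(4)
text_ctx, image_ctx = co_attention(rng.normal(size=(12, 64)),
                                   rng.normal(size=(49, 64)))
```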
arXiv Detail & Related papers (2022-01-26T16:04:37Z)
- Logically at the Factify 2022: Multimodal Fact Verification [2.8914815569249823]
This paper describes our participant system for the multi-modal fact verification (Factify) challenge at AAAI 2022.
Two baseline approaches are proposed and explored including an ensemble model and a multi-modal attention network.
Our best model ranked first on the leaderboard, obtaining a weighted average F-measure of 0.77 on both the validation and test sets.
arXiv Detail & Related papers (2021-12-16T23:34:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.