Fraunhofer SIT at CheckThat! 2023: Mixing Single-Modal Classifiers to
Estimate the Check-Worthiness of Multi-Modal Tweets
- URL: http://arxiv.org/abs/2307.00610v2
- Date: Thu, 27 Jul 2023 14:54:13 GMT
- Title: Fraunhofer SIT at CheckThat! 2023: Mixing Single-Modal Classifiers to
Estimate the Check-Worthiness of Multi-Modal Tweets
- Authors: Raphael Frick, Inna Vogel
- Abstract summary: This paper proposes a novel way of detecting check-worthiness in multi-modal tweets.
It takes advantage of two classifiers, each trained on a single modality.
For image data, extracting the embedded text with an OCR analysis has been shown to perform best.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The option of sharing images, videos and audio files on social media opens up
new possibilities for distinguishing between false information and fake news on
the Internet. Due to the vast amount of data shared every second on social
media, not all data can be verified by a computer or a human expert. Here, a
check-worthiness analysis can be used as a first step in the fact-checking
pipeline and as a filtering mechanism to improve efficiency. This paper
proposes a novel way of detecting check-worthiness in multi-modal tweets. It
takes advantage of two classifiers, each trained on a single modality. For
image data, extracting the embedded text with an OCR analysis has been shown
to perform best. By combining the two classifiers, the proposed solution was able
to place first in the CheckThat! 2023 Task 1A with an F1 score of 0.7297
achieved on the private test set.
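The two-classifier fusion described in the abstract can be sketched as follows. This is an illustrative toy, not the authors' implementation: the trained single-modal classifiers are replaced by a hypothetical keyword heuristic (`score_text`), and the averaging fusion and the `threshold` value are assumptions made for the sketch.

```python
# Illustrative sketch of fusing two single-modal check-worthiness classifiers.
# The stand-in scorer below is a toy heuristic, not the paper's trained models.

CHECKWORTHY_CUES = {"statistic", "claims", "percent", "study", "deaths"}

def score_text(text: str) -> float:
    """Toy stand-in for a classifier: fraction of cue words found in the text."""
    tokens = set(text.lower().split())
    return len(tokens & CHECKWORTHY_CUES) / len(CHECKWORTHY_CUES)

def fuse(tweet_text: str, ocr_text: str, threshold: float = 0.2) -> bool:
    """Average the two single-modal scores; flag as check-worthy above threshold."""
    p_text = score_text(tweet_text)   # stands in for the text-modality classifier
    p_image = score_text(ocr_text)    # stands in for the classifier on OCR-extracted text
    return (p_text + p_image) / 2 >= threshold

print(fuse("study claims 40 percent rise", "deaths statistic chart"))  # True
print(fuse("good morning everyone", ""))                               # False
```

In the actual system, `score_text` would be replaced by the trained single-modal models, with the image branch operating on text extracted via OCR.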
Related papers
- Contrastive Transformer Learning with Proximity Data Generation for
Text-Based Person Search [60.626459715780605]
Given a descriptive text query, text-based person search aims to retrieve the best-matched target person from an image gallery.
Such a cross-modal retrieval task is quite challenging due to the significant modality gap, fine-grained differences, and the insufficiency of annotated data.
In this paper, we propose a simple yet effective dual Transformer model for text-based person search.
arXiv Detail & Related papers (2023-11-15T16:26:49Z)
- Gpachov at CheckThat! 2023: A Diverse Multi-Approach Ensemble for
Subjectivity Detection in News Articles [34.98368667957678]
This paper presents the solution built by the Gpachov team for the CLEF-2023 CheckThat! lab Task 2 on subjectivity detection.
Three approaches are combined in a simple majority-voting ensemble, resulting in 0.77 macro F1 on the test set and achieving 2nd place on the English subtask.
arXiv Detail & Related papers (2023-09-13T09:49:20Z)
- Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z)
- Perception Test: A Diagnostic Benchmark for Multimodal Video Models [78.64546291816117]
We propose a novel multimodal video benchmark to evaluate the perception and reasoning skills of pre-trained multimodal models.
The Perception Test focuses on skills (Memory, Abstraction, Physics, Semantics) and types of reasoning (descriptive, explanatory, predictive, counterfactual) across video, audio, and text modalities.
The benchmark probes pre-trained models for their transfer capabilities in zero-shot, few-shot, or limited fine-tuning regimes.
arXiv Detail & Related papers (2023-05-23T07:54:37Z)
- Noise-Robust De-Duplication at Scale [4.499833362998488]
This study uses the unique timeliness of historical news wires to create a 27,210-document dataset.
We develop and evaluate a range of de-duplication methods, including hashing and N-gram overlap.
We show that the bi-encoder scales well, de-duplicating a 10 million article corpus on a single GPU card in a matter of hours.
arXiv Detail & Related papers (2022-10-09T13:30:42Z)
- Learning Audio-Visual embedding for Wild Person Verification [18.488385598522125]
We propose an audio-visual network that considers the aggregator from a fusion perspective.
We introduce improved attentive statistics pooling for the first time in face verification.
Finally, the modalities are fused with a gated attention mechanism.
arXiv Detail & Related papers (2022-09-09T02:29:47Z)
- Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal
Misinformation [83.2079454464572]
This paper describes our approach to the Image-Text Inconsistency Detection challenge of the DARPA Semantic Forensics (SemaFor) Program.
We collect Twitter-COMMs, a large-scale multimodal dataset with 884k tweets relevant to the topics of Climate Change, COVID-19, and Military Vehicles.
We train our approach, based on the state-of-the-art CLIP model, leveraging automatically generated random and hard negatives.
arXiv Detail & Related papers (2021-12-16T03:37:20Z)
- One-shot Key Information Extraction from Document with Deep Partial
Graph Matching [60.48651298832829]
Key Information Extraction (KIE) from documents improves efficiency, productivity, and security in many industrial scenarios.
Existing supervised learning methods for the KIE task require a large number of labeled samples and learn separate models for different types of documents.
We propose a deep end-to-end trainable network for one-shot KIE using partial graph matching.
arXiv Detail & Related papers (2021-09-26T07:45:53Z)
- Visualizing Classifier Adjacency Relations: A Case Study in Speaker
Verification and Voice Anti-Spoofing [72.4445825335561]
We propose a simple method to derive a 2D representation from detection scores produced by an arbitrary set of binary classifiers.
Based upon rank correlations, our method facilitates a visual comparison of classifiers with arbitrary scores.
While the approach is fully versatile and can be applied to any detection task, we demonstrate the method using scores produced by automatic speaker verification and voice anti-spoofing systems.
arXiv Detail & Related papers (2021-06-11T13:03:33Z)
- A Convolutional Baseline for Person Re-Identification Using Vision and
Language Descriptions [24.794592610444514]
In real-world surveillance scenarios, frequently no visual information will be available about the queried person.
A two-stream deep convolutional neural network framework supervised by a cross-entropy loss is presented.
The learnt visual representations are more robust and perform 22% better during retrieval compared to a single-modality system.
arXiv Detail & Related papers (2020-02-20T10:12:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.