IAI Group at CheckThat! 2024: Transformer Models and Data Augmentation for Checkworthy Claim Detection
- URL: http://arxiv.org/abs/2408.01118v1
- Date: Fri, 2 Aug 2024 08:59:09 GMT
- Title: IAI Group at CheckThat! 2024: Transformer Models and Data Augmentation for Checkworthy Claim Detection
- Authors: Peter Røysland Aarnes, Vinay Setty, Petra Galuščáková
- Abstract summary: This paper describes the IAI group's participation in automated check-worthiness estimation for claims.
The task involves the automated detection of check-worthy claims in English, Dutch, and Arabic political debates and Twitter data.
We utilize various pre-trained generative decoder and encoder transformer models, employing methods such as few-shot chain-of-thought reasoning.
- Score: 1.3686993145787067
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This paper describes the IAI group's participation in automated check-worthiness estimation for claims, within the framework of the 2024 CheckThat! Lab "Task 1: Check-Worthiness Estimation". The task involves the automated detection of check-worthy claims in English, Dutch, and Arabic political debates and Twitter data. We utilized various pre-trained generative decoder and encoder transformer models, employing methods such as few-shot chain-of-thought reasoning, fine-tuning, data augmentation, and transfer learning from one language to another. Despite variable performance, our models achieved notable placements on the organizer's leaderboard: ninth-best in English, third-best in Dutch, and the top placement in Arabic, utilizing multilingual datasets to enhance the generalizability of check-worthiness detection. Although performance dropped significantly on the unlabeled test dataset compared to the development test dataset, our findings contribute to the ongoing efforts in claim detection research, highlighting the challenges and potential of language-specific adaptations in claim verification systems.
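The few-shot chain-of-thought setup mentioned in the abstract can be illustrated with a minimal sketch. The prompt wording, the example claims, and the model name below are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of few-shot chain-of-thought prompting for
# check-worthiness classification. Prompt, examples, and model
# choice are illustrative, not the paper's exact setup.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

FEW_SHOT_PROMPT = """\
Decide whether a claim is check-worthy (worth sending to a fact-checker).
Reason step by step, then answer "Yes" or "No".

Claim: "Unemployment fell to 3.5% last quarter."
Reasoning: The claim states a specific, verifiable statistic that could
mislead the public if false, so it is worth checking.
Answer: Yes

Claim: "I think the debate went really well for us."
Reasoning: This is a subjective opinion with no verifiable factual content.
Answer: No

Claim: "{claim}"
Reasoning:"""

def is_checkworthy(claim: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        temperature=0.0,
        messages=[{"role": "user",
                   "content": FEW_SHOT_PROMPT.format(claim=claim)}],
    )
    return response.choices[0].message.content

print(is_checkworthy("Our new policy created two million jobs in a year."))
```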
Related papers
- OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion [88.59397418187226]
We propose a novel unified open-vocabulary detection method called OV-DINO.
It is pre-trained on diverse large-scale datasets with language-aware selective fusion in a unified framework.
We evaluate the performance of the proposed OV-DINO on popular open-vocabulary detection benchmarks.
arXiv Detail & Related papers (2024-07-10T17:05:49Z)
- The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights [108.40766216456413]
We propose a question alignment framework to bridge the gap between large language models' English and non-English performance.
Experiment results show it can boost multilingual performance across diverse reasoning scenarios, model families, and sizes.
We analyze representation space, generated response and data scales, and reveal how question translation training strengthens language alignment within LLMs.
arXiv Detail & Related papers (2024-05-02T14:49:50Z)
- Machine Translation Meta Evaluation through Translation Accuracy Challenge Sets [92.38654521870444]
We introduce ACES, a contrastive challenge set spanning 146 language pairs.
This dataset aims to discover whether metrics can identify 68 categories of translation accuracy errors.
We conduct a large-scale study by benchmarking ACES on 50 metrics submitted to the WMT 2022 and 2023 metrics shared tasks.
arXiv Detail & Related papers (2024-01-29T17:17:42Z)
- Multilingual and Multi-topical Benchmark of Fine-tuned Language models and Large Language Models for Check-Worthy Claim Detection [1.4779899760345434]
This study compares the performance of (1) fine-tuned language models and (2) large language models on the task of check-worthy claim detection.
We composed a multilingual and multi-topical dataset comprising texts of various sources and styles.
arXiv Detail & Related papers (2023-11-10T15:36:35Z)
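For contrast with the prompting sketch above, here is a minimal sketch of the fine-tuned-encoder setup this entry compares against: a multilingual encoder with a binary check-worthiness head. The model choice, toy data, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch: fine-tuning a multilingual encoder for binary
# check-worthy claim detection. Model choice, toy data, and
# hyperparameters are illustrative assumptions.
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "xlm-roberta-base"  # illustrative multilingual encoder
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

texts = ["Unemployment fell to 3.5% last quarter.",
         "I think the debate went really well for us."]
labels = [1, 0]  # 1 = check-worthy, 0 = not check-worthy

class ClaimDataset(Dataset):
    def __len__(self):
        return len(texts)
    def __getitem__(self, i):
        enc = tokenizer(texts[i], truncation=True, padding="max_length",
                        max_length=64, return_tensors="pt")
        item = {k: v.squeeze(0) for k, v in enc.items()}
        item["labels"] = torch.tensor(labels[i])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="cw-model", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=ClaimDataset(),
)
trainer.train()
```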
- A deep Natural Language Inference predictor without language-specific training data [44.26507854087991]
We present an NLP technique to tackle the problem of natural language inference (NLI) between pairs of sentences in a target language of choice, without a language-specific training dataset.
We exploit a generic translation dataset, manually translated, along with two instances of the same pre-trained model.
The model has been evaluated over machine translated Stanford NLI test dataset, machine translated Multi-Genre NLI test dataset, and manually translated RTE3-ITA test dataset.
arXiv Detail & Related papers (2023-09-06T10:20:59Z)
- Check-worthy Claim Detection across Topics for Automated Fact-checking [21.723689314962233]
We assess and quantify the challenge of detecting check-worthy claims for new, unseen topics.
We propose the AraCWA model to mitigate the performance deterioration when detecting check-worthy claims across topics.
arXiv Detail & Related papers (2022-12-16T14:54:56Z)
- No Language Left Behind: Scaling Human-Centered Machine Translation [69.28110770760506]
We create datasets and models aimed at narrowing the performance gap between low and high-resource languages.
We propose multiple architectural and training improvements to counteract overfitting while training on thousands of tasks.
Our model achieves an improvement of 44% BLEU relative to the previous state-of-the-art.
arXiv Detail & Related papers (2022-07-11T07:33:36Z)
- IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages [87.5457337866383]
We introduce the Image-Grounded Language Understanding Evaluation benchmark.
IGLUE brings together visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages.
We find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks.
arXiv Detail & Related papers (2022-01-27T18:53:22Z)
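The translate-test finding above is straightforward to operationalize: machine-translate each test input into English, then apply an English-only model. Both model names below are illustrative assumptions, and the classifier is a stand-in rather than a real check-worthiness model.

```python
# Minimal sketch of translate-test transfer: machine-translate the test
# input into English, then classify with an English model. Both model
# choices are illustrative; the SST-2 classifier is a stand-in.
from transformers import pipeline

translate = pipeline("translation", model="Helsinki-NLP/opus-mt-nl-en")
classify = pipeline("text-classification",
                    model="distilbert-base-uncased-finetuned-sst-2-english")

dutch_claim = "De werkloosheid daalde vorig kwartaal naar 3,5 procent."
english_claim = translate(dutch_claim)[0]["translation_text"]
print(classify(english_claim))
# Zero-shot transfer, by contrast, would classify the Dutch text directly.
```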
- UPV at CheckThat! 2021: Mitigating Cultural Differences for Identifying Multilingual Check-worthy Claims [6.167830237917659]
In this paper, we propose a language identification task as an auxiliary task to mitigate unintended bias.
Our results show that joint training of language identification and check-worthy claim detection tasks can provide performance gains for some of the selected languages.
arXiv Detail & Related papers (2021-09-19T21:46:16Z)
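A minimal sketch of the joint-training idea described above: a shared encoder feeding a check-worthiness head and an auxiliary language-identification head, with the two losses summed. The architecture details and equal loss weighting are assumptions for illustration.

```python
# Minimal sketch of multi-task training: a shared encoder feeds both a
# check-worthiness head and an auxiliary language-identification head.
# The architecture and equal loss weighting are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class JointClaimModel(nn.Module):
    def __init__(self, name="xlm-roberta-base", n_langs=5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(name)
        hidden = self.encoder.config.hidden_size
        self.claim_head = nn.Linear(hidden, 2)       # check-worthy vs. not
        self.lang_head = nn.Linear(hidden, n_langs)  # auxiliary task

    def forward(self, input_ids, attention_mask, claim_labels, lang_labels):
        # Use the first token's representation as a sentence embedding.
        h = self.encoder(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state[:, 0]
        loss_fn = nn.CrossEntropyLoss()
        claim_loss = loss_fn(self.claim_head(h), claim_labels)
        lang_loss = loss_fn(self.lang_head(h), lang_labels)
        return claim_loss + lang_loss  # equal weighting, an assumption

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = JointClaimModel()
batch = tok(["Unemployment fell to 3.5% last quarter."], return_tensors="pt")
loss = model(batch["input_ids"], batch["attention_mask"],
             claim_labels=torch.tensor([1]), lang_labels=torch.tensor([0]))
loss.backward()
```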
- Accenture at CheckThat! 2021: Interesting claim identification and ranking with contextually sensitive lexical training data augmentation [0.0]
This paper discusses the approach used by the Accenture Team for CLEF2021 CheckThat! Lab, Task 1.
It identifies whether a claim made on social media would be interesting to a wide audience and should be fact-checked.
Twitter training and test data were provided in English, Arabic, Spanish, Turkish, and Bulgarian.
arXiv Detail & Related papers (2021-07-12T18:46:47Z)
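One plausible reading of "contextually sensitive lexical training data augmentation" is to mask a word in a training example and let a masked language model propose in-context substitutes. The sketch below shows that reading; it is not the team's exact recipe, and the model and masked word are illustrative.

```python
# Minimal sketch of contextually sensitive lexical data augmentation:
# mask one token and let a masked language model propose in-context
# replacements. A plausible reading of the approach, not the exact recipe.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

claim = "Unemployment fell to record lows last quarter."
masked = claim.replace("fell", fill.tokenizer.mask_token)  # mask one content word

for candidate in fill(masked, top_k=3):
    print(candidate["sequence"])  # augmented variants with substituted verbs
```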
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
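Dynamic Blocking, as described in the entry above, prevents verbatim copying: whenever the decoder has just emitted a token that also occurs in the source, the source's next token is blocked at the following step. The sketch below is a much-simplified greedy-decoding version of that idea; the model choice and prompt are illustrative, and the full recipe also involves task-adaptation and self-supervision.

```python
# Much-simplified sketch of Dynamic Blocking: when the decoder emits a
# token that occurs in the source, temporarily forbid the source's
# successor token at the next step, so the output cannot copy the
# source verbatim. Model and prompt are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

source = "The quick brown fox jumps over the lazy dog."
src_ids = tok.encode(source)
# Map each source token to the token that follows it in the source.
next_after = {src_ids[i]: src_ids[i + 1] for i in range(len(src_ids) - 1)}

ids = tok.encode(f"Paraphrase: {source}\nParaphrase:", return_tensors="pt")
blocked = None
with torch.no_grad():
    for _ in range(20):  # greedy decoding with dynamic blocking
        logits = model(ids).logits[0, -1]
        if blocked is not None:
            logits[blocked] = -float("inf")  # forbid the copied continuation
        nxt = int(torch.argmax(logits))
        blocked = next_after.get(nxt)  # block source's successor next step
        ids = torch.cat([ids, torch.tensor([[nxt]])], dim=1)

print(tok.decode(ids[0]))
```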