Related papers: AKCIT-FN at CheckThat! 2025: Switching Fine-Tuned SLMs and LLM Prompting for Multilingual Claim Normalization

URL: http://arxiv.org/abs/2509.11496v1
Date: Mon, 15 Sep 2025 01:19:49 GMT
Title: AKCIT-FN at CheckThat! 2025: Switching Fine-Tuned SLMs and LLM Prompting for Multilingual Claim Normalization
Authors: Fabrycio Leite Nakano Almada, Kauan Divino Pouso Mariano, Maykon Adriell Dutra, Victor Emanuel da Silva Monteiro, Juliana Resplande Sant'Anna Gomes, Arlindo Rodrigues Galvão Filho, Anderson da Silva Soares
Abstract summary: Claim normalization is a crucial step in automated fact-checking pipelines. This paper details our submission to the CLEF-2025 CheckThat! Task 2, which challenges systems to perform claim normalization across twenty languages.
Score: 0.5274891943689054
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Claim normalization, the transformation of informal social media posts into concise, self-contained statements, is a crucial step in automated fact-checking pipelines. This paper details our submission to the CLEF-2025 CheckThat! Task 2, which challenges systems to perform claim normalization across twenty languages, divided into thirteen supervised (high-resource) and seven zero-shot (no training data) tracks. Our approach, leveraging fine-tuned Small Language Models (SLMs) for supervised languages and Large Language Model (LLM) prompting for zero-shot scenarios, achieved podium positions (top three) in fifteen of the twenty languages. Notably, this included second-place rankings in eight languages, five of which were among the seven designated zero-shot languages, underscoring the effectiveness of our LLM-based zero-shot strategy. For Portuguese, our initial development language, our system achieved an average METEOR score of 0.5290, ranking third. All implementation artifacts, including inference, training, evaluation scripts, and prompt configurations, are publicly available at https://github.com/ju-resplande/checkthat2025_normalization.
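The switching idea in the abstract (fine-tuned SLMs for languages with training data, LLM prompting for zero-shot languages) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the language codes, function names, and the string-tagging stand-ins for the models are all hypothetical, and the abstract does not list which languages belong to which track.

```python
from typing import Callable, Set

# Illustrative language codes only; the abstract does not say which
# languages are in the supervised or zero-shot tracks.
SUPERVISED_LANGS: Set[str] = {"pt", "en"}


def normalize_with_slm(post: str, lang: str) -> str:
    """Hypothetical stand-in for a fine-tuned small language model."""
    return f"[slm:{lang}] {post.strip()}"


def normalize_with_llm_prompt(post: str, lang: str) -> str:
    """Hypothetical stand-in for zero-shot LLM prompting."""
    return f"[llm:{lang}] {post.strip()}"


def route(lang: str) -> Callable[[str, str], str]:
    """Use the fine-tuned SLM when training data exists, else prompt an LLM."""
    if lang in SUPERVISED_LANGS:
        return normalize_with_slm
    return normalize_with_llm_prompt


def normalize_claim(post: str, lang: str) -> str:
    """Normalize a social media post with the backend chosen for its language."""
    return route(lang)(post, lang)
```

Calling `normalize_claim(post, lang)` dispatches on the language alone, so adding a newly supervised language in this sketch only means adding its code to `SUPERVISED_LANGS`.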

Related papers

Anka: A Domain-Specific Language for Reliable LLM Code Generation [0.0]
Large Language Models (LLMs) exhibit systematic errors on complex, multi-step programming tasks. We introduce Anka, a domain-specific language (DSL) for data transformation pipelines designed with explicit, constrained syntax. Anka achieves 99.9% parse success and 95.8% overall task accuracy across 100 benchmark problems.
arXiv Detail & Related papers (2025-12-29T05:28:17Z)
A Multi-Language Object-Oriented Programming Benchmark for Large Language Models [61.267115598083315]
A survey of 35 existing benchmarks uncovers three major imbalances. 85.7% focus on a single programming language. 94.3% target only function-level or statement-level tasks. Over 80% include fewer than ten test cases on average.
arXiv Detail & Related papers (2025-09-30T11:30:08Z)
DS@GT at CheckThat! 2025: A Simple Retrieval-First, LLM-Backed Framework for Claim Normalization [41.99844472131922]
Claim normalization is an integral part of any automatic fact-check verification system. The CheckThat! 2025 Task 2 focuses specifically on claim normalization and spans 20 languages. Our proposed solution consists of a lightweight retrieval-first, LLM-backed pipeline.
arXiv Detail & Related papers (2025-08-24T15:19:58Z)
SLRTP2025 Sign Language Production Challenge: Methodology, Results, and Future Work [87.9341538630949]
The first Sign Language Production Challenge was held as part of the third SLRTP Workshop at CVPR 2025. The competition's aims are to evaluate architectures that translate from spoken language sentences to a sequence of skeleton poses. This paper presents the challenge design and the winning methodologies.
arXiv Detail & Related papers (2025-08-09T11:57:33Z)
PolyPrompt: Automating Knowledge Extraction from Multilingual Language Models with Dynamic Prompt Generation [0.0]
We introduce PolyPrompt, a novel, parameter-efficient framework for enhancing the multilingual capabilities of large language models (LLMs). Our method learns a set of trigger tokens for each language through a gradient-based search, identifying the input query's language and selecting the corresponding trigger tokens which are prepended to the prompt during inference. We perform experiments on two 1 billion parameter models, with evaluations on the global MMLU benchmark across fifteen typologically and resource diverse languages, demonstrating accuracy gains of 3.7%-19.9% compared to naive and translation-pipeline baselines.
arXiv Detail & Related papers (2025-02-27T04:41:22Z)
Franken-Adapter: Cross-Lingual Adaptation of LLMs by Embedding Surgery [31.516243610548635]
We present Franken-Adapter, a modular language adaptation approach for decoder-only Large Language Models. Our method begins by creating customized vocabularies for target languages and performing language adaptation through embedding tuning on multilingual data. Experiments on Gemma2 models with up to 27B parameters demonstrate improvements of up to 20% across 96 languages, spanning both discriminative and generative tasks.
arXiv Detail & Related papers (2025-02-12T00:38:11Z)
Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model [66.17354128553244]
Most Large Vision-Language Models (LVLMs) to date are trained predominantly on English data. We investigate how different training mixes tip the scale for different groups of languages. We train Centurio, a 100-language LVLM, offering state-of-the-art performance in an evaluation covering 14 tasks and 56 languages.
arXiv Detail & Related papers (2025-01-09T10:26:14Z)
Multi-IF: Benchmarking LLMs on Multi-Turn and Multilingual Instructions Following [51.18383180774354]
We introduce Multi-IF, a new benchmark designed to assess Large Language Models' proficiency in following multi-turn and multilingual instructions. Our evaluation of 14 state-of-the-art LLMs on Multi-IF reveals that it presents a significantly more challenging task than existing benchmarks. Languages with non-Latin scripts (Hindi, Russian, and Chinese) generally exhibit higher error rates, suggesting potential limitations in the models' multilingual capabilities.
arXiv Detail & Related papers (2024-10-21T00:59:47Z)
Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks. We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset. To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
MIA 2022 Shared Task: Evaluating Cross-lingual Open-Retrieval Question Answering for 16 Diverse Languages [54.002969723086075]
We evaluate cross-lingual open-retrieval question answering systems in 16 typologically diverse languages. The best system leveraging iteratively mined diverse negative examples achieves 32.2 F1, outperforming our baseline by 4.5 points. The second best system uses entity-aware contextualized representations for document retrieval, and achieves significant improvements in Tamil (20.8 F1), whereas most of the other systems yield nearly zero scores.
arXiv Detail & Related papers (2022-07-02T06:54:10Z)
ANDES at SemEval-2020 Task 12: A jointly-trained BERT multilingual model for offensive language detection [0.6445605125467572]
We jointly trained a single model by fine-tuning Multilingual BERT to tackle the task across all the proposed languages. Our single model had competitive results, with a performance close to top-performing systems.
arXiv Detail & Related papers (2020-08-13T16:07:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.