Annif at SemEval-2025 Task 5: Traditional XMTC augmented by LLMs
- URL: http://arxiv.org/abs/2504.19675v2
- Date: Thu, 21 Aug 2025 14:43:26 GMT
- Title: Annif at SemEval-2025 Task 5: Traditional XMTC augmented by LLMs
- Authors: Osma Suominen, Juho Inkinen, Mona Lehtinen,
- Abstract summary: This paper presents the Annif system in SemEval-2025 Task 5 (LLMs4Subjects), which focussed on subject indexing using large language models. Our approach combines traditional natural language processing and machine learning techniques.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents the Annif system in SemEval-2025 Task 5 (LLMs4Subjects), which focussed on subject indexing using large language models (LLMs). The task required creating subject predictions for bibliographic records from the bilingual TIBKAT database using the GND subject vocabulary. Our approach combines traditional natural language processing and machine learning techniques implemented in the Annif toolkit with innovative LLM-based methods for translation and synthetic data generation, and merging predictions from monolingual models. The system ranked first in the all-subjects category and second in the tib-core-subjects category in the quantitative evaluation, and fourth in qualitative evaluations. These findings demonstrate the potential of combining traditional XMTC algorithms with modern LLM techniques to improve the accuracy and efficiency of subject indexing in multilingual contexts.
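The abstract mentions merging predictions from monolingual models as one ingredient of the system. As a minimal sketch (not the authors' actual implementation; function and subject IDs are illustrative), merging can be done as a weighted mean of per-model subject scores:

```python
def merge_predictions(monolingual_scores, weights=None, top_k=5):
    """Merge subject scores from several monolingual models by weighted mean.

    monolingual_scores: one dict per model, mapping subject ID -> score.
    weights: optional per-model weights (defaults to uniform).
    Returns the top_k subjects sorted by merged score, highest first.
    """
    if weights is None:
        weights = [1.0] * len(monolingual_scores)
    merged = {}
    for scores, weight in zip(monolingual_scores, weights):
        for subject, score in scores.items():
            merged[subject] = merged.get(subject, 0.0) + weight * score
    total = sum(weights)
    ranked = sorted(
        ((subject, value / total) for subject, value in merged.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:top_k]
```

A subject predicted by both the English and the German model accumulates votes from each, so agreement between monolingual models pushes it up the merged ranking.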
Related papers
- Classifying German Language Proficiency Levels Using Large Language Models [0.24683296459020942]
This paper investigates the use of Large Language Models (LLMs) for automatically classifying German texts into different proficiency levels. To support robust training and evaluation, we construct a diverse dataset by combining multiple existing CEFR-annotated corpora with synthetic data. Our results show a consistent performance improvement over prior methods, highlighting the potential of LLMs for reliable and scalable CEFR classification.
arXiv Detail & Related papers (2025-12-06T16:15:45Z) - Annif at the GermEval-2025 LLMs4Subjects Task: Traditional XMTC Augmented by Efficient LLMs [0.03743146689655111]
This paper presents the Annif system in the LLMs4Subjects shared task (Subtask 2) at GermEval-2025. The task required creating subject predictions for records using large language models, with a special focus on computational efficiency.
arXiv Detail & Related papers (2025-08-21T14:04:20Z) - LecEval: An Automated Metric for Multimodal Knowledge Acquisition in Multimedia Learning [58.98865450345401]
We introduce LecEval, an automated metric grounded in Mayer's Cognitive Theory of Multimedia Learning. LecEval assesses effectiveness using four rubrics: Content Relevance (CR), Expressive Clarity (EC), Logical Structure (LS), and Audience Engagement (AE). We curate a large-scale dataset of over 2,000 slides from more than 50 online course videos, annotated with fine-grained human ratings.
arXiv Detail & Related papers (2025-05-04T12:06:47Z) - DNB-AI-Project at SemEval-2025 Task 5: An LLM-Ensemble Approach for Automated Subject Indexing [0.0]
Our system relies on prompting a selection of LLMs with varying examples of intellectually annotated records.
We map the generated keywords to the target vocabulary, aggregate the resulting subject terms to an ensemble vote and rank them as to their relevance to the record.
Our system is fourth in the quantitative ranking in the all-subjects track, but the best result in the qualitative ranking conducted by subject indexing experts.
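The pipeline described above (map generated keywords to the vocabulary, aggregate into an ensemble vote, rank by relevance) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the DNB-AI-Project's actual code; the vocabulary map and one-vote-per-run rule are assumptions:

```python
from collections import Counter

def ensemble_vote(llm_keyword_lists, vocabulary_map, top_k=5):
    """Aggregate keywords from several LLM runs into a ranked subject list.

    llm_keyword_lists: one list of free-text keywords per LLM/prompt run.
    vocabulary_map: maps a normalized keyword to a controlled subject term.
    Keywords with no vocabulary match are dropped; subjects are ranked by
    how many runs voted for them.
    """
    votes = Counter()
    for keywords in llm_keyword_lists:
        seen = set()
        for keyword in keywords:
            subject = vocabulary_map.get(keyword.strip().lower())
            if subject and subject not in seen:
                seen.add(subject)  # at most one vote per run per subject
                votes[subject] += 1
    return [subject for subject, _ in votes.most_common(top_k)]
```

Subjects proposed independently by several LLMs collect more votes, which serves as a simple relevance signal for the final ranking.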
arXiv Detail & Related papers (2025-04-30T12:47:09Z) - SemEval-2025 Task 5: LLMs4Subjects -- LLM-based Automated Subject Tagging for a National Technical Library's Open-Access Catalog [0.0]
We present SemEval-2025 Task 5: LLMs4Subjects, a shared task on automated subject tagging for scientific and technical records in English and German using the GND taxonomy. Participants developed systems to recommend top-k subjects, evaluated through quantitative metrics (precision, recall, F1-score) and qualitative assessments by subject specialists.
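The quantitative metrics named above (precision, recall, F1 over top-k recommended subjects) can be computed per record as in this minimal sketch; the exact averaging and cutoffs used by the task organizers may differ:

```python
def topk_prf(predicted, gold, k=5):
    """Precision, recall and F1 for the top-k predicted subjects of one record.

    predicted: ranked list of subject IDs, best first.
    gold: set of true subject IDs assigned by human indexers.
    """
    topk = predicted[:k]
    hits = sum(1 for subject in topk if subject in gold)
    precision = hits / len(topk) if topk else 0.0
    recall = hits / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

System-level scores are then typically macro-averages of these per-record values over the test collection.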
arXiv Detail & Related papers (2025-04-09T18:26:46Z) - Training Large Recommendation Models via Graph-Language Token Alignment [53.3142545812349]
We propose GLTA, a novel framework to train Large Recommendation models via Graph-Language Token Alignment. By aligning item and user nodes from the interaction graph with pretrained LLM tokens, GLTA effectively leverages the reasoning abilities of LLMs. Furthermore, we introduce Graph-Language Logits Matching (GLLM) to optimize token alignment for end-to-end item prediction.
arXiv Detail & Related papers (2025-02-26T02:19:10Z) - MMTEB: Massive Multilingual Text Embedding Benchmark [85.18187649328792]
We introduce the Massive Multilingual Text Embedding Benchmark (MMTEB). MMTEB covers over 500 quality-controlled evaluation tasks across 250+ languages. We develop several highly multilingual benchmarks, which we use to evaluate a representative set of models.
arXiv Detail & Related papers (2025-02-19T10:13:43Z) - Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback [50.84142264245052]
This work introduces the Align-SLM framework to enhance the semantic understanding of textless Spoken Language Models (SLMs). Our approach generates multiple speech continuations from a given prompt and uses semantic metrics to create preference data for Direct Preference Optimization (DPO).
We evaluate the framework using ZeroSpeech 2021 benchmarks for lexical and syntactic modeling, the spoken version of the StoryCloze dataset for semantic coherence, and other speech generation metrics, including the GPT-4o score and human evaluation.
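The preference-data step described above (generate multiple continuations, score them semantically, pair the best against the worst for DPO) can be sketched as follows. This is a hypothetical illustration, not the Align-SLM code; the pairing rule is an assumption:

```python
def build_dpo_pair(prompt, continuations, semantic_score):
    """Turn scored continuations into one (prompt, chosen, rejected) triple.

    continuations: candidate continuations sampled for the prompt.
    semantic_score: callable returning a higher-is-better semantic score.
    The best- and worst-scoring candidates form the preference pair.
    """
    ranked = sorted(continuations, key=semantic_score, reverse=True)
    return {"prompt": prompt, "chosen": ranked[0], "rejected": ranked[-1]}
```

Collections of such triples are exactly the input format DPO-style training expects.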
arXiv Detail & Related papers (2024-11-04T06:07:53Z) - DynamicNER: A Dynamic, Multilingual, and Fine-Grained Dataset for LLM-based Named Entity Recognition [53.019885776033824]
We propose DynamicNER, the first NER dataset designed for Large Language Model (LLM)-based methods with dynamic categorization. The dataset is also multilingual and multi-granular, covering 8 languages and 155 entity types, with corpora spanning a diverse range of domains. Experiments show that DynamicNER serves as a robust and effective benchmark for LLM-based NER methods.
arXiv Detail & Related papers (2024-09-17T09:32:12Z) - TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z) - LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation [67.24113079928668]
We present LexMatcher, a method for data curation driven by the coverage of senses found in bilingual dictionaries.
Our approach outperforms the established baselines on the WMT2022 test sets.
arXiv Detail & Related papers (2024-06-03T15:30:36Z) - Adaptable and Reliable Text Classification using Large Language Models [7.962669028039958]
This paper introduces an adaptable and reliable text classification paradigm, which leverages Large Language Models (LLMs). We evaluated the performance of several LLMs, machine learning algorithms, and neural network-based architectures on four diverse datasets. It is shown that the system's performance can be further enhanced through few-shot or fine-tuning strategies.
arXiv Detail & Related papers (2024-05-17T04:05:05Z) - A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks [30.54635848057259]
This paper conducts a comprehensive evaluation of well-known and high-performing large language models (LLMs).
We select English and Chinese datasets encompassing Dialogue Generation and Text Summarization.
Our study reports both automatic results, accompanied by a detailed analysis.
arXiv Detail & Related papers (2024-05-16T16:56:54Z) - Data-Augmentation-Based Dialectal Adaptation for LLMs [26.72394783468532]
This report presents GMUNLP's participation to the Dialect-Copa shared task at VarDial 2024.
The task focuses on evaluating the commonsense reasoning capabilities of large language models (LLMs) on South Slavic micro-dialects.
We propose an approach that combines the strengths of different types of language models and leverages data augmentation techniques to improve task performance.
arXiv Detail & Related papers (2024-04-11T19:15:32Z) - On Learning to Summarize with Large Language Models as References [101.79795027550959]
Large language models (LLMs) are favored by human annotators over the original reference summaries in commonly used summarization datasets.
We study an LLM-as-reference learning setting for smaller text summarization models to investigate whether their performance can be substantially improved.
arXiv Detail & Related papers (2023-05-23T16:56:04Z) - Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation [127.81351683335143]
Cross-lingual pretraining requires models to align the lexical- and high-level representations of the two languages.
Previous research has shown that this is because the representations are not sufficiently aligned.
In this paper, we enhance the bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings.
arXiv Detail & Related papers (2021-03-18T21:17:58Z) - Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models.
Our model achieves an improvement of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 scores over state-of-the-art results.
arXiv Detail & Related papers (2020-10-18T00:21:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.