Fugu-MT paper translation (abstract): BRoverbs -- Measuring how much LLMs understand Portuguese proverbs

Paper abstract: BRoverbs -- Measuring how much LLMs understand Portuguese proverbs

arxiv url: http://arxiv.org/abs/2509.08960v1
Date: Wed, 10 Sep 2025 19:47:46 GMT
Status: translation complete
System update date: 2025-09-12 16:52:24.125727
Title: BRoverbs -- Measuring how much LLMs understand Portuguese proverbs
Title (reference translation): BRoverbs -- Measuring how well LLMs understand Portuguese proverbs
Authors: Thales Sales Almeida, Giovana Kerche Bonás, João Guilherme Alves Santos,
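Abstract summary: Large language models (LLMs) show large performance variations depending on the linguistic and cultural context in which they are applied. This disparity points to the need for mature evaluation frameworks that can assess their capabilities in specific regional settings. For Portuguese, existing evaluations are limited and often rely on translated datasets that do not fully capture linguistic nuances or cultural references.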
Reference score (independently computed attention metric): 3.364554138758565
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) exhibit significant performance variations depending on the linguistic and cultural context in which they are applied. This disparity signals the necessity of mature evaluation frameworks that can assess their capabilities in specific regional settings. In the case of Portuguese, existing evaluations remain limited, often relying on translated datasets that may not fully capture linguistic nuances or cultural references. Meanwhile, native Portuguese-language datasets predominantly focus on structured national exams or sentiment analysis of social media interactions, leaving gaps in evaluating broader linguistic understanding. To address this limitation, we introduce BRoverbs, a dataset specifically designed to assess LLM performance through Brazilian proverbs. Proverbs serve as a rich linguistic resource, encapsulating cultural wisdom, figurative expressions, and complex syntactic structures that challenge the model comprehension of regional expressions. BRoverbs aims to provide a new evaluation tool for Portuguese-language LLMs, contributing to advancing regionally informed benchmarking. The benchmark is available at https://huggingface.co/datasets/Tropic-AI/BRoverbs.

Related papers