Fugu-MT Paper Translation (Abstract): DIWALI - Diversity and Inclusivity aWare cuLture specific Items for India: Dataset and Assessment of LLMs for Cultural Text Adaptation in Indian Context

Paper overview: DIWALI - Diversity and Inclusivity aWare cuLture specific Items for India: Dataset and Assessment of LLMs for Cultural Text Adaptation in Indian Context

arxiv url: http://arxiv.org/abs/2509.17399v1
Date: Mon, 22 Sep 2025 06:58:02 GMT
Status: Translation complete
Last updated in system: 2025-09-23 18:58:16.246335
Title: DIWALI - Diversity and Inclusivity aWare cuLture specific Items for India: Dataset and Assessment of LLMs for Cultural Text Adaptation in Indian Context
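Title (reference translation): DIWALI - Diversity- and Inclusivity-Aware Culture-Specific Items for India: Dataset and Assessment of LLMs for Cultural Text Adaptation in the Indian Context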
Authors: Pramit Sahoo, Maharaj Brahma, Maunendra Sankar Desarkar
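Abstract summary: Large language models (LLMs) are widely used across a variety of tasks and applications. They have been shown to lack cultural alignment owing to a lack of cultural knowledge and competence. The paper introduces a new CSI dataset for Indian culture.
Reference score (independently computed attention score): 7.582991335459645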
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) are widely used in various tasks and applications. However, despite their wide capabilities, they are shown to lack cultural alignment (Ryan et al., 2024; AlKhamissi et al., 2024) and produce biased generations (Naous et al., 2024) due to a lack of cultural knowledge and competence. Evaluation of LLMs for cultural awareness and alignment is particularly challenging due to the lack of proper evaluation metrics and unavailability of culturally grounded datasets representing the vast complexity of cultures at the regional and sub-regional levels. Existing datasets for culture specific items (CSIs) focus primarily on concepts at the regional level and may contain false positives. To address this issue, we introduce a novel CSI dataset for Indian culture, belonging to 17 cultural facets. The dataset comprises ~8k cultural concepts from 36 sub-regions. To measure the cultural competence of LLMs on a cultural text adaptation task, we evaluate the adaptations using the CSIs created, LLM as Judge, and human evaluations from diverse socio-demographic regions. Furthermore, we perform quantitative analysis demonstrating selective sub-regional coverage and surface-level adaptations across all considered LLMs. Our dataset is available at https://huggingface.co/datasets/nlip/DIWALI, the project webpage is at https://nlip-lab.github.io/nlip/publications/diwali/, and our codebase with model outputs can be found at https://github.com/pramitsahoo/culture-evaluation.
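Abstract (reference translation): Large language models (LLMs) are widely used across a variety of tasks and applications. Despite their broad capabilities, however, they have been shown to lack cultural alignment and to produce biased generations owing to a lack of cultural knowledge and competence. Evaluating LLMs for cultural awareness and alignment is particularly difficult because proper evaluation metrics are lacking and culturally grounded datasets that represent the complexity of cultures at the regional and sub-regional levels are unavailable. Existing datasets of culture-specific items (CSIs) focus mainly on concepts at the regional level and may contain false positives. To address this issue, we introduce a new CSI dataset for Indian culture. The dataset consists of ~8k cultural concepts from 36 sub-regions. To measure the cultural competence of LLMs on a cultural text adaptation task, we evaluate the adaptations using the created CSIs, LLM as Judge, and human evaluations from diverse socio-demographic regions. In addition, a quantitative analysis shows selective sub-regional coverage and surface-level adaptations across the LLMs. The dataset is available at https://huggingface.co/datasets/nlip/DIWALI, and the project webpage is at https://nlip-lab.github.io/nlip/publications/diwali/.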
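
As a rough illustration of what evaluating adaptations "using the CSIs created" could look like, the Python sketch below counts how many items from a small CSI lexicon appear in an adapted text, grouped by sub-region. The lexicon entries, the function name `csi_coverage`, and the whole-word matching rule are all assumptions made for illustration; the abstract does not describe the paper's actual scoring procedure or the dataset's schema.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon mapping a CSI to a sub-region.
# These entries are illustrative only and are not taken from the DIWALI dataset.
CSI_LEXICON = {
    "dhokla": "Gujarat",
    "rasgulla": "West Bengal",
    "idli": "Tamil Nadu",
    "bihu": "Assam",
}


def csi_coverage(text, lexicon):
    """Count whole-word, case-insensitive CSI matches per sub-region."""
    counts = Counter()
    lowered = text.lower()
    for item, region in lexicon.items():
        # \b anchors keep "idli" from matching inside a longer word.
        pattern = r"\b" + re.escape(item.lower()) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            counts[region] += hits
    return counts


adapted = "After the Bihu celebrations, she served idli and more idli with chutney."
coverage = csi_coverage(adapted, CSI_LEXICON)
print(dict(coverage))
```

A per-region count like this makes skewed coverage visible: sub-regions whose items never appear in any adapted text simply have no entry in the result. Exact string matching misses spelling variants and transliterations, so a fuller evaluation would need some normalisation step.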

Related papers

LLMs as Cultural Archives: Cultural Commonsense Knowledge Graph Extraction [57.23766971626989]
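Large language models (LLMs) encode rich cultural knowledge learned from diverse web-scale data. The paper proposes an iterative, prompt-based framework for constructing a Cultural Commonsense Knowledge Graph (CCKG). Cultural knowledge graphs are better realized in English, even when the target culture is not English-speaking.
Paper reference translation (metadata) (2026-01-25T20:05:04Z)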
Do Large Language Models Truly Understand Cross-cultural Differences? [53.481048019144644]
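We develop a scenario-based benchmark for evaluating cross-cultural understanding and reasoning in large language models. Grounded in cultural theory, we classify cross-cultural capabilities into nine dimensions. The dataset supports continuous expansion, and experiments confirm its transferability to other languages.
Paper reference translation (metadata) (2025-12-08T01:21:58Z)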
Cross-Cultural Transfer of Commonsense Reasoning in LLMs: Evidence from the Arab World [68.19795061447044]
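This paper investigates cross-cultural transfer of commonsense reasoning in the Arab world. Using a culturally grounded commonsense reasoning dataset covering 13 Arab countries, we evaluate lightweight alignment methods. The results show that as few as 12 culture-specific examples from one country can improve performance on other countries by 10% on average.
Paper reference translation (metadata) (2025-09-23T17:24:14Z)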
CultureScope: A Dimensional Lens for Probing Cultural Understanding in LLMs [57.653830744706305]
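CultureScope is the most comprehensive evaluation framework to date for assessing cultural understanding in large language models. Inspired by the cultural iceberg theory, we design a new dimensional schema for classifying cultural knowledge. Experimental results suggest that it can evaluate cultural understanding effectively.
Paper reference translation (metadata) (2025-09-19T17:47:48Z)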
Fluent but Foreign: Even Regional LLMs Lack Cultural Alignment [24.871503011248777]
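Large language models (LLMs) are used worldwide, yet they exhibit a Western cultural bias. We evaluate six Indic and six global LLMs along two dimensions (values and practices). Across tasks, Indic models do not align better with Indian norms than global models do.
Paper reference translation (metadata) (2025-05-25T01:59:23Z)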
From Surveys to Narratives: Rethinking Cultural Value Adaptation in LLMs [62.9861554207279]
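Adapting cultural values in large language models (LLMs) is a major challenge. Prior work mainly uses World Values Survey (WVS) data to align LLMs with diverse cultural values. We investigate WVS-based training for cultural value adaptation and find that relying on survey data alone can homogenize cultural norms and interfere with factual knowledge.
Paper reference translation (metadata) (2025-05-22T09:00:01Z)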
CulturePark: Boosting Cross-cultural Understanding in Large Language Models [63.452948673344395]
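This paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection. It generates high-quality cross-cultural dialogues that encapsulate human beliefs, norms, and customs. We evaluate the resulting models on three downstream tasks: content moderation, cultural alignment, and cultural education.
Paper reference translation (metadata) (2024-05-24T01:49:02Z)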
Cultural Alignment in Large Language Models: An Explanatory Analysis Based on Hofstede's Cultural Dimensions [10.415002561977655]
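This study proposes the Cultural Alignment Test (Hofstede's CAT), which quantifies cultural alignment using Hofstede's cultural dimensions framework. We quantitatively evaluate large language models (LLMs) against the cultural dimensions of regions such as the United States, China, and Arab countries. The results quantify the cultural alignment of LLMs and reveal differences among LLMs along the explanatory cultural dimensions.
Paper reference translation (metadata) (2023-08-25T14:50:13Z)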

The related papers list is generated automatically from the titles and abstracts of papers on this site.
