Fugu-MT Paper Translation (Summary): When Meaning Travels: A Granular Lens on Hybrid-MoE's Role in Idiomatic Understanding for Language Models

Paper summary: When Meaning Travels: A Granular Lens on Hybrid-MoE's Role in Idiomatic Understanding for Language Models

arxiv url: http://arxiv.org/abs/2606.01671v1
Date: Mon, 01 Jun 2026 04:28:44 GMT
Status: Translation complete
Last updated in system: 2026-06-02 21:34:29.980656
Title: When Meaning Travels: A Granular Lens on Hybrid-MoE's Role in Idiomatic Understanding for Language Models
Title (reference translation): When Meaning Travels: A Granular Lens on the Role of Hybrid-MoE in Idiomatic Understanding for Language Models
Authors: Sarmistha Das, Vaibhav Vishal, Shreyas Guha, Amaan Ali, Kitsuchart Pasupa, Sriparna Saha
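Abstract summary: This paper examines how figurative and cultural meaning can be retained in low-resource Southeast Asian languages such as Hindi, Bengali, and Thai. To address this complexity, the authors present Varnika, a reconstructed multimodal corpus comprising 3,533 multilingual idioms.
Reference score (attention score computed by this site): 11.355899871129559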
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In the contemporary epoch of multilingual education, learning idioms provides a fascinating gateway towards creativity, cultural values, historical context, and diverse perspectives inherent to various linguistic traditions. This paper showcases the navigation of retaining figurative and cultural semantics in low-resource Southeast Asian languages such as Hindi, Bengali, and Thai, where culturally rich idioms pose significant obstacles for computational modeling and cross-linguistic transfer due to their deep metaphorical complexity. To tackle such complexity, we present Varnika, a reconstructed multimodal idiom corpus comprising 3,533 multilingual idioms, enriched with seven idiomatic tones aligned with both textual and visual representations. Additionally, to infer informative idiomatic understanding, we introduce a Hybrid Mixture-of-Experts (HybridMoE) framework that embeds multiple idiomatic expert opinions while mitigating expert sparsity by integrating outputs from both selected and unselected experts through controlled hybridization, further augmented with Idiomatic Property Signals via masked multimodal embeddings. To analyze the performance across multiple dimensions, we propose the IDIO-TONE and Idiomatic Validation Score, a three-stage evaluation pipeline measuring (i) literal translation fidelity, (ii) visual-semantic alignment, and (iii) idiomatic meaning retention. Empirical evaluations highlight that HybridMoE achieves 5-6% performance gains across advanced vision language models, demonstrating improved representation of figurative language and culturally embedded meaning in multilingual multimodal settings.
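Abstract (reference translation): In the contemporary era of multilingual education, learning idioms provides a fascinating gateway to creativity, cultural values, historical context, and the diverse perspectives inherent to various linguistic traditions. This paper examines how figurative and cultural meaning can be retained in low-resource Southeast Asian languages such as Hindi, Bengali, and Thai. To this end, it presents Varnika, a reconstructed multimodal idiom corpus comprising 3,533 multilingual idioms. Furthermore, to infer informative idiomatic understanding, it introduces a Hybrid Mixture-of-Experts (HybridMoE) framework that embeds multiple idiomatic expert opinions. To analyze performance across multiple dimensions, it proposes IDIO-TONE and the Idiomatic Validation Score, a three-stage evaluation pipeline measuring (i) literal translation fidelity, (ii) visual-semantic alignment, and (iii) idiomatic meaning retention. Empirical evaluations show that HybridMoE achieves 5-6% performance gains across advanced vision-language models, demonstrating improved representation of figurative language and culturally embedded meaning in multilingual multimodal settings.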
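
The abstract describes HybridMoE only at a high level: outputs from both selected and unselected experts are combined through "controlled hybridization" to mitigate expert sparsity. It gives no equations, so the following is a minimal sketch of one way such a blend could look, not the authors' method. The softmax gating, the top-k selection rule, the function name `hybrid_moe`, and the mixing weight `alpha` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw gate scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_moe(expert_outputs, gate_scores, top_k=2, alpha=0.1):
    """Blend selected (top-k) experts with a down-weighted unselected pool.

    expert_outputs: one equal-length vector per expert.
    gate_scores:    one raw routing score per expert.
    alpha:          hypothetical weight given to the unselected experts;
                    alpha = 0 reduces to ordinary sparse top-k routing.
    """
    n = len(expert_outputs)
    if not 0 < top_k <= n:
        raise ValueError("top_k must be between 1 and the number of experts")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    probs = softmax(gate_scores)
    order = sorted(range(n), key=lambda i: probs[i], reverse=True)
    selected, unselected = order[:top_k], order[top_k:]
    dim = len(expert_outputs[0])

    def gated_mean(indices):
        # Gate-probability-weighted average of the given experts' outputs.
        mass = sum(probs[i] for i in indices)
        return [
            sum(probs[i] * expert_outputs[i][d] for i in indices) / mass
            for d in range(dim)
        ]

    selected_mix = gated_mean(selected)
    if not unselected:
        return selected_mix
    unselected_mix = gated_mean(unselected)
    # Assumed form of "controlled hybridization": a convex combination.
    return [
        (1.0 - alpha) * s + alpha * u
        for s, u in zip(selected_mix, unselected_mix)
    ]
```

Calling `hybrid_moe(outputs, scores, top_k=1, alpha=0.0)` returns the top-scoring expert's output unchanged, while a larger `alpha` lets the experts that were not routed to contribute proportionally more. A learned or input-dependent `alpha` would be an equally plausible reading of the abstract.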

Related papers

A Parallel Cross-Lingual Benchmark for Multimodal Idiomaticity Understanding [15.171586338601522]
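Potentially idiomatic expressions (PIEs) construe meanings that are inherently tied to the everyday experience of a language community. XMPIE is a parallel multilingual and multimodal dataset of potentially idiomatic expressions.
Paper reference translation (metadata) (2026-01-13T15:20:28Z)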
M3DR: Towards Universal Multilingual Multimodal Document Retrieval [0.0]
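M3DR (Multilingual Multimodal Document Retrieval) is a framework for bridging the gap across languages. It generalizes across different vision-language architectures and model sizes, enabling robust cross-lingual and cross-modal alignment. The models, NetraEmbed and ColNetraEmbed, achieve state-of-the-art performance with a 150% relative improvement in cross-lingual retrieval.
Paper reference translation (metadata) (2025-12-03T07:17:59Z)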
MMA-ASIA: A Multilingual and Multimodal Alignment Framework for Culturally-Grounded Evaluation [91.22008265721952]
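MMA-ASIA centers on a human-curated, multilingual, and multimodal benchmark covering 8 Asian countries and 10 languages. It is the first dataset aligned at the input level across three modalities: text, image (visual question answering), and speech. The authors propose a five-dimensional evaluation protocol that assesses (i) cultural-awareness disparities across countries, (ii) cross-lingual consistency, (iii) cross-modal consistency, (iv) cultural knowledge generalization, and (v) grounding validity.
Paper reference translation (metadata) (2025-10-07T14:12:12Z)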
Decoding Memes: Benchmarking Narrative Role Classification across Multilingual and Multimodal Models [26.91963265869296]
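This work investigates the task of identifying narrative roles in Internet memes. It builds on an annotated dataset that was originally skewed toward the 'Other' class. Comprehensive lexical and structural analyses highlight the nuanced, culture-specific, and context-rich language used in real memes.
Paper reference translation (metadata) (2025-06-29T07:12:11Z)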
JiraiBench: A Bilingual Benchmark for Evaluating Large Language Models' Detection of Human Self-Destructive Behavior Content in Jirai Community [9.492476871323763]
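This paper introduces JiraiBench, the first bilingual benchmark for evaluating how effectively large language models detect self-destructive content. It focuses on the transnational Jirai ("landmine") online subculture, which encompasses multiple forms of self-destructive behavior, including drug overdose, eating disorders, and self-harm. The dataset consists of 10,419 Chinese posts and 5,000 Japanese posts with multidimensional annotations along three behavioral categories.
Paper reference translation (metadata) (2025-03-27T16:48:58Z)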
Multi-lingual and Multi-cultural Figurative Language Understanding [69.47641938200817]
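Figurative language permeates human communication but is relatively understudied in NLP. We create a dataset for seven diverse languages: Hindi, Indonesian, Javanese, Kannada, Sundanese, Swahili, and Yoruba. The dataset reveals that each language relies on cultural and regional concepts for its figurative expressions, with the highest overlap between languages originating from the same region. All languages show a significant deficiency compared to English, with variations in performance reflecting the availability of pre-training and fine-tuning data.
Paper reference translation (metadata) (2023-05-25T15:30:31Z)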
LVP-M3: Language-aware Visual Prompt for Multilingual Multimodal Machine Translation [94.33019040320507]
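Multimodal machine translation (MMT) focuses on enhancing text-only translation with visual features. Recent approaches still struggle with having to train a separate model for each language pair. We propose the Multilingual MMT task by establishing two Multilingual MMT benchmark datasets covering seven languages.
Paper reference translation (metadata) (2022-10-19T12:21:39Z)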
Revamping Multilingual Agreement Bidirectionally via Switched Back-translation for Multilingual Neural Machine Translation [107.83158521848372]
Multilingual agreement (MA) has shown its importance for multilingual neural machine translation (MNMT). Bidirectional Multilingual Agreement via Switched Back-translation (BMA-SBT) is a novel and universal multilingual agreement framework for fine-tuning pre-trained MNMT models.
Paper reference translation (metadata) (2022-09-28T09:14:58Z)
AM2iCo: Evaluating Word Meaning in Context across Low-ResourceLanguages with Adversarial Examples [51.048234591165155]
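This paper proposes AM2iCo, Adversarial and Multilingual Meaning in Context. It aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts. The results reveal that current SotA pre-trained encoders lag substantially behind human performance.
Paper reference translation (metadata) (2021-04-17T20:23:45Z)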
M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training [119.16007395162431]
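M3P is a multilingual multimodal pre-trained model that combines multilingual pre-training with multimodal pre-training. We show that M3P achieves comparable results for English.
Paper reference translation (metadata) (2020-06-04T03:54:29Z)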

The related-papers list is generated automatically from the titles and abstracts of papers on this site.

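This page shows information for the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from its use.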