Fugu-MT paper translation (abstract): Sentiment Classification of Gaza War Headlines: A Comparative Analysis of Large Language Models and Arabic Fine-Tuned BERT Models

Paper overview: Sentiment Classification of Gaza War Headlines: A Comparative Analysis of Large Language Models and Arabic Fine-Tuned BERT Models

arxiv url: http://arxiv.org/abs/2604.08566v1
Date: Wed, 18 Mar 2026 16:19:51 GMT
Status: translation complete
Last updated in system: 2026-04-19 19:09:11.427695
Title: Sentiment Classification of Gaza War Headlines: A Comparative Analysis of Large Language Models and Arabic Fine-Tuned BERT Models
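Title (reference translation): Sentiment Classification of Gaza War Headlines: A Comparative Analysis of Large Language Models and Arabic Fine-Tuned BERT Models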
Authors: Amr Eleraqi, Hager H. Mustafa, Abdul Hadi N. Ahmed
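Abstract summary: This study examines how different artificial intelligence architectures interpret sentiment in conflict-related media discourse. Drawing on a corpus of 10,990 Arabic news headlines (Eleraqi 2026), it conducts a comparative analysis of three large language models and six fine-tuned Arabic BERT models. The results reveal pronounced and non-random divergence in sentiment distributions.
Reference score (independently computed attention score): 0.0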
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: This study examines how different artificial intelligence architectures interpret sentiment in conflict-related media discourse, using the 2023 Gaza War as a case study. Drawing on a corpus of 10,990 Arabic news headlines (Eleraqi 2026), the research conducts a comparative analysis between three large language models and six fine-tuned Arabic BERT models. Rather than evaluating accuracy against a single human-annotated gold standard, the study adopts an epistemological approach that treats sentiment classification as an interpretive act produced by model architectures. To quantify systematic differences across models, the analysis employs information-theoretic and distributional metrics, including Shannon Entropy, Jensen-Shannon Distance, and a Variance Score measuring deviation from aggregate model behavior. The results reveal pronounced and non-random divergence in sentiment distributions. Fine-tuned BERT models, particularly MARBERT, exhibit a strong bias toward neutral classifications, while LLMs consistently amplify negative sentiment, with LLaMA-3.1-8B showing near-total collapse into negativity. Frame-conditioned analysis further demonstrates that GPT-4.1 adjusts sentiment judgments in line with narrative frames (e.g., humanitarian, legal, security), whereas other LLMs display limited contextual modulation. These findings suggest that the choice of model constitutes a choice of interpretive lens, shaping how conflict narratives are algorithmically framed and emotionally evaluated. The study contributes to media studies and computational social science by foregrounding algorithmic discrepancy as an object of analysis and by highlighting the risks of treating automated sentiment outputs as neutral or interchangeable measures of media tone in contexts of war and crisis.
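Abstract (reference translation): Using the 2023 Gaza War as a case study, this study examines how different artificial intelligence architectures interpret sentiment in conflict-related media. Based on 10,990 Arabic news headlines (Eleraqi 2026), it carries out a comparative analysis of three large language models and six fine-tuned Arabic BERT models. Instead of evaluating accuracy against a single human-annotated gold standard, the study takes an epistemological approach that treats sentiment classification as an interpretive act produced by model architectures. To quantify systematic differences between models, the analysis uses information-theoretic and distributional metrics, including Shannon Entropy, Jensen-Shannon Distance, and a Variance Score that measures deviation from aggregate model behavior. The results reveal pronounced and non-random divergence in sentiment distributions. Fine-tuned BERT models, MARBERT in particular, show a strong bias toward neutral classifications, while LLMs consistently amplify negative sentiment, with LLaMA-3.1-8B collapsing almost entirely into negativity. Frame-conditioned analysis shows that GPT-4.1 adjusts its sentiment judgments to narrative frames (e.g., humanitarian, legal, security), whereas the other LLMs show limited contextual modulation. These results suggest that choosing a model amounts to choosing an interpretive lens, which shapes how conflict narratives are algorithmically framed and emotionally evaluated. The study contributes to media studies and computational social science by foregrounding algorithmic discrepancy as an object of analysis and by highlighting the risk of treating automated sentiment outputs as neutral or interchangeable measures of media tone in contexts of war and crisis.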
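The abstract names three distributional metrics but does not define them here. As a rough illustration only, the following Python sketch computes Shannon entropy and Jensen-Shannon distance for per-model sentiment distributions over three labels. The `deviation_from_aggregate` function is a hypothetical stand-in for the paper's Variance Score, whose exact definition is not given in the abstract; the model names and numbers in `example` are made up for demonstration and are not results from the paper.

```python
import math

# Assumed label order for every distribution below (an illustration choice).
LABELS = ("negative", "neutral", "positive")


def shannon_entropy(p):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the base-2 JS divergence.

    With base-2 logarithms the result lies in [0, 1].
    """
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a_i == 0 contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(jsd, 0.0))  # guard against tiny negative rounding


def deviation_from_aggregate(dists):
    """Hypothetical stand-in for a 'Variance Score' (an assumption, not the
    paper's definition): mean squared deviation of each model's distribution
    from the unweighted mean distribution across all models.
    """
    names = list(dists)
    k = len(dists[names[0]])
    mean = [sum(dists[n][i] for n in names) / len(names) for i in range(k)]
    return {
        n: sum((dists[n][i] - mean[i]) ** 2 for i in range(k)) / k
        for n in names
    }


# Made-up distributions over LABELS, for demonstration only.
example = {
    "model_a": [0.90, 0.08, 0.02],
    "model_b": [0.20, 0.70, 0.10],
    "model_c": [0.50, 0.30, 0.20],
}

if __name__ == "__main__":
    for name, dist in example.items():
        print(name, "entropy (bits):", round(shannon_entropy(dist), 3))
    print("JS distance a vs b:",
          round(js_distance(example["model_a"], example["model_b"]), 3))
    print("deviation:", deviation_from_aggregate(example))
```

Under these assumptions, a model that puts nearly all mass on one label has entropy close to 0 bits, while a uniform spread over three labels gives the maximum of log2(3) bits; the Jensen-Shannon distance is 0 for identical distributions and 1 for distributions with disjoint support.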


Related papers