Fugu-MT Paper Translation (Summary): Missing the Margins: A Systematic Literature Review on the Demographic Representativeness of LLMs

Paper summary: Missing the Margins: A Systematic Literature Review on the Demographic Representativeness of LLMs

arxiv url: http://arxiv.org/abs/2511.01864v1
Date: Wed, 15 Oct 2025 09:11:13 GMT
Status: Translation complete
Last updated in system: 2025-11-09 16:58:40.040437
Title: Missing the Margins: A Systematic Literature Review on the Demographic Representativeness of LLMs
Authors: Indira Sen, Marlene Lutz, Elisa Rogers, David Garcia, Markus Strohmaier
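Abstract summary: The authors review 211 papers on the demographic representativeness of large language models (LLMs). 29% of the studies report positive conclusions about the representativeness of LLMs, but 30% of these do not evaluate LLMs across multiple categories. More than a third of the papers do not define a target population.
Reference score (attention measure computed independently by this site): 7.864491832722478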
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Many applications of Large Language Models (LLMs) require them to either simulate people or offer personalized functionality, making the demographic representativeness of LLMs crucial for equitable utility. At the same time, we know little about the extent to which these models actually reflect the demographic attributes and behaviors of certain groups or populations, with conflicting findings in empirical research. To shed light on this debate, we review 211 papers on the demographic representativeness of LLMs. We find that while 29% of the studies report positive conclusions on the representativeness of LLMs, 30% of these do not evaluate LLMs across multiple demographic categories or within demographic subcategories. Another 35% and 47% of the papers concluding positively fail to specify these subcategories altogether for gender and race, respectively. Of the articles that do report subcategories, fewer than half include marginalized groups in their study. Finally, more than a third of the papers do not define the target population to whom their findings apply; of those that do define it either implicitly or explicitly, a large majority study only the U.S. Taken together, our findings suggest an inflated perception of LLM representativeness in the broader community. We recommend more precise evaluation methods and comprehensive documentation of demographic attributes to ensure the responsible use of LLMs for social applications. Our annotated list of papers and analysis code is publicly available.

Related papers