Decoding the Mind of Large Language Models: A Quantitative Evaluation of Ideology and Biases
- URL: http://arxiv.org/abs/2505.12183v1
- Date: Sun, 18 May 2025 00:52:06 GMT
- Title: Decoding the Mind of Large Language Models: A Quantitative Evaluation of Ideology and Biases
- Authors: Manari Hirose, Masato Uchida
- Abstract summary: We propose a novel framework for evaluating Large Language Models (LLMs). Applying the framework to ChatGPT and Gemini revealed that while LLMs generally maintain consistent opinions on many topics, their ideologies differ across models and languages. Both models also exhibited problematic biases and unethical or unfair claims, which might have negative societal impacts.
- Score: 0.276240219662896
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The widespread integration of Large Language Models (LLMs) across various sectors has highlighted the need for empirical research to understand their biases, thought patterns, and societal implications to ensure ethical and effective use. In this study, we propose a novel framework for evaluating LLMs, focusing on uncovering their ideological biases through a quantitative analysis of 436 binary-choice questions, many of which have no definitive answer. Applying the framework to ChatGPT and Gemini revealed that while LLMs generally maintain consistent opinions on many topics, their ideologies differ across models and languages. Notably, ChatGPT exhibits a tendency to change its opinion to match the questioner's opinion. Both models also exhibited problematic biases and unethical or unfair claims, which might have negative societal impacts. These results underscore the importance of addressing both ideological and ethical considerations when evaluating LLMs. The proposed framework offers a flexible, quantitative method for assessing LLM behavior, providing valuable insights for the development of more socially aligned AI systems.
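The abstract describes probing LLMs with binary-choice questions and checking how stable their answers remain across repeated queries. As a rough illustration only (the paper's exact protocol, question set, and scoring are not given here), the Python sketch below shows one way such a consistency check could be wired up; `ask_model`, the example question, and the consistency metric are hypothetical placeholders, with `ask_model` standing in for a real ChatGPT or Gemini API call.

```python
import random
from collections import Counter

# Hypothetical stand-in for an LLM API call; replace with a real client
# (e.g. an OpenAI or Gemini SDK call) to run an actual evaluation.
def ask_model(question: str, options: tuple[str, str], language: str = "en") -> str:
    prompt = f"[{language}] {question} Answer with exactly one option: {options[0]} or {options[1]}."
    # Random choice here only keeps the sketch runnable without API keys.
    return random.choice(options)

def consistency(question: str, options: tuple[str, str], trials: int = 10) -> float:
    """Fraction of trials agreeing with the modal answer (1.0 = fully consistent)."""
    answers = [ask_model(question, options) for _ in range(trials)]
    modal_count = Counter(answers).most_common(1)[0][1]
    return modal_count / trials

if __name__ == "__main__":
    # Illustrative question with no definitive answer, in the spirit of the paper's items.
    question = "Should governments prioritize economic growth over environmental protection?"
    print(f"Consistency over 10 trials: {consistency(question, ('Yes', 'No')):.2f}")
```

Repeating the same loop per model and per language, and comparing the modal answers, is one simple way to surface the cross-model and cross-language ideological differences the abstract reports.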
Related papers
- Are LLMs (Really) Ideological? An IRT-based Analysis and Alignment Tool for Perceived Socio-Economic Bias in LLMs [0.0]
We introduce an Item Response Theory (IRT)-based framework to detect and quantify socioeconomic bias in large language models (LLMs). IRT accounts for item difficulty, improving ideological bias estimation. This empirically validated framework enhances AI alignment research and promotes fairer AI governance.
arXiv Detail & Related papers (2025-03-17T13:20:09Z)
- Normative Evaluation of Large Language Models with Everyday Moral Dilemmas [0.0]
We evaluate large language models (LLMs) on complex, everyday moral dilemmas sourced from the "Am I the Asshole" (AITA) community on Reddit. Our results demonstrate that large language models exhibit distinct patterns of moral judgment, varying substantially from human evaluations on the AITA subreddit.
arXiv Detail & Related papers (2025-01-30T01:29:46Z)
- Bias in Large Language Models: Origin, Evaluation, and Mitigation [4.606140332500086]
Large Language Models (LLMs) have revolutionized natural language processing, but their susceptibility to biases poses significant challenges.
This comprehensive review examines the landscape of bias in LLMs, from its origins to current mitigation strategies.
Ethical and legal implications of biased LLMs are discussed, emphasizing potential harms in real-world applications such as healthcare and criminal justice.
arXiv Detail & Related papers (2024-11-16T23:54:53Z)
- Large Language Models Reflect the Ideology of their Creators [71.65505524599888]
Large language models (LLMs) are trained on vast amounts of data to generate natural language. This paper shows that the ideological stance of an LLM appears to reflect the worldview of its creators.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
- Diverging Preferences: When do Annotators Disagree and do Models Know? [92.24651142187989]
We develop a taxonomy of disagreement sources spanning 10 categories across four high-level classes.
We find that the majority of disagreements are in opposition with standard reward modeling approaches.
We develop methods for identifying diverging preferences to mitigate their influence on evaluation and training.
arXiv Detail & Related papers (2024-10-18T17:32:22Z)
- Language Model Alignment in Multilingual Trolley Problems [138.5684081822807]
Building on the Moral Machine experiment, we develop a cross-lingual corpus of moral dilemma vignettes in over 100 languages called MultiTP. Our analysis explores the alignment of 19 different LLMs with human judgments, capturing preferences across six moral dimensions. We discover significant variance in alignment across languages, challenging the assumption of uniform moral reasoning in AI systems.
arXiv Detail & Related papers (2024-07-02T14:02:53Z)
- Categorical Syllogisms Revisited: A Review of the Logical Reasoning Abilities of LLMs for Analyzing Categorical Syllogism [62.571419297164645]
This paper provides a systematic overview of prior works on the logical reasoning ability of large language models for analyzing categorical syllogisms. We first investigate all the possible variations of categorical syllogisms from a purely logical perspective. We then examine the underlying configurations (i.e., mood and figure) tested by the existing datasets.
arXiv Detail & Related papers (2024-06-26T21:17:20Z)
- Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective [66.34066553400108]
We conduct a rigorous evaluation of large language models' implicit bias towards certain demographics. Inspired by psychometric principles, we propose three attack approaches, i.e., Disguise, Deception, and Teaching. Our methods can elicit LLMs' inner bias more effectively than competitive baselines.
arXiv Detail & Related papers (2024-06-20T06:42:08Z)
- MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making. We present a process-based benchmark, MR-Ben, that demands meta-reasoning skill. Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z)
- MoralBench: Moral Evaluation of LLMs [34.43699121838648]
This paper introduces a novel benchmark designed to measure and compare the moral reasoning capabilities of large language models (LLMs).
We present the first comprehensive dataset specifically curated to probe the moral dimensions of LLM outputs.
Our methodology involves a multi-faceted approach, combining quantitative analysis with qualitative insights from ethics scholars to ensure a thorough evaluation of model performance.
arXiv Detail & Related papers (2024-06-06T18:15:01Z)
- The Impossibility of Fair LLMs [17.812295963158714]
We analyze a variety of technical fairness frameworks and find inherent challenges in each that make the development of a fair language model intractable. We show that each framework either does not extend to the general-purpose AI context or is infeasible in practice. These inherent challenges would persist for general-purpose AI, including LLMs, even if empirical challenges, such as limited participatory input and limited measurement methods, were overcome.
arXiv Detail & Related papers (2024-05-28T04:36:15Z)
- Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models [61.45529177682614]
We challenge the prevailing constrained evaluation paradigm for values and opinions in large language models.
We show that models give substantively different answers when not forced.
We distill these findings into recommendations and open challenges in evaluating values and opinions in LLMs.
arXiv Detail & Related papers (2024-02-26T18:00:49Z)
- MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation [60.65820977963331]
We introduce a novel evaluation paradigm for Large Language Models (LLMs).
This paradigm shifts the emphasis from result-oriented assessments, which often neglect the reasoning process, to a more comprehensive evaluation.
By applying this paradigm in the GSM8K dataset, we have developed the MR-GSM8K benchmark.
arXiv Detail & Related papers (2023-12-28T15:49:43Z)
- Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mitigating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z)
- Post Turing: Mapping the landscape of LLM Evaluation [22.517544562890663]
This paper traces the historical trajectory of Large Language Models (LLMs) evaluations, from the foundational questions posed by Alan Turing to the modern era of AI research.
We emphasize the pressing need for a unified evaluation system, given the broader societal implications of these models.
This work serves as a call for the AI community to collaboratively address the challenges of LLM evaluation, ensuring their reliability, fairness, and societal benefit.
arXiv Detail & Related papers (2023-11-03T17:24:50Z)