Related papers: I'm Spartacus, No, I'm Spartacus: Measuring and Understanding LLM Identity Confusion

I'm Spartacus, No, I'm Spartacus: Measuring and Understanding LLM Identity Confusion

URL: http://arxiv.org/abs/2411.10683v1
Date: Sat, 16 Nov 2024 03:20:39 GMT
Title: I'm Spartacus, No, I'm Spartacus: Measuring and Understanding LLM Identity Confusion
Authors: Kun Li, Shichao Zhuang, Yue Zhang, Minghui Xu, Ruoxi Wang, Kaidi Xu, Xinwen Fu, Xiuzhen Cheng,
Abstract summary: Large Language Models (LLMs) excel in diverse tasks such as text generation, data analysis, and software development. However, the rapid proliferation of LLMs has raised concerns about their originality and trustworthiness. This study systematically examines identity confusion through three research questions.
Score: 33.80335805399509
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) excel in diverse tasks such as text generation, data analysis, and software development, making them indispensable across domains like education, business, and creative industries. However, the rapid proliferation of LLMs (with over 560 companies developing or deploying them as of 2024) has raised concerns about their originality and trustworthiness. A notable issue, termed identity confusion, has emerged, where LLMs misrepresent their origins or identities. This study systematically examines identity confusion through three research questions: (1) How prevalent is identity confusion among LLMs? (2) Does it arise from model reuse, plagiarism, or hallucination? (3) What are the security and trust-related impacts of identity confusion? To address these, we developed an automated tool combining documentation analysis, self-identity recognition testing, and output similarity comparisons--established methods for LLM fingerprinting--and conducted a structured survey via Credamo to assess its impact on user trust. Our analysis of 27 LLMs revealed that 25.93% exhibit identity confusion. Output similarity analysis confirmed that these issues stem from hallucinations rather than replication or reuse. Survey results further highlighted that identity confusion significantly erodes trust, particularly in critical tasks like education and professional use, with declines exceeding those caused by logical errors or inconsistencies. Users attributed these failures to design flaws, incorrect training data, and perceived plagiarism, underscoring the systemic risks posed by identity confusion to LLM reliability and trustworthiness.

Related papers

IV Co-Scientist: Multi-Agent LLM Framework for Causal Instrumental Variable Discovery [61.15184885636171]
In the presence of confounding between an endogenous variable and the outcome, instrumental variables (IVs) are used to isolate the causal effect of the endogenous variable.<n>We investigate whether large language models (LLMs) can aid in this task.<n>We introduce IV Co-Scientist, a multi-agent system that proposes, critiques, and refines IVs for a given treatment-outcome pair.
arXiv Detail & Related papers (2026-02-08T12:28:29Z)
Investigating Intersectional Bias in Large Language Models using Confidence Disparities in Coreference Resolution [5.061421107401101]
Large language models (LLMs) have achieved impressive performance, leading to their widespread adoption as decision-support tools in resource-constrained contexts like hiring and admissions.<n>There is, however, scientific consensus that AI systems can reflect and exacerbate societal biases, raising concerns about identity-based harm when used in critical social contexts.<n>In this work, we extend single-axis fairness evaluations to examine intersectional bias, recognizing that when multiple axes of discrimination intersect, they create distinct patterns of disadvantage.
arXiv Detail & Related papers (2025-08-09T22:24:40Z)
Mirage of Mastery: Memorization Tricks LLMs into Artificially Inflated Self-Knowledge [0.0]
Existing studies treat memorization and self-knowledge deficits in LLMs as separate issues.<n>We utilize a novel framework to ascertain if LLMs genuinely learn reasoning patterns from training data.<n>Our analysis shows a noteworthy problem in generalization: LLMs draw confidence from memorized solutions to infer a higher self-knowledge.
arXiv Detail & Related papers (2025-06-23T18:01:16Z)
Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers [59.168391398830515]
We evaluate 12 pre-trained LLMs and one specialized fact-verifier, using a collection of examples from 14 fact-checking benchmarks.<n>We highlight the importance of addressing annotation errors and ambiguity in datasets.<n> frontier LLMs with few-shot in-context examples, often overlooked in previous works, achieve top-tier performance.
arXiv Detail & Related papers (2025-06-16T10:32:10Z)
Line of Duty: Evaluating LLM Self-Knowledge via Consistency in Feasibility Boundaries [0.0]
This study aims to obtain intrinsic insights into different types of LLM self-knowledge with a novel methodology. We find that even frontier models like GPT-4o and Mistral Large are not sure of their own capabilities more than 80% of the time.
arXiv Detail & Related papers (2025-03-14T10:07:07Z)
Automated Consistency Analysis of LLMs [0.1747820331822631]
Generative AI with large language models (LLMs) are being widely adopted across the industry, academia and government. One of the key challenge to the trustworthiness and reliability of LLMs is: how consistent an LLM is in its responses. This paper proposes two approaches to validate consistency: self-validation, and validation across multiple LLMs.
arXiv Detail & Related papers (2025-02-10T21:03:24Z)
Understanding the Dark Side of LLMs' Intrinsic Self-Correction [55.51468462722138]
Intrinsic self-correction was proposed to improve LLMs' responses via feedback prompts solely based on their inherent capability. Recent works show that LLMs' intrinsic self-correction fails without oracle labels as feedback prompts. We identify intrinsic self-correction can cause LLMs to waver both intermedia and final answers and lead to prompt bias on simple factual questions.
arXiv Detail & Related papers (2024-12-19T15:39:31Z)
Investigating Factuality in Long-Form Text Generation: The Roles of Self-Known and Self-Unknown [55.91887554462312]
We investigate the factuality of long-form text generation across various large language models (LLMs) Our analysis reveals that factuality scores tend to decline in later sentences of the generated text, accompanied by a rise in the number of unsupported claims. We find a correlation between higher Self-Known scores and improved factuality, while higher Self-Unknown scores are associated with lower factuality.
arXiv Detail & Related papers (2024-11-24T22:06:26Z)
To Know or Not To Know? Analyzing Self-Consistency of Large Language Models under Ambiguity [27.10502683001428]
This paper focuses on entity type ambiguity, analyzing the proficiency and consistency of state-of-the-art LLMs in applying factual knowledge when prompted with ambiguous entities. Experiments reveal that LLMs struggle with choosing the correct entity reading, achieving an average accuracy of only 85%, and as low as 75% with underspecified prompts.
arXiv Detail & Related papers (2024-07-24T09:48:48Z)
Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses. Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives. The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z)
Direct-Inverse Prompting: Analyzing LLMs' Discriminative Capacity in Self-Improving Generation [15.184067502284007]
Even the most advanced LLMs experience uncertainty in their outputs, often producing varied results on different runs or when faced with minor changes in input. We propose and analyze three discriminative prompts: direct, inverse, and hybrid. Our insights reveal which discriminative prompt is most promising and when to use it.
arXiv Detail & Related papers (2024-06-27T02:26:47Z)
CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models [60.59638232596912]
We introduce CLAMBER, a benchmark for evaluating large language models (LLMs) Building upon the taxonomy, we construct 12K high-quality data to assess the strengths, weaknesses, and potential risks of various off-the-shelf LLMs. Our findings indicate the limited practical utility of current LLMs in identifying and clarifying ambiguous user queries.
arXiv Detail & Related papers (2024-05-20T14:34:01Z)
LLMs' Reading Comprehension Is Affected by Parametric Knowledge and Struggles with Hypothetical Statements [59.71218039095155]
Task of reading comprehension (RC) provides a primary means to assess language models' natural language understanding (NLU) capabilities. If the context aligns with the models' internal knowledge, it is hard to discern whether the models' answers stem from context comprehension or from internal information. To address this issue, we suggest to use RC on imaginary data, based on fictitious facts and entities.
arXiv Detail & Related papers (2024-04-09T13:08:56Z)
Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration [39.603649838876294]
We study approaches to identify LLM knowledge gaps and abstain from answering questions when knowledge gaps are present. Motivated by their failures in self-reflection and over-reliance on held-out sets, we propose two novel approaches.
arXiv Detail & Related papers (2024-02-01T06:11:49Z)
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity [61.54815512469125]
This survey addresses the crucial issue of factuality in Large Language Models (LLMs) As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital.
arXiv Detail & Related papers (2023-10-11T14:18:03Z)
FELM: Benchmarking Factuality Evaluation of Large Language Models [40.78878196872095]
We introduce a benchmark for Factuality Evaluation of large Language Models, referred to as felm. We collect responses generated from large language models and annotate factuality labels in a fine-grained manner. Our findings reveal that while retrieval aids factuality evaluation, current LLMs are far from satisfactory to faithfully detect factual errors.
arXiv Detail & Related papers (2023-10-01T17:37:31Z)
Assessing Hidden Risks of LLMs: An Empirical Study on Robustness, Consistency, and Credibility [37.682136465784254]
We conduct over a million queries to the mainstream large language models (LLMs) including ChatGPT, LLaMA, and OPT. We find that ChatGPT is still capable to yield the correct answer even when the input is polluted at an extreme level. We propose a novel index associated with a dataset that roughly decides the feasibility of using such data for LLM-involved evaluation.
arXiv Detail & Related papers (2023-05-15T15:44:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.