Natural Fingerprints of Large Language Models
- URL: http://arxiv.org/abs/2504.14871v2
- Date: Fri, 19 Sep 2025 03:48:32 GMT
- Title: Natural Fingerprints of Large Language Models
- Authors: Teppei Suzuki, Ryokan Ri, Sho Takase
- Abstract summary: We show that even when large language models are trained on exactly the same dataset, their outputs remain distinguishable. We refer to these unintended, distinctive characteristics as natural fingerprints. These results suggest that training dynamics can systematically shape model behavior, independent of data or architecture.
- Score: 19.87526607747389
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent studies have shown that the outputs from large language models (LLMs) can often reveal the identity of their source model. While this is a natural consequence of LLMs modeling the distribution of their training data, such identifiable traces may also reflect unintended characteristics with potential implications for fairness and misuse. In this work, we go one step further and show that even when LLMs are trained on exactly the same dataset, their outputs remain distinguishable, suggesting that training dynamics alone can leave recognizable patterns. We refer to these unintended, distinctive characteristics as natural fingerprints. By systematically controlling training conditions, we show that the natural fingerprints can emerge from subtle differences in the training process, such as parameter sizes, optimization settings, and even random seeds. These results suggest that training dynamics can systematically shape model behavior, independent of data or architecture, and should be explicitly considered in future research on transparency, reliability, and interpretability.
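The core experiment can be pictured as a source-attribution test: if a simple classifier can name which model produced a text sample, the models carry fingerprints. Below is a minimal sketch with placeholder corpora and an off-the-shelf scikit-learn pipeline; it illustrates the setup, not the authors' exact pipeline.

```python
# Minimal sketch of a source-attribution test (placeholder data, not the
# paper's pipeline): if a plain bag-of-words classifier can tell which
# model wrote a sample, the two models are distinguishable even when
# trained on identical data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Placeholder corpora: replace with real generations from two models
# trained on the same dataset.
texts_a = [f"placeholder generation {i} from model A" for i in range(100)]
texts_b = [f"placeholder generation {i} from model B" for i in range(100)]
X = texts_a + texts_b
y = [0] * len(texts_a) + [1] * len(texts_b)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(X_tr, y_tr)
# Accuracy near 0.5 means indistinguishable; well above 0.5 means the
# outputs carry a natural fingerprint.
print("source-attribution accuracy:", clf.score(X_te, y_te))
```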
Related papers
- Understanding the Implicit Biases of Design Choices for Time Series Foundation Models [90.894232610821]
Time series foundation models (TSFMs) are a class of potentially powerful, general-purpose tools for time series forecasting and related temporal tasks. Their behavior is strongly shaped by subtle inductive biases in their design. We show how these biases can be intuitive or very counterintuitive, depending on properties of the model and data.
arXiv Detail & Related papers (2025-10-22T04:42:35Z)
- Large Language Model Sourcing: A Survey [84.63438376832471]
Large language models (LLMs) have revolutionized artificial intelligence, shifting from supporting objective tasks to empowering subjective decision-making. Due to the black-box nature of LLMs and the human-like quality of their generated content, issues such as hallucinations, bias, unfairness, and copyright infringement become significant. This survey presents a systematic investigation into provenance tracking for content generated by LLMs, organized around four interrelated dimensions.
arXiv Detail & Related papers (2025-10-11T10:52:30Z)
- SeedPrints: Fingerprints Can Even Tell Which Seed Your Large Language Model Was Trained From [65.75182441010327]
We propose a stronger and more intrinsic notion of LLM fingerprinting: SeedPrints. We show that untrained models exhibit reproducible token selection biases conditioned solely on their parameters. Experiments on LLaMA-style and Qwen-style models show that SeedPrints achieves seed-level distinguishability and can provide birth-to-lifecycle identity verification akin to a biometric fingerprint.
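As a toy illustration of the underlying claim (this is not the SeedPrints procedure; the linear head and dummy input below are stand-ins), even an untrained, randomly initialized output head yields a token ranking that is reproducible given the seed:

```python
# Toy illustration (not the SeedPrints method itself): a randomly
# initialized, untrained "LM head" already induces a reproducible token
# ranking that depends only on the initialization seed.
import torch

def token_ranking(seed: int, vocab_size: int = 100, dim: int = 32) -> torch.Tensor:
    torch.manual_seed(seed)
    head = torch.nn.Linear(dim, vocab_size)  # stand-in for an untrained LM head
    x = torch.zeros(1, dim)                  # fixed dummy input
    with torch.no_grad():
        return head(x).argsort(descending=True).squeeze(0)

print(torch.equal(token_ranking(0), token_ranking(0)))  # True: same seed, same bias
print(torch.equal(token_ranking(0), token_ranking(1)))  # False: seeds differ
```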
arXiv Detail & Related papers (2025-09-30T15:34:08Z)
- Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs [51.00909549291524]
Large language models (LLMs) exhibit cognitive biases. These biases vary across models and can be amplified by instruction tuning. It remains unclear if these differences in biases stem from pretraining, finetuning, or even random noise.
arXiv Detail & Related papers (2025-07-09T18:01:14Z)
- Factual Self-Awareness in Language Models: Representation, Robustness, and Scaling [56.26834106704781]
Factual incorrectness in generated content is one of the primary concerns in the ubiquitous deployment of large language models (LLMs). We provide evidence supporting the presence of an internal compass in LLMs that dictates the correctness of factual recall at the time of generation. Scaling experiments across model sizes and training dynamics highlight that self-awareness emerges rapidly during training and peaks in intermediate layers.
arXiv Detail & Related papers (2025-05-27T16:24:02Z)
- How Post-Training Reshapes LLMs: A Mechanistic View on Knowledge, Truthfulness, Refusal, and Confidence [52.9442657690445]
Post-training is essential for the success of large language models (LLMs). We compare base and post-trained LLMs from four perspectives to better understand post-training effects.
arXiv Detail & Related papers (2025-04-03T06:30:55Z)
- I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data? [79.01538178959726]
Large language models (LLMs) have led many to conclude that they exhibit a form of intelligence.
We introduce a novel generative model that generates tokens on the basis of human-interpretable concepts represented as latent discrete variables.
arXiv Detail & Related papers (2025-03-12T01:21:17Z)
- Unnatural Languages Are Not Bugs but Features for LLMs [92.8332103170009]
Large Language Models (LLMs) have been observed to process non-human-readable text sequences, such as jailbreak prompts. We present a systematic investigation challenging this perception, demonstrating that unnatural languages contain latent features usable by models.
arXiv Detail & Related papers (2025-03-02T12:10:17Z)
- Deterministic or probabilistic? The psychology of LLMs as random number generators [0.0]
Large Language Models (LLMs) have transformed text generation through inherently probabilistic context-aware mechanisms. Our results reveal that, despite their transformer-based architecture, these models often exhibit deterministic responses when prompted for random numerical outputs.
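A quick way to probe this behavior (the `ask` wrapper below is a hypothetical stand-in for any chat-LLM client) is to sample the same "pick a random digit" prompt many times and inspect the empirical distribution:

```python
# Quick empirical check in the spirit of the paper; `ask` is an assumed
# callable that sends a prompt to a chat LLM and returns its text reply.
from collections import Counter

def digit_distribution(ask, n: int = 200) -> Counter:
    prompt = "Pick a random integer between 0 and 9. Reply with the digit only."
    return Counter(ask(prompt).strip() for _ in range(n))

# A fair RNG would put ~10% of mass on each digit; a deterministic model
# concentrates its answers on one or two favorites instead.
# print(digit_distribution(ask).most_common())
```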
arXiv Detail & Related papers (2025-02-27T10:45:27Z)
- DBR: Divergence-Based Regularization for Debiasing Natural Language Understanding Models [50.54264918467997]
Pre-trained language models (PLMs) have achieved impressive results on various natural language processing tasks. Recent research has revealed that these models often rely on superficial features and shortcuts instead of developing a genuine understanding of language. We propose Divergence-Based Regularization (DBR) to mitigate this shortcut learning behavior.
arXiv Detail & Related papers (2025-02-25T16:44:10Z)
- Predicting the Performance of Black-box LLMs through Self-Queries [60.87193950962585]
As large language models (LLMs) are increasingly relied on in AI systems, predicting when they make mistakes is crucial.
In this paper, we extract features of LLMs in a black-box manner by using follow-up prompts and taking the probabilities of different responses as representations.
We demonstrate that training a linear model on these low-dimensional representations produces reliable predictors of model performance at the instance level.
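A rough sketch of that recipe, with hypothetical follow-up prompts and a hypothetical `p_yes` probability accessor standing in for the paper's actual protocol:

```python
# Sketch of the general recipe (the follow-up prompts and `p_yes` are
# hypothetical stand-ins, not the paper's exact protocol).
import numpy as np
from sklearn.linear_model import LogisticRegression

FOLLOW_UPS = [
    "Are you confident in your answer? Answer yes or no.",
    "Could an expert disagree with your answer? Answer yes or no.",
    "Would you give the same answer if asked again? Answer yes or no.",
]

def self_query_features(llm, question: str) -> np.ndarray:
    # Probability the black-box model assigns to "yes" for each follow-up.
    return np.array([llm.p_yes(question, f) for f in FOLLOW_UPS])

# With features X (n_examples x n_follow_ups) and labels y (1 = answered correctly):
# X = np.stack([self_query_features(llm, q) for q in questions])
# probe = LogisticRegression().fit(X, y)
# probe.predict_proba(X_new)[:, 1]  # instance-level success predictions
```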
arXiv Detail & Related papers (2025-01-02T22:26:54Z)
- Lost in Inference: Rediscovering the Role of Natural Language Inference for Large Language Models [36.983534612895156]
In the recent past, a popular way of evaluating natural language understanding (NLU) was to consider a model's ability to perform natural language inference (NLI) tasks.
This paper focuses on five different NLI benchmarks across six models of different scales.
We investigate whether they can discriminate between models of different size and quality, and how their accuracies develop during training.
arXiv Detail & Related papers (2024-11-21T13:09:36Z)
- Distinguishing the Knowable from the Unknowable with Language Models [15.471748481627143]
In the absence of ground-truth probabilities, we explore a setting where, in order to disentangle a given uncertainty, a significantly larger model stands in as a proxy for the ground truth.
We show that small linear probes trained on the embeddings of frozen, pretrained models accurately predict when larger models will be more confident at the token level.
We propose a fully unsupervised method that achieves non-trivial accuracy on the same task.
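A minimal sketch of such a probe, using synthetic placeholders in place of real embeddings and large-model confidences:

```python
# Minimal sketch with synthetic placeholders: a linear probe maps a small
# frozen model's token embeddings to the entropy a much larger model
# assigns at the same positions (the "proxy for ground truth" above).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
small_emb = rng.normal(size=(5000, 768))     # placeholder small-model embeddings
large_entropy = rng.gamma(2.0, size=5000)    # placeholder large-model entropies

X_tr, X_te, y_tr, y_te = train_test_split(small_emb, large_entropy, random_state=0)
probe = Ridge(alpha=1.0).fit(X_tr, y_tr)
print("held-out R^2:", probe.score(X_te, y_te))  # ~0 here; positive on real pairs
```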
arXiv Detail & Related papers (2024-02-05T22:22:49Z)
- HuRef: HUman-REadable Fingerprint for Large Language Models [44.9820558213721]
HuRef is a human-readable fingerprint for large language models. It uniquely identifies the base model without interfering with training or exposing model parameters to the public.
arXiv Detail & Related papers (2023-12-08T05:01:47Z)
- In-Context Learning Dynamics with Random Binary Sequences [16.645695664776433]
We propose a framework that enables us to analyze in-context learning dynamics.
Inspired by the cognitive science of human perception, we use random binary sequences as context.
In the latest GPT-3.5+ models, we find emergent abilities to generate seemingly random numbers and learn basic formal languages.
arXiv Detail & Related papers (2023-10-26T17:54:52Z)
- Do pretrained Transformers Learn In-Context by Gradient Descent? [21.23795112800977]
In this paper, we investigate the emergence of In-Context Learning (ICL) in language models pre-trained on natural data (LLaMa-7B).
We find that ICL and Gradient Descent (GD) modify the output distribution of language models differently.
These results indicate that the equivalence between ICL and GD remains an open hypothesis and call for further studies.
arXiv Detail & Related papers (2023-10-12T17:32:09Z)
- Personality Traits in Large Language Models [42.31355340867784]
Personality is a key factor determining the effectiveness of communication. We present a novel, comprehensive, psychometrically valid and reliable methodology for administering and validating personality tests on widely used large language models. We discuss the application and ethical implications of the measurement and shaping method, in particular regarding responsible AI.
arXiv Detail & Related papers (2023-07-01T00:58:51Z)
- Can Offline Reinforcement Learning Help Natural Language Understanding? [31.788133426611587]
We investigate the potential connection between offline reinforcement learning (RL) and language modeling (LM).
RL and LM are similar in that both predict the next state from the current and previous states, relying on both local and long-range dependencies across states.
Experimental results show that our RL-pretrained models achieve performance close to that of models trained with the LM objective.
arXiv Detail & Related papers (2022-09-15T02:55:10Z)
- On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets [74.11825654535895]
Pre-training language models (LMs) on large-scale unlabeled text data makes it much easier for the model to achieve exceptional downstream performance.
We study which specific traits of the pre-training data, other than semantics, make a pre-trained LM superior to counterparts trained from scratch on downstream tasks.
arXiv Detail & Related papers (2021-09-08T10:39:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.