AnomaLLMy -- Detecting anomalous tokens in black-box LLMs through low-confidence single-token predictions
- URL: http://arxiv.org/abs/2406.19840v1
- Date: Fri, 28 Jun 2024 11:28:44 GMT
- Title: AnomaLLMy -- Detecting anomalous tokens in black-box LLMs through low-confidence single-token predictions
- Authors: Witold Waligóra
- Abstract summary: AnomaLLMy is a novel technique for the automatic detection of anomalous tokens in black-box Large Language Models.
AnomaLLMy identifies irregularities in model behavior, addressing the issue of anomalous tokens degrading the quality and reliability of models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces AnomaLLMy, a novel technique for the automatic detection of anomalous tokens in black-box Large Language Models (LLMs) with API-only access. Utilizing low-confidence single-token predictions as a cost-effective indicator, AnomaLLMy identifies irregularities in model behavior, addressing the issue of anomalous tokens degrading the quality and reliability of models. Validated on the cl100k_base dataset, the token set of GPT-4, AnomaLLMy detected 413 major and 65 minor anomalies, demonstrating the method's efficiency with just $24.39 spent in API credits. The insights from this research are expected to be beneficial for enhancing the robustness and accuracy of LLMs, particularly in the development and assessment of tokenizers.
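The core idea is simple enough to sketch: prompt the model to echo each candidate token with a single-token completion and flag tokens for which the model's top prediction is assigned low probability. The sketch below assumes an OpenAI-compatible chat completions endpoint with logprobs support and the tiktoken cl100k_base encoding; the echo prompt template, the 0.9 confidence threshold, and the helper names are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch: scan vocabulary tokens for low-confidence echo predictions.
# The prompt template and threshold below are assumptions for illustration.
import math
import tiktoken
from openai import OpenAI

client = OpenAI()
enc = tiktoken.get_encoding("cl100k_base")  # GPT-4's token set

def top_confidence(token_text: str, model: str = "gpt-4") -> float:
    """Ask the model to echo a string with a single-token completion and
    return the probability it assigns to its most likely next token."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": f'Repeat this string exactly: "{token_text}"'}],
        max_tokens=1,
        logprobs=True,
        top_logprobs=1,
    )
    top = resp.choices[0].logprobs.content[0].top_logprobs[0]
    return math.exp(top.logprob)

def scan(token_ids, threshold: float = 0.9):
    """Return (token_id, token_text) pairs whose echo prediction confidence
    falls below the threshold, i.e. candidate anomalous tokens."""
    anomalies = []
    for tid in token_ids:
        text = enc.decode([tid])
        if top_confidence(text) < threshold:
            anomalies.append((tid, text))
    return anomalies

# Example: scan a small slice of the cl100k_base vocabulary.
# print(scan(range(30000, 30010)))
```

Because each probe costs only a short prompt plus one generated token, scanning the full vocabulary stays inexpensive, consistent with the roughly $24 API cost reported above.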
Related papers
- Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs [60.881609323604685]
Large Language Models (LLMs) accessed via black-box APIs introduce a trust challenge.
Users pay for services based on advertised model capabilities, but providers may covertly substitute the specified model with a cheaper, lower-quality alternative to reduce operational costs.
This lack of transparency undermines fairness, erodes trust, and complicates reliable benchmarking.
arXiv Detail & Related papers (2025-04-07T03:57:41Z) - Demystifying Singular Defects in Large Language Models [61.98878352956125]
In large language models (LLMs), the underlying causes of high-norm tokens remain largely unexplored.
We provide both theoretical insights and empirical validation across a range of recent models.
We showcase two practical applications of these findings: the improvement of quantization schemes and the design of LLM signatures.
arXiv Detail & Related papers (2025-02-10T20:09:16Z) - Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability [53.51560766150442]
Critical tokens are elements within reasoning trajectories that significantly influence incorrect outcomes.
We present a novel framework for identifying these tokens through rollout sampling.
We show that identifying and replacing critical tokens significantly improves model accuracy.
arXiv Detail & Related papers (2024-11-29T18:58:22Z) - Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness [6.229124658686219]
We develop a generic dual-channel detection paradigm that uses token cohesiveness as a plug-and-play module to improve existing zero-shot detectors.
To calculate token cohesiveness, we use a few rounds of random token deletion and semantic difference measurement.
Experiments with four state-of-the-art base detectors on various datasets, source models, and evaluation settings demonstrate the effectiveness and generality of the proposed approach.
arXiv Detail & Related papers (2024-09-25T13:18:57Z) - GlitchProber: Advancing Effective Detection and Mitigation of Glitch Tokens in Large Language Models [17.633722815221983]
Large language models (LLMs) have achieved unprecedented success in the field of natural language processing.
Recent research has discovered a class of abnormal tokens in the model's vocabulary space and named them "glitch tokens".
In this work, we aim to enhance the understanding of glitch tokens and propose techniques for their detection and mitigation.
arXiv Detail & Related papers (2024-08-09T07:19:53Z) - DECIDER: Leveraging Foundation Model Priors for Improved Model Failure Detection and Explanation [18.77296551727931]
We propose DECIDER, a novel approach that leverages priors from large language models (LLMs) and vision-language models (VLMs) to detect failures in image models.
DECIDER consistently achieves state-of-the-art failure detection performance, significantly outperforming baselines in terms of the overall Matthews correlation coefficient.
arXiv Detail & Related papers (2024-08-01T07:08:11Z) - Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses.
Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives.
The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z) - Anomaly Detection of Tabular Data Using LLMs [54.470648484612866]
We show that pre-trained large language models (LLMs) are zero-shot batch-level anomaly detectors.
We propose an end-to-end fine-tuning strategy to bring out the potential of LLMs in detecting real anomalies.
arXiv Detail & Related papers (2024-06-24T04:17:03Z) - Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Models (LLMs) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z) - TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token Prediction [61.295716741720284]
TokenUnify is a novel pretraining method that integrates random token prediction, next-token prediction, and next-all token prediction.
In conjunction with TokenUnify, we have assembled a large-scale electron microscopy (EM) image dataset with ultra-high resolution.
This dataset includes over 120 million annotated voxels, making it the largest neuron segmentation dataset to date.
arXiv Detail & Related papers (2024-05-27T05:45:51Z) - Feature Selection with Annealing for Forecasting Financial Time Series [2.44755919161855]
This study provides a comprehensive method for forecasting financial time series based on tactical input-output feature mapping techniques using machine learning (ML) models.
Experiments indicate that the feature selection with annealing (FSA) algorithm increased the performance of ML models, regardless of problem type.
arXiv Detail & Related papers (2023-03-03T21:33:38Z) - Sparse Conditional Hidden Markov Model for Weakly Supervised Named Entity Recognition [68.68300358332156]
We propose the sparse conditional hidden Markov model (Sparse-CHMM) to evaluate noisy labeling functions.
Sparse-CHMM is optimized through unsupervised learning with a three-stage training pipeline.
It achieves a 3.01 average F1 score improvement on five comprehensive datasets.
arXiv Detail & Related papers (2022-05-27T20:47:30Z) - BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition [57.2201011783393]
The paper proposes a conditional hidden Markov model (CHMM).
CHMM predicts token-wise transition and emission probabilities from the BERT embeddings of the input tokens.
It fine-tunes a BERT-based NER model with the labels inferred by CHMM.
arXiv Detail & Related papers (2021-05-26T21:18:48Z)