FFT: Towards Harmlessness Evaluation and Analysis for LLMs with
Factuality, Fairness, Toxicity
- URL: http://arxiv.org/abs/2311.18580v1
- Date: Thu, 30 Nov 2023 14:18:47 GMT
- Title: FFT: Towards Harmlessness Evaluation and Analysis for LLMs with
Factuality, Fairness, Toxicity
- Authors: Shiyao Cui, Zhenyu Zhang, Yilong Chen, Wenyuan Zhang, Tianyun Liu,
Siqi Wang, Tingwen Liu
- Abstract summary: The widespread use of generative artificial intelligence has heightened concerns about the potential harms posed by AI-generated texts.
Previous researchers have invested much effort in assessing the harmlessness of generative language models.
- Score: 21.539026782010573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The widespread use of generative artificial intelligence has heightened
concerns about the potential harms posed by AI-generated texts, primarily stemming
from factually incorrect, unfair, and toxic content. Previous researchers have
invested much effort in assessing the harmlessness of generative language models.
However, existing benchmarks are struggling in the era of large language models
(LLMs), due to their stronger language-generation and instruction-following
capabilities, as well as their wider range of applications. In this paper, we
propose FFT, a new benchmark with 2,116 elaborately designed instances, for LLM
harmlessness evaluation with factuality, fairness, and toxicity. To investigate
the potential harms of LLMs, we evaluate 9 representative LLMs covering various
parameter scales, training stages, and creators. Experiments show that the
harmlessness of LLMs is still unsatisfactory, and extensive analysis yields
insightful findings that could inspire future research on harmless LLMs.
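The abstract describes the evaluation setup only at a high level. As a rough illustration (not the authors' released code), the sketch below shows how a per-dimension harmlessness pass rate could be computed over benchmark instances; the `generate` and `is_harmless` callables are hypothetical placeholders for the model under test and the judging step, and the actual FFT protocol and its 2,116 instances are not reproduced here.

```python
# Minimal illustrative sketch (not the authors' code): scoring model responses
# on a harmlessness benchmark split into factuality, fairness, and toxicity.
# `generate` and `is_harmless` are hypothetical placeholders.
from collections import defaultdict
from typing import Callable, Dict, List

def evaluate_harmlessness(
    instances: List[Dict],                      # each: {"dimension": ..., "prompt": ...}
    generate: Callable[[str], str],             # wraps the LLM under evaluation
    is_harmless: Callable[[str, Dict], bool],   # per-dimension judge (human or model)
) -> Dict[str, float]:
    """Return the harmless-response rate for each evaluated dimension."""
    passed, total = defaultdict(int), defaultdict(int)
    for inst in instances:
        dim = inst["dimension"]
        response = generate(inst["prompt"])
        total[dim] += 1
        if is_harmless(response, inst):
            passed[dim] += 1
    return {dim: passed[dim] / total[dim] for dim in total}
```

Grouping scores by dimension mirrors the paper's three-way split into factuality, fairness, and toxicity.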
Related papers
- Testing and Evaluation of Large Language Models: Correctness, Non-Toxicity, and Fairness [30.632260870411177]
Large language models (LLMs) have rapidly penetrated into people's work and daily lives over the past few years.
This thesis focuses on the correctness, non-toxicity, and fairness of LLMs from both software testing and natural language processing perspectives.
arXiv Detail & Related papers (2024-08-31T22:21:04Z)
- Understanding Privacy Risks of Embeddings Induced by Large Language Models [75.96257812857554]
Large language models show early signs of artificial general intelligence but struggle with hallucinations.
One promising solution is to store external knowledge as embeddings, aiding LLMs in retrieval-augmented generation.
Recent studies experimentally showed that the original text can be partially reconstructed from text embeddings by pre-trained language models.
arXiv Detail & Related papers (2024-04-25T13:10:48Z)
- Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs' general-purpose language understanding and generation abilities are acquired by training billions of model parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z)
- Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be conveyed to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z)
- Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs [8.526956860672698]
Large Language Models (LLMs) have gained immense attention due to their notable emergent capabilities.
This study investigates the potential of LLMs as reliable assessors of factual consistency in summaries generated by text-generation models.
arXiv Detail & Related papers (2023-11-01T17:42:45Z)
- Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity [61.54815512469125]
This survey addresses the crucial issue of factuality in Large Language Models (LLMs)
As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital.
arXiv Detail & Related papers (2023-10-11T14:18:03Z)
- FELM: Benchmarking Factuality Evaluation of Large Language Models [40.78878196872095]
We introduce a benchmark for Factuality Evaluation of large Language Models, referred to as felm.
We collect responses generated from large language models and annotate factuality labels in a fine-grained manner.
Our findings reveal that while retrieval aids factuality evaluation, current LLMs fall far short of faithfully detecting factual errors.
arXiv Detail & Related papers (2023-10-01T17:37:31Z)
- Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools.
Longer conversations reveal a language model's comprehensive grasp of language, particularly its proficiency in understanding questions.
Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z)
- Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks.
We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)
- Red teaming ChatGPT via Jailbreaking: Bias, Robustness, Reliability and Toxicity [19.94836502156002]
Large language models (LLMs) may exhibit social prejudice and toxicity, posing ethical and societal risks when deployed irresponsibly.
We empirically benchmark ChatGPT on multiple sample datasets.
We find that a significant number of ethical risks cannot be addressed by existing benchmarks.
arXiv Detail & Related papers (2023-01-30T13:20:48Z)