Benchmarking Multi-National Value Alignment for Large Language Models
- URL: http://arxiv.org/abs/2504.12911v2
- Date: Sat, 19 Apr 2025 04:07:54 GMT
- Title: Benchmarking Multi-National Value Alignment for Large Language Models
- Authors: Weijie Shi, Chengyi Ju, Chengzhong Liu, Jiaming Ji, Jipeng Zhang, Ruiyuan Zhang, Jia Zhu, Jiajie Xu, Yaodong Yang, Sirui Han, Yike Guo
- Abstract summary: We introduce NaVAB, a benchmark to evaluate the alignment of Large Language Models with the values of five major nations. NaVAB implements a national value extraction pipeline to efficiently construct value assessment datasets. We conduct extensive experiments on various LLMs across countries, and the results help identify misaligned scenarios.
- Score: 23.378701093426546
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Do Large Language Models (LLMs) hold positions that conflict with your country's values? Occasionally they do! However, existing works primarily focus on ethical reviews, failing to capture the diversity of national values, which encompass broader policy, legal, and moral considerations. Furthermore, current benchmarks that rely on spectrum tests using manually designed questionnaires are not easily scalable. To address these limitations, we introduce NaVAB, a comprehensive benchmark to evaluate the alignment of LLMs with the values of five major nations: China, the United States, the United Kingdom, France, and Germany. NaVAB implements a national value extraction pipeline to efficiently construct value assessment datasets. Specifically, we propose a modeling procedure with instruction tagging to process raw data sources, a screening process to filter value-related topics, and a generation process with a Conflict Reduction mechanism to filter non-conflicting values. We conduct extensive experiments on various LLMs across countries, and the results help identify misaligned scenarios. Moreover, we demonstrate that NaVAB can be combined with alignment techniques to effectively reduce value concerns by aligning LLMs' values with those of the target country.
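The abstract only names the pipeline stages, so the following is a minimal, hypothetical sketch of how a tagging, screening, and conflict-reduction pipeline could look. All names (Sample, tag_instruction, screen_topic, reduce_conflicts) and the keyword heuristics are illustrative assumptions, not code from the NaVAB release.

```python
# Hypothetical sketch of a NaVAB-style value extraction pipeline.
# Stage names follow the abstract; every function body is a placeholder.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Sample:
    text: str
    country: str
    topic: Optional[str] = None

VALUE_TOPICS = ("policy", "law", "moral")  # assumed topic keywords

def tag_instruction(raw: str, country: str) -> Sample:
    """Modeling stage: wrap a raw source document as a tagged sample.
    The paper tags instructions with an LLM; here we only record metadata."""
    return Sample(text=raw.strip(), country=country)

def screen_topic(sample: Sample) -> bool:
    """Screening stage: keep only value-related topics.
    A keyword match stands in for the paper's topic filter."""
    lowered = sample.text.lower()
    sample.topic = next((t for t in VALUE_TOPICS if t in lowered), None)
    return sample.topic is not None

def reduce_conflicts(samples: List[Sample]) -> List[Sample]:
    """Generation stage with 'Conflict Reduction': keep at most one statement
    per (country, topic) pair, a crude stand-in for removing mutually
    conflicting statements within a nation's value set."""
    seen, kept = set(), []
    for s in samples:
        key = (s.country, s.topic)
        if key not in seen:
            seen.add(key)
            kept.append(s)
    return kept

if __name__ == "__main__":
    raw = [("National policy favours renewable energy.", "DE"),
           ("National policy favours coal for energy security.", "DE"),
           ("A recipe for apple pie.", "UK")]
    tagged = [tag_instruction(text, country) for text, country in raw]
    screened = [s for s in tagged if screen_topic(s)]
    print(reduce_conflicts(screened))  # one DE policy statement survives
```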
Related papers
- IberBench: LLM Evaluation on Iberian Languages [2.3034630097498883]
Large Language Models (LLMs) are difficult to evaluate comprehensively, particularly for languages other than English.
We present IberBench, a benchmark designed to assess LLM performance on both fundamental and industry-relevant NLP tasks.
We evaluate 23 LLMs ranging from 100 million to 14 billion parameters and provide empirical insights into their strengths and limitations.
arXiv Detail & Related papers (2025-04-23T17:48:25Z) - DICE: A Framework for Dimensional and Contextual Evaluation of Language Models [1.534667887016089]
Language models (LMs) are increasingly being integrated into a wide range of applications. Current evaluations rely on benchmarks that often lack direct applicability to the real-world contexts in which LMs are deployed. We propose Dimensional and Contextual Evaluation (DICE), an approach that evaluates LMs on granular, context-dependent dimensions.
arXiv Detail & Related papers (2025-04-14T16:08:13Z) - Value Compass Leaderboard: A Platform for Fundamental and Validated Evaluation of LLMs Values [76.70893269183684]
As Large Language Models (LLMs) achieve remarkable breakthroughs, aligning their values with humans has become imperative. Existing evaluations focus narrowly on safety risks such as bias and toxicity, existing benchmarks are prone to data contamination, and the pluralistic nature of human values across individuals and cultures is largely ignored when measuring LLMs' value alignment.
arXiv Detail & Related papers (2025-01-13T05:53:56Z) - MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models [71.36392373876505]
We introduce MMIE, a large-scale benchmark for evaluating interleaved multimodal comprehension and generation in Large Vision-Language Models (LVLMs). MMIE comprises 20K meticulously curated multimodal queries, spanning 3 categories, 12 fields, and 102 subfields, including mathematics, coding, physics, literature, health, and arts. It supports both interleaved inputs and outputs, offering a mix of multiple-choice and open-ended question formats to evaluate diverse competencies.
arXiv Detail & Related papers (2024-10-14T04:15:00Z) - LocalValueBench: A Collaboratively Built and Extensible Benchmark for Evaluating Localized Value Alignment and Ethical Safety in Large Language Models [0.0]
The proliferation of large language models (LLMs) requires robust evaluation of their alignment with local values and ethical standards.
LocalValueBench is a benchmark designed to assess LLMs' adherence to Australian values.
arXiv Detail & Related papers (2024-07-27T05:55:42Z) - LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models [71.8065384742686]
LMMS-EVAL is a unified and standardized multimodal benchmark framework with over 50 tasks and more than 10 models.
LMMS-EVAL LITE is a pruned evaluation toolkit that emphasizes both coverage and efficiency.
Multimodal LIVEBENCH utilizes continuously updating news and online forums to assess models' generalization abilities in the wild.
arXiv Detail & Related papers (2024-07-17T17:51:53Z) - ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models [14.268555410234804]
Large Language Models (LLMs) are transforming diverse fields and gaining increasing influence as human proxies.
This work introduces ValueBench, the first comprehensive psychometric benchmark for evaluating value orientations and value understanding in LLMs.
arXiv Detail & Related papers (2024-06-06T16:14:16Z) - Self-Evaluation Improves Selective Generation in Large Language Models [54.003992911447696]
We reformulate open-ended generation tasks into token-level prediction tasks.
We instruct an LLM to self-evaluate its answers.
We benchmark a range of scoring methods based on self-evaluation.
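As a rough illustration of the self-evaluation idea summarized above, the sketch below reformulates answer checking as a single-token multiple-choice prediction and abstains when the model's own confidence is low. The prompt wording, the `token_probability` stub, and the threshold are assumptions, not the paper's implementation.

```python
# Hedged sketch: ask the model a follow-up multiple-choice question about its
# own answer and keep the answer only if the "Yes" option is probable enough.
from typing import Callable, Optional

SELF_EVAL_TEMPLATE = (
    "Question: {question}\n"
    "Proposed answer: {answer}\n"
    "Is the proposed answer correct?\n(A) Yes\n(B) No\nAnswer:"
)

def self_eval_score(question: str, answer: str,
                    token_probability: Callable[[str, str], float]) -> float:
    """Reformulate judging as token-level prediction: return the probability
    the model assigns to the 'A' (Yes) option after the self-eval prompt."""
    prompt = SELF_EVAL_TEMPLATE.format(question=question, answer=answer)
    return token_probability(prompt, " A")

def selective_answer(question: str, answer: str,
                     token_probability: Callable[[str, str], float],
                     threshold: float = 0.8) -> Optional[str]:
    """Return the answer only if the self-evaluation score clears the
    threshold; otherwise abstain (selective generation)."""
    score = self_eval_score(question, answer, token_probability)
    return answer if score >= threshold else None

if __name__ == "__main__":
    # Dummy scorer standing in for a real LLM API's next-token probability.
    fake_prob = lambda prompt, token: 0.9
    print(selective_answer("What is 2+2?", "4", fake_prob))  # -> "4"
```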
arXiv Detail & Related papers (2023-12-14T19:09:22Z) - AlignBench: Benchmarking Chinese Alignment of Large Language Models [99.24597941555277]
We introduce AlignBench, a comprehensive benchmark for evaluating Chinese Large Language Models' alignment.
We design a human-in-the-loop data curation pipeline containing eight main categories and 683 real-scenario-rooted queries with corresponding human-verified references.
For automatic evaluation, our benchmark employs a rule-calibrated multi-dimensional LLM-as-Judge approach (Zheng et al., 2023) with Chain-of-Thought to generate explanations and final ratings.
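A minimal sketch of the LLM-as-Judge pattern mentioned above, assuming a judge model is prompted to reason first and then emit per-dimension scores. The dimensions, prompt text, parsing regex, and `call_judge_model` placeholder are assumptions rather than AlignBench's released rubric.

```python
# Minimal illustration of a multi-dimensional LLM-as-Judge call with
# chain-of-thought: explain first, then emit one rating per dimension.
import re
from typing import Callable, Dict

DIMENSIONS = ["correctness", "helpfulness", "clarity"]  # assumed dimensions

JUDGE_TEMPLATE = (
    "You are grading a model response against a human-verified reference.\n"
    "Question: {question}\nReference: {reference}\nResponse: {response}\n"
    "First explain your reasoning step by step, then rate each dimension "
    "from 1 to 10 on its own line, e.g. 'correctness: 7'.\n"
    "Dimensions: {dims}\n"
)

def judge(question: str, reference: str, response: str,
          call_judge_model: Callable[[str], str]) -> Dict[str, int]:
    """Run the judge once and parse 'dimension: score' lines from its output."""
    prompt = JUDGE_TEMPLATE.format(question=question, reference=reference,
                                   response=response, dims=", ".join(DIMENSIONS))
    raw = call_judge_model(prompt)
    scores = {}
    for dim in DIMENSIONS:
        m = re.search(rf"{dim}\s*:\s*(\d+)", raw, flags=re.IGNORECASE)
        if m:
            scores[dim] = int(m.group(1))
    return scores

if __name__ == "__main__":
    canned = "The answer matches the reference.\ncorrectness: 9\nhelpfulness: 8\nclarity: 7"
    print(judge("What is 2+2?", "4", "4", lambda _prompt: canned))
```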
arXiv Detail & Related papers (2023-11-30T17:41:30Z) - CLEVA: Chinese Language Models EVAluation Platform [92.42981537317817]
We present CLEVA, a user-friendly platform crafted to holistically evaluate Chinese LLMs.
Our platform employs a standardized workflow to assess LLMs' performance across various dimensions, regularly updating a competitive leaderboard.
To alleviate contamination, CLEVA curates a significant proportion of new data and develops a sampling strategy that guarantees a unique subset for each leaderboard round.
Empowered by an easy-to-use interface that requires just a few mouse clicks and a model API, users can conduct a thorough evaluation with minimal coding.
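One way to guarantee a unique subset per leaderboard round, as described above, is to fix a global shuffle of the item pool and hand out disjoint slices round by round. The sketch below illustrates that idea under an assumed name (`round_subset`); it is not CLEVA's actual sampling code.

```python
# Hedged sketch of "unique subset per round" sampling: a fixed-seed shuffle of
# the pool is sliced into disjoint chunks, so no item repeats across rounds.
import random
from typing import List, Sequence

def round_subset(pool: Sequence[str], round_id: int, per_round: int,
                 seed: int = 2023) -> List[str]:
    """Return the disjoint slice of the shuffled pool assigned to round_id."""
    order = list(pool)
    random.Random(seed).shuffle(order)  # same global order on every call
    start = round_id * per_round
    if start + per_round > len(order):
        raise ValueError("item pool exhausted; curate new data before this round")
    return order[start:start + per_round]

if __name__ == "__main__":
    pool = [f"item-{i}" for i in range(100)]
    assert set(round_subset(pool, 0, 20)).isdisjoint(round_subset(pool, 1, 20))
    print(round_subset(pool, 0, 5))
```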
arXiv Detail & Related papers (2023-08-09T09:11:31Z) - CARE-MI: Chinese Benchmark for Misinformation Evaluation in Maternity and Infant Care [14.326936563564171]
We present a benchmark, CARE-MI, for evaluating misinformation in large language models (LLMs).
Our proposed benchmark fills the gap between the extensive usage of LLMs and the lack of datasets for assessing the misinformation generated by these models.
Using our benchmark, we conduct extensive experiments and find that current Chinese LLMs are far from perfect on the topic of maternity and infant care.
arXiv Detail & Related papers (2023-07-04T03:34:19Z) - Heterogeneous Value Alignment Evaluation for Large Language Models [91.96728871418]
The rapid progress of Large Language Models (LLMs) has made it crucial to align their values with those of humans.
We propose a Heterogeneous Value Alignment Evaluation (HVAE) system to assess the success of aligning LLMs with heterogeneous values.
arXiv Detail & Related papers (2023-05-26T02:34:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.