Fairness Certification for Natural Language Processing and Large
Language Models
- URL: http://arxiv.org/abs/2401.01262v2
- Date: Wed, 3 Jan 2024 08:17:53 GMT
- Title: Fairness Certification for Natural Language Processing and Large
Language Models
- Authors: Vincent Freiberger, Erik Buchmann
- Abstract summary: We follow a qualitative research approach towards a fairness certification for NLP approaches.
We have systematically devised six fairness criteria for NLP, which can be further refined into 18 sub-categories.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural Language Processing (NLP) plays an important role in our daily lives,
particularly due to the enormous progress of Large Language Models (LLM).
However, NLP has many fairness-critical use cases, e.g., as an expert system in
recruitment or as an LLM-based tutor in education. Since NLP is based on human
language, potentially harmful biases can diffuse into NLP systems and produce
unfair results, discriminate against minorities, or raise legal issues.
Hence, it is important to develop a fairness certification for NLP approaches.
We follow a qualitative research approach towards a fairness certification for
NLP. In particular, we have reviewed a large body of literature on algorithmic
fairness, and we have conducted semi-structured expert interviews with a wide
range of experts from that area. We have systematically devised six fairness
criteria for NLP, which can be further refined into 18 sub-categories. Our
criteria offer a foundation for operationalizing and testing processes to
certify fairness, both from the perspective of the auditor and the audited
organization.
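
The abstract does not name the six criteria or their 18 sub-categories, but the idea of operationalizing a criteria hierarchy for certification can be illustrated with a small sketch. The following Python snippet shows one way an auditor might represent criteria and sub-categories as a machine-checkable checklist; every criterion name in it is a hypothetical placeholder, not the paper's actual taxonomy:

```python
# A minimal sketch of operationalizing fairness criteria with sub-categories
# as an audit checklist. The abstract does not name the six criteria or
# 18 sub-categories, so every name below is a hypothetical placeholder.
from dataclasses import dataclass, field

@dataclass
class SubCriterion:
    name: str
    passed: bool | None = None   # None = not yet assessed by the auditor

@dataclass
class FairnessCriterion:
    name: str
    sub_criteria: list[SubCriterion] = field(default_factory=list)

    def status(self) -> str:
        # A criterion passes only when every sub-category has been
        # assessed and passed.
        if any(s.passed is None for s in self.sub_criteria):
            return "pending"
        return "pass" if all(s.passed for s in self.sub_criteria) else "fail"

# Hypothetical example criterion; the real taxonomy comes from the paper.
representation = FairnessCriterion(
    "representation",
    [SubCriterion("training-data coverage"), SubCriterion("output stereotyping")],
)
representation.sub_criteria[0].passed = True
representation.sub_criteria[1].passed = True
print(representation.name, representation.status())  # representation pass
```

A structure like this lets both the auditor and the audited organization track which sub-categories remain open during a certification process.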
Related papers
- Explainability in Practice: A Survey of Explainable NLP Across Various Domains [2.494550479408289]
This review explores explainable NLP (XNLP) with a focus on its practical deployment and real-world applications.
The paper concludes by suggesting future research directions that could enhance the understanding and broader application of XNLP.
arXiv Detail & Related papers (2025-02-02T16:18:44Z)
- A Novel Psychometrics-Based Approach to Developing Professional Competency Benchmark for Large Language Models [0.0]
We propose a comprehensive approach to benchmark development based on rigorous psychometric principles.
We make the first attempt to illustrate this approach by creating a new benchmark in the field of pedagogy and education.
We construct a novel benchmark guided by Bloom's taxonomy and rigorously designed by a consortium of education experts trained in test development.
arXiv Detail & Related papers (2024-10-29T19:32:43Z)
- Towards Systematic Monolingual NLP Surveys: GenA of Greek NLP [2.3499129784547663]
This study introduces a generalizable methodology for creating systematic and comprehensive monolingual NLP surveys.
We apply this methodology to Greek NLP (2012-2023), providing a comprehensive overview of its current state and challenges.
arXiv Detail & Related papers (2024-07-13T12:01:52Z)
- Analyzing and Adapting Large Language Models for Few-Shot Multilingual NLU: Are We There Yet? [82.02076369811402]
Supervised fine-tuning (SFT), supervised instruction tuning (SIT), and in-context learning (ICL) are three alternative, de facto standard approaches to few-shot learning.
We present an extensive and systematic comparison of the three approaches, testing them on six high- and low-resource languages, three different NLU tasks, and a myriad of language and domain setups.
Our observations show that supervised instruction tuning has the best trade-off between performance and resource requirements.
arXiv Detail & Related papers (2024-03-04T10:48:13Z)
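
For readers unfamiliar with the third approach compared above, in-context learning means prepending a handful of labeled demonstrations to the prompt rather than updating model weights. A minimal sketch of constructing such a prompt for an NLU task follows; the intent labels and utterances are invented for illustration, and the paper's actual tasks and languages differ:

```python
# A minimal sketch of in-context learning (ICL) for a few-shot NLU task:
# k labeled demonstrations are concatenated into the prompt and the model
# is expected to complete the label for the final query. The intent labels
# and examples here are invented for illustration.
def build_icl_prompt(demos: list[tuple[str, str]], query: str) -> str:
    lines = ["Classify the intent of each utterance."]
    for text, label in demos:
        lines.append(f"Utterance: {text}\nIntent: {label}")
    lines.append(f"Utterance: {query}\nIntent:")
    return "\n\n".join(lines)

demos = [
    ("Play some jazz music", "play_music"),
    ("What's the weather in Nairobi?", "get_weather"),
]
prompt = build_icl_prompt(demos, "Will it rain tomorrow?")
print(prompt)  # The LLM would be expected to complete "get_weather".
```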
- Exploring the Reliability of Large Language Models as Customized Evaluators for Diverse NLP Tasks [65.69651759036535]
We analyze whether large language models (LLMs) can serve as reliable alternatives to humans.
This paper explores both conventional tasks (e.g., story generation) and alignment tasks (e.g., math reasoning).
We find that LLM evaluators can generate unnecessary criteria or omit crucial criteria, resulting in a slight deviation from the experts.
arXiv Detail & Related papers (2023-10-30T17:04:35Z)
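
One way to guard against the failure mode reported above, where LLM evaluators add unnecessary criteria or omit crucial ones, is to pin the evaluator to a fixed rubric and reject deviating responses. A minimal sketch follows; `query_llm` is a hypothetical stub standing in for whatever chat-completion client is actually used, and the rubric criteria are illustrative:

```python
# A minimal sketch of using an LLM as an evaluator against a *fixed* rubric,
# one mitigation for the deviation noted above (models inventing or dropping
# criteria). `query_llm` is a hypothetical stub; substitute a real LLM client.
import json

RUBRIC = ["coherence", "factuality", "fluency"]  # illustrative criteria only

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in a real LLM client here")

def evaluate(task_output: str) -> dict[str, int]:
    prompt = (
        "Score the text on exactly these criteria, 1-5 each, and return JSON "
        f"with exactly these keys: {RUBRIC}.\n\nText:\n{task_output}"
    )
    scores = json.loads(query_llm(prompt))
    # Reject responses that add or omit criteria instead of silently accepting them.
    if set(scores) != set(RUBRIC):
        raise ValueError(f"evaluator deviated from rubric: {sorted(scores)}")
    return scores
```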
- Examining risks of racial biases in NLP tools for child protective services [78.81107364902958]
We focus on one such setting: child protective services (CPS).
Given well-established racial bias in this setting, we investigate possible ways deployed NLP is liable to increase racial disparities.
We document consistent algorithmic unfairness in NER models, possible algorithmic unfairness in coreference resolution models, and little evidence of exacerbated racial bias in risk prediction.
arXiv Detail & Related papers (2023-05-30T21:00:47Z)
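
A common way to probe the kind of NER disparity documented above is a name-perturbation test: fill the same sentence template with names associated with different demographic groups and compare recognition rates. A minimal sketch; `ner_model` is a hypothetical callable and the name lists are placeholders, not the study's actual test sets:

```python
# A minimal sketch of a name-perturbation test for NER disparity, in the
# spirit of the racial-bias audit above. `ner_model` is a hypothetical
# callable returning (span, label) pairs; the name lists are illustrative.
from collections.abc import Callable

TEMPLATE = "{name} attended the family meeting on Tuesday."
NAME_GROUPS = {
    "group_a": ["Emily", "Greg"],      # placeholder name lists, not the
    "group_b": ["Lakisha", "Jamal"],   # study's actual test sets
}

def recognition_rate(ner_model: Callable[[str], list[tuple[str, str]]],
                     names: list[str]) -> float:
    # Fraction of names the model tags as PERSON in an identical context.
    hits = 0
    for name in names:
        entities = ner_model(TEMPLATE.format(name=name))
        hits += any(span == name and label == "PERSON" for span, label in entities)
    return hits / len(names)

def disparity(ner_model) -> float:
    rates = {g: recognition_rate(ner_model, ns) for g, ns in NAME_GROUPS.items()}
    return max(rates.values()) - min(rates.values())  # 0.0 means parity
```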
- Lessons Learned from a Citizen Science Project for Natural Language Processing [53.48988266271858]
Citizen Science is an alternative to crowdsourcing that is relatively unexplored in the context of NLP.
We conduct an exploratory study into engaging different groups of volunteers in Citizen Science for NLP by re-annotating parts of a pre-existing crowdsourced dataset.
Our results show that this can yield high-quality annotations and attract motivated volunteers, but also requires considering factors such as scalability, participation over time, and legal and ethical issues.
arXiv Detail & Related papers (2023-04-25T14:08:53Z)
- A Survey of Knowledge Enhanced Pre-trained Language Models [78.56931125512295]
We present a comprehensive review of Knowledge Enhanced Pre-trained Language Models (KE-PLMs).
For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG), and rule knowledge.
The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods.
arXiv Detail & Related papers (2022-11-11T04:29:02Z)
- A Survey of Methods for Addressing Class Imbalance in Deep-Learning Based Natural Language Processing [68.37496795076203]
We provide guidance for NLP researchers and practitioners dealing with imbalanced data.
We first discuss various types of controlled and real-world class imbalance.
We organize the methods by whether they are based on sampling, data augmentation, choice of loss function, staged learning, or model design.
arXiv Detail & Related papers (2022-10-10T13:26:40Z)
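
Among the loss-based methods in the survey's taxonomy above, one of the simplest is inverse-frequency class weighting in cross-entropy. A minimal PyTorch sketch with invented class counts:

```python
# A minimal sketch of one loss-based remedy from the survey's taxonomy:
# inverse-frequency class weighting in cross-entropy. The class counts
# are invented for illustration. Requires PyTorch.
import torch
import torch.nn as nn

class_counts = torch.tensor([9000.0, 800.0, 200.0])  # hypothetical label frequencies
weights = class_counts.sum() / (len(class_counts) * class_counts)

loss_fn = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(4, 3)            # batch of 4 examples, 3 classes
labels = torch.tensor([0, 2, 2, 1])
loss = loss_fn(logits, labels)        # errors on rare classes 1 and 2 weigh more
print(loss.item())
```

Weighting the loss keeps the full dataset intact, which is why it is often tried before resampling or augmentation.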