Fairness Certification for Natural Language Processing and Large
Language Models
- URL: http://arxiv.org/abs/2401.01262v2
- Date: Wed, 3 Jan 2024 08:17:53 GMT
- Title: Fairness Certification for Natural Language Processing and Large
Language Models
- Authors: Vincent Freiberger, Erik Buchmann
- Abstract summary: We follow a qualitative research approach towards a fairness certification for NLP approaches.
We have systematically devised six fairness criteria for NLP, which can be further refined into 18 sub-categories.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural Language Processing (NLP) plays an important role in our daily lives,
particularly due to the enormous progress of Large Language Models (LLM).
However, NLP has many fairness-critical use cases, e.g., as an expert system in
recruitment or as an LLM-based tutor in education. Since NLP is based on human
language, potentially harmful biases can diffuse into NLP systems and produce
unfair results, discriminate against minorities, or raise legal issues.
Hence, it is important to develop a fairness certification for NLP approaches.
We follow a qualitative research approach towards a fairness certification for
NLP. In particular, we have reviewed a large body of literature on algorithmic
fairness, and we have conducted semi-structured expert interviews with a wide
range of experts from that area. We have systematically devised six fairness
criteria for NLP, which can be further refined into 18 sub-categories. Our
criteria offer a foundation for operationalizing and testing processes to
certify fairness, both from the perspective of the auditor and the audited
organization.
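
The abstract does not name the six criteria or their 18 sub-categories, but the idea of operationalizing a criteria hierarchy for certification can be illustrated with a small sketch. The following Python snippet shows one way an auditor might represent criteria and sub-categories as a machine-checkable checklist; every criterion name in it is a hypothetical placeholder, not the paper's actual taxonomy:

```python
# A minimal sketch of operationalizing fairness criteria with sub-categories
# as an audit checklist. The abstract does not name the six criteria or
# 18 sub-categories, so every name below is a hypothetical placeholder.
from dataclasses import dataclass, field

@dataclass
class SubCriterion:
    name: str
    passed: bool | None = None   # None = not yet assessed by the auditor

@dataclass
class FairnessCriterion:
    name: str
    sub_criteria: list[SubCriterion] = field(default_factory=list)

    def status(self) -> str:
        # A criterion passes only when every sub-category has been
        # assessed and passed.
        if any(s.passed is None for s in self.sub_criteria):
            return "pending"
        return "pass" if all(s.passed for s in self.sub_criteria) else "fail"

# Hypothetical example criterion; the real taxonomy comes from the paper.
representation = FairnessCriterion(
    "representation",
    [SubCriterion("training-data coverage"), SubCriterion("output stereotyping")],
)
representation.sub_criteria[0].passed = True
representation.sub_criteria[1].passed = True
print(representation.name, representation.status())  # representation pass
```

A structure like this lets both the auditor and the audited organization track which sub-categories remain open during a certification process.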
Related papers
- Explainability in Practice: A Survey of Explainable NLP Across Various Domains [2.494550479408289]
This review explores explainable NLP (XNLP) with a focus on its practical deployment and real-world applications.
The paper concludes by suggesting future research directions that could enhance the understanding and broader application of XNLP.
arXiv Detail & Related papers (2025-02-02T16:18:44Z)
- A Novel Psychometrics-Based Approach to Developing Professional Competency Benchmark for Large Language Models [0.0]
We propose a comprehensive approach to benchmark development based on rigorous psychometric principles.
We make the first attempt to illustrate this approach by creating a new benchmark in the field of pedagogy and education.
We construct a novel benchmark guided by Bloom's taxonomy and rigorously designed by a consortium of education experts trained in test development.
arXiv Detail & Related papers (2024-10-29T19:32:43Z)
- Towards Systematic Monolingual NLP Surveys: GenA of Greek NLP [2.3499129784547663]
This study introduces a generalizable methodology for creating systematic and comprehensive monolingual NLP surveys.
We apply this methodology to Greek NLP (2012-2023), providing a comprehensive overview of its current state and challenges.
arXiv Detail & Related papers (2024-07-13T12:01:52Z)
- Analyzing and Adapting Large Language Models for Few-Shot Multilingual NLU: Are We There Yet? [82.02076369811402]
Supervised fine-tuning (SFT), supervised instruction tuning (SIT), and in-context learning (ICL) are three alternative, de facto standard approaches to few-shot learning.
We present an extensive and systematic comparison of the three approaches, testing them on six high- and low-resource languages, three different NLU tasks, and a myriad of language and domain setups.
Our observations show that supervised instruction tuning has the best trade-off between performance and resource requirements.
arXiv Detail & Related papers (2024-03-04T10:48:13Z)
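
For readers unfamiliar with the third approach compared above, in-context learning means prepending a handful of labeled demonstrations to the prompt rather than updating model weights. A minimal sketch of constructing such a prompt for an NLU task follows; the intent labels and utterances are invented for illustration, and the paper's actual tasks and languages differ:

```python
# A minimal sketch of in-context learning (ICL) for a few-shot NLU task:
# k labeled demonstrations are concatenated into the prompt and the model
# is expected to complete the label for the final query. The intent labels
# and examples here are invented for illustration.
def build_icl_prompt(demos: list[tuple[str, str]], query: str) -> str:
    lines = ["Classify the intent of each utterance."]
    for text, label in demos:
        lines.append(f"Utterance: {text}\nIntent: {label}")
    lines.append(f"Utterance: {query}\nIntent:")
    return "\n\n".join(lines)

demos = [
    ("Play some jazz music", "play_music"),
    ("What's the weather in Nairobi?", "get_weather"),
]
prompt = build_icl_prompt(demos, "Will it rain tomorrow?")
print(prompt)  # The LLM would be expected to complete "get_weather".
```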
- Exploring the Reliability of Large Language Models as Customized Evaluators for Diverse NLP Tasks [65.69651759036535]
We analyze whether large language models (LLMs) can serve as reliable alternatives to humans.
This paper explores both conventional tasks (e.g., story generation) and alignment tasks (e.g., math reasoning).
We find that LLM evaluators can generate unnecessary criteria or omit crucial criteria, resulting in a slight deviation from the experts.
arXiv Detail & Related papers (2023-10-30T17:04:35Z)
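
One way to guard against the failure mode reported above, where LLM evaluators add unnecessary criteria or omit crucial ones, is to pin the evaluator to a fixed rubric and reject deviating responses. A minimal sketch follows; `query_llm` is a hypothetical stub standing in for whatever chat-completion client is actually used, and the rubric criteria are illustrative:

```python
# A minimal sketch of using an LLM as an evaluator against a *fixed* rubric,
# one mitigation for the deviation noted above (models inventing or dropping
# criteria). `query_llm` is a hypothetical stub; substitute a real LLM client.
import json

RUBRIC = ["coherence", "factuality", "fluency"]  # illustrative criteria only

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in a real LLM client here")

def evaluate(task_output: str) -> dict[str, int]:
    prompt = (
        "Score the text on exactly these criteria, 1-5 each, and return JSON "
        f"with exactly these keys: {RUBRIC}.\n\nText:\n{task_output}"
    )
    scores = json.loads(query_llm(prompt))
    # Reject responses that add or omit criteria instead of silently accepting them.
    if set(scores) != set(RUBRIC):
        raise ValueError(f"evaluator deviated from rubric: {sorted(scores)}")
    return scores
```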
- Examining risks of racial biases in NLP tools for child protective services [78.81107364902958]
We focus on one such setting: child protective services (CPS).
Given well-established racial bias in this setting, we investigate possible ways deployed NLP is liable to increase racial disparities.
We document consistent algorithmic unfairness in NER models, possible algorithmic unfairness in coreference resolution models, and little evidence of exacerbated racial bias in risk prediction.
arXiv Detail & Related papers (2023-05-30T21:00:47Z)
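
A common way to probe the kind of NER disparity documented above is a name-perturbation test: fill the same sentence template with names associated with different demographic groups and compare recognition rates. A minimal sketch; `ner_model` is a hypothetical callable and the name lists are placeholders, not the study's actual test sets:

```python
# A minimal sketch of a name-perturbation test for NER disparity, in the
# spirit of the racial-bias audit above. `ner_model` is a hypothetical
# callable returning (span, label) pairs; the name lists are illustrative.
from collections.abc import Callable

TEMPLATE = "{name} attended the family meeting on Tuesday."
NAME_GROUPS = {
    "group_a": ["Emily", "Greg"],      # placeholder name lists, not the
    "group_b": ["Lakisha", "Jamal"],   # study's actual test sets
}

def recognition_rate(ner_model: Callable[[str], list[tuple[str, str]]],
                     names: list[str]) -> float:
    # Fraction of names the model tags as PERSON in an identical context.
    hits = 0
    for name in names:
        entities = ner_model(TEMPLATE.format(name=name))
        hits += any(span == name and label == "PERSON" for span, label in entities)
    return hits / len(names)

def disparity(ner_model) -> float:
    rates = {g: recognition_rate(ner_model, ns) for g, ns in NAME_GROUPS.items()}
    return max(rates.values()) - min(rates.values())  # 0.0 means parity
```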
- Lessons Learned from a Citizen Science Project for Natural Language Processing [53.48988266271858]
Citizen Science is an alternative to crowdsourcing that is relatively unexplored in the context of NLP.
We conduct an exploratory study into engaging different groups of volunteers in Citizen Science for NLP by re-annotating parts of a pre-existing crowdsourced dataset.
Our results show that this can yield high-quality annotations and attract motivated volunteers, but also requires considering factors such as scalability, participation over time, and legal and ethical issues.
arXiv Detail & Related papers (2023-04-25T14:08:53Z)
- A Survey of Knowledge Enhanced Pre-trained Language Models [78.56931125512295]
We present a comprehensive review of Knowledge Enhanced Pre-trained Language Models (KE-PLMs).
For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG), and rule knowledge.
The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods.
arXiv Detail & Related papers (2022-11-11T04:29:02Z)
- A Survey of Methods for Addressing Class Imbalance in Deep-Learning Based Natural Language Processing [68.37496795076203]
We provide guidance for NLP researchers and practitioners dealing with imbalanced data.
We first discuss various types of controlled and real-world class imbalance.
We organize the methods by whether they are based on sampling, data augmentation, choice of loss function, staged learning, or model design.
arXiv Detail & Related papers (2022-10-10T13:26:40Z)
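
Among the loss-based methods in the survey's taxonomy above, one of the simplest is inverse-frequency class weighting in cross-entropy. A minimal PyTorch sketch with invented class counts:

```python
# A minimal sketch of one loss-based remedy from the survey's taxonomy:
# inverse-frequency class weighting in cross-entropy. The class counts
# are invented for illustration. Requires PyTorch.
import torch
import torch.nn as nn

class_counts = torch.tensor([9000.0, 800.0, 200.0])  # hypothetical label frequencies
weights = class_counts.sum() / (len(class_counts) * class_counts)

loss_fn = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(4, 3)            # batch of 4 examples, 3 classes
labels = torch.tensor([0, 2, 2, 1])
loss = loss_fn(logits, labels)        # errors on rare classes 1 and 2 weigh more
print(loss.item())
```

Weighting the loss keeps the full dataset intact, which is why it is often tried before resampling or augmentation.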