Re-contextualizing Fairness in NLP: The Case of India
- URL: http://arxiv.org/abs/2209.12226v5
- Date: Mon, 21 Nov 2022 06:29:18 GMT
- Title: Re-contextualizing Fairness in NLP: The Case of India
- Authors: Shaily Bhatt, Sunipa Dev, Partha Talukdar, Shachi Dave, Vinodkumar
Prabhakaran
- Abstract summary: We focus on NLP fairness in the context of India.
We build resources for fairness evaluation in the Indian context.
We then delve deeper into social stereotypes for Region and Religion, demonstrating their prevalence in corpora and models.
- Score: 9.919007681131804
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent research has revealed undesirable biases in NLP data and models.
However, these efforts focus on social disparities in the West and are not
directly portable to other geo-cultural contexts. In this paper, we focus on
NLP fairness in the context of India. We start with a brief account of the
prominent axes of social disparities in India. We build resources for fairness
evaluation in the Indian context and use them to demonstrate prediction biases
along some of the axes. We then delve deeper into social stereotypes for Region
and Religion, demonstrating their prevalence in corpora and models. Finally, we
outline a holistic research agenda to re-contextualize NLP fairness research
for the Indian context, accounting for Indian societal context, bridging
technological gaps in NLP capabilities and resources, and adapting to Indian
cultural values. While we focus on India, this framework can be generalized to
other geo-cultural contexts.
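As a rough illustration of the kind of prediction-bias probe the abstract describes, one can perturb identity terms in otherwise identical template sentences and compare a classifier's scores. This is a generic perturbation-style sketch, not the paper's exact protocol; the classifier, templates, and identity terms below are illustrative assumptions.

```python
# Illustrative sketch (not the paper's exact setup): probe a sentiment
# classifier for prediction bias by swapping Indian identity terms into
# otherwise identical templates and comparing scores across terms.
from transformers import pipeline  # assumes the `transformers` library

# Hypothetical choices: any text classifier and identity term list would do.
clf = pipeline("sentiment-analysis")  # default English sentiment model

templates = ["{} people are my neighbours.", "My colleague is {}."]
identity_terms = ["Bihari", "Tamilian", "Hindu", "Muslim"]  # region/religion axes

scores = {}
for term in identity_terms:
    outs = clf([t.format(term) for t in templates])
    # Signed score: positive-class score, negated for negative predictions.
    signed = [o["score"] if o["label"] == "POSITIVE" else -o["score"] for o in outs]
    scores[term] = sum(signed) / len(signed)

for term, s in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{term:10s} mean signed sentiment: {s:+.3f}")
# Large gaps between identity terms on identical templates suggest prediction bias.
```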
Related papers
- The Call for Socially Aware Language Technologies [94.6762219597438]
We argue that many of these issues share a common core: a lack of awareness of the factors, context, and implications of the social environment in which NLP operates.
We argue that substantial challenges remain for NLP to develop social awareness and that we are just at the beginning of a new era for the field.
arXiv Detail & Related papers (2024-05-03T18:12:39Z)
- A Material Lens on Coloniality in NLP [57.63027898794855]
Coloniality is the continuation of colonial harms beyond "official" colonization.
We argue that coloniality is implicitly embedded in and amplified by NLP data, algorithms, and software.
arXiv Detail & Related papers (2023-11-14T18:52:09Z)
- Indian-BhED: A Dataset for Measuring India-Centric Biases in Large Language Models [18.201326983938014]
Large Language Models (LLMs) can encode societal biases, exposing their users to representational harms.
We quantify stereotypical bias in popular LLMs according to an Indian-centric frame through Indian-BhED, a first-of-its-kind dataset.
We find that the majority of LLMs tested have a strong propensity to output stereotypes in the Indian context.
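Indian-BhED's exact evaluation protocol is not reproduced here, but one common way to quantify an LM's propensity for stereotypes is to compare the likelihood it assigns to stereotypical versus anti-stereotypical sentence variants. A minimal sketch with GPT-2 and placeholder pairs (both assumptions, not the dataset's contents) follows.

```python
# Sketch of a common stereotype-propensity measure (assumed, not the
# Indian-BhED protocol): compare the log-likelihood an LM assigns to a
# stereotypical sentence against an anti-stereotypical counterpart.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sentence_logprob(text: str) -> float:
    """Total log-probability of `text` under the LM."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # labels=ids makes the model return the mean token NLL as `loss`.
        loss = lm(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)  # un-average over predicted tokens

# Placeholder (stereotypical, anti-stereotypical) pairs; real pairs would
# come from a curated dataset such as Indian-BhED.
pairs = [
    ("People from group A are hardworking.", "People from group A are lazy."),
]
stereo_preferred = sum(sentence_logprob(s) > sentence_logprob(a) for s, a in pairs)
print(f"Stereotypical variant preferred in {stereo_preferred}/{len(pairs)} pairs.")
```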
arXiv Detail & Related papers (2023-09-15T17:38:41Z)
- Are Models Trained on Indian Legal Data Fair? [20.162205920441895]
We present an initial investigation of fairness from the Indian perspective in the legal domain.
We show that a decision tree model trained for the bail prediction task has an overall fairness disparity of 0.237 between input features associated with Hindus and Muslims.
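The summary does not spell out which fairness metric produces the 0.237 figure; a minimal sketch of one standard group-fairness disparity (demographic parity difference, assumed here purely for illustration) shows how such a number can be computed from model predictions and group labels.

```python
# Sketch of a standard group-fairness disparity (assumed here to be the
# demographic parity difference; the paper's exact metric may differ):
# |P(bail granted | group=Hindu) - P(bail granted | group=Muslim)|.
import numpy as np

def demographic_parity_difference(y_pred: np.ndarray, group: np.ndarray) -> float:
    """Absolute gap in positive-prediction rates between two groups."""
    rate_a = y_pred[group == "Hindu"].mean()
    rate_b = y_pred[group == "Muslim"].mean()
    return abs(rate_a - rate_b)

# Toy data: 1 = bail granted by the model, 0 = denied.
y_pred = np.array([1, 1, 0, 1, 0, 0, 1, 0])
group = np.array(["Hindu"] * 4 + ["Muslim"] * 4)
print(demographic_parity_difference(y_pred, group))  # 0.5 on this toy data
```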
arXiv Detail & Related papers (2023-03-13T16:20:33Z)
- Cultural Re-contextualization of Fairness Research in Language Technologies in India [9.919007681131804]
Recent research has revealed undesirable biases in NLP data and models.
We re-contextualize fairness research for the Indian context, accounting for Indian societal context.
We also summarize findings from an empirical study on various social biases along different axes of disparities relevant to India.
arXiv Detail & Related papers (2022-11-21T06:37:45Z)
- An Inclusive Notion of Text [69.36678873492373]
We argue that clarity on the notion of text is crucial for reproducible and generalizable NLP.
We introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling.
arXiv Detail & Related papers (2022-11-10T14:26:43Z)
- Geographic Citation Gaps in NLP Research [63.13508571014673]
This work asks a series of questions on the relationship between geographical location and publication success.
We first created a dataset of 70,000 papers from the ACL Anthology, extracted their meta-information, and generated their citation network.
We show that not only are there substantial geographical disparities in paper acceptance and citation but also that these disparities persist even when controlling for a number of variables such as venue of publication and sub-field of NLP.
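A minimal sketch of how such a citation network might be assembled with networkx, assuming per-paper metadata with identifiers, countries, and reference lists (the field names are illustrative assumptions, not the paper's actual schema):

```python
# Sketch: build a citation graph from paper metadata and compare average
# citation counts by country (field names are hypothetical).
import networkx as nx

papers = [
    {"id": "P1", "country": "US", "references": ["P2"]},
    {"id": "P2", "country": "IN", "references": []},
    {"id": "P3", "country": "IN", "references": ["P1", "P2"]},
]

g = nx.DiGraph()
for p in papers:
    g.add_node(p["id"], country=p["country"])
for p in papers:
    for ref in p["references"]:
        g.add_edge(p["id"], ref)  # edge = "cites"

# In-degree approximates citation count; group by country to probe gaps.
by_country = {}
for node, deg in g.in_degree():
    by_country.setdefault(g.nodes[node]["country"], []).append(deg)
for country, degs in by_country.items():
    print(country, sum(degs) / len(degs))
```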
arXiv Detail & Related papers (2022-10-26T02:25:23Z)
- Square One Bias in NLP: Towards a Multi-Dimensional Exploration of the Research Manifold [88.83876819883653]
Through a manual classification of recent NLP research papers, we show that this square-one bias is indeed present.
We observe that when NLP research goes beyond the square-one setup, it focuses not only on accuracy but also on fairness or interpretability, yet typically along only a single dimension.
arXiv Detail & Related papers (2022-06-20T13:04:23Z)
- Re-imagining Algorithmic Fairness in India and Beyond [9.667710168953239]
We de-center algorithmic fairness and analyse AI power in India.
We find that data is not always reliable due to socio-economic factors.
We provide a roadmap to re-contextualise data and models, empower oppressed communities, and enable Fair-ML ecosystems.
arXiv Detail & Related papers (2021-01-25T10:20:57Z)
- Non-portability of Algorithmic Fairness in India [9.8164690355257]
We argue that a mere translation of technical fairness work to Indian subgroups may serve only as window dressing.
We call instead for a collective re-imagining of Fair-ML, by re-contextualising data and models, empowering oppressed communities, and, more importantly, enabling Fair-ML ecosystems.
arXiv Detail & Related papers (2020-12-03T23:14:13Z)
- Towards Debiasing Sentence Representations [109.70181221796469]
We show that Sent-Debias is effective in removing biases, and at the same time, preserves performance on sentence-level downstream tasks.
We hope that our work will inspire future research on characterizing and removing social biases from widely adopted sentence representations for fairer NLP.
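A minimal sketch in the spirit of Sent-Debias (simplified; the actual method estimates the bias subspace from contextualized embeddings of many templated counterfactual pairs, whereas the data below is synthetic):

```python
# Sketch of projection-based debiasing in the spirit of Sent-Debias:
# estimate a bias subspace from embedding differences of counterfactual
# sentence pairs, then remove each embedding's projection onto it.
import numpy as np

def bias_subspace(pair_diffs: np.ndarray, k: int = 1) -> np.ndarray:
    """Top-k principal directions of the pairwise difference vectors."""
    centered = pair_diffs - pair_diffs.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]  # (k, dim) orthonormal rows

def debias(embeddings: np.ndarray, subspace: np.ndarray) -> np.ndarray:
    """Subtract each embedding's projection onto the bias subspace."""
    proj = embeddings @ subspace.T @ subspace
    return embeddings - proj

# Toy example: 5 counterfactual pairs of 8-dim "sentence embeddings".
rng = np.random.default_rng(0)
emb_a, emb_b = rng.normal(size=(5, 8)), rng.normal(size=(5, 8))
v = bias_subspace(emb_a - emb_b, k=1)
clean = debias(rng.normal(size=(3, 8)), v)
print(np.allclose(clean @ v.T, 0))  # True: bias direction removed
```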
arXiv Detail & Related papers (2020-07-16T04:22:30Z)