A Material Lens on Coloniality in NLP
- URL: http://arxiv.org/abs/2311.08391v1
- Date: Tue, 14 Nov 2023 18:52:09 GMT
- Title: A Material Lens on Coloniality in NLP
- Authors: William Held, Camille Harris, Michael Best, Diyi Yang
- Abstract summary: Coloniality is the continuation of colonial harms beyond "official" colonization.
We argue that coloniality is implicitly embedded in and amplified by NLP data, algorithms, and software.
- Score: 57.63027898794855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Coloniality, the continuation of colonial harms beyond "official"
colonization, has pervasive effects across society and scientific fields.
Natural Language Processing (NLP) is no exception to this broad phenomenon. In
this work, we argue that coloniality is implicitly embedded in and amplified by
NLP data, algorithms, and software. We formalize this analysis using
Actor-Network Theory (ANT): an approach to understanding social phenomena
through the network of relationships between human stakeholders and technology.
We use our Actor-Network to guide a quantitative survey of the geography of
different phases of NLP research, providing evidence that inequality along
colonial boundaries increases as NLP builds on itself. Based on this, we argue
that combating coloniality in NLP requires not only changing current values but
also active work to remove the accumulation of colonial ideals in our
foundational data and algorithms.
Related papers
- The Nature of NLP: Analyzing Contributions in NLP Papers [77.31665252336157]
We quantitatively investigate what constitutes NLP research by examining research papers.
Our findings reveal a rising involvement of machine learning in NLP since the early nineties.
Since 2020, there has been a resurgence of focus on language and people.
arXiv Detail & Related papers (2024-09-29T01:29:28Z)
- Reimagining Communities through Transnational Bengali Decolonial Discourse with YouTube Content Creators [2.3977477455080085]
This research seeks to understand people's motivations and strategies for engaging in video-mediated decolonial discourse.
We discuss how our work demonstrates the potential of the sociomateriality of decolonial discourse online.
arXiv Detail & Related papers (2024-07-18T03:41:39Z)
- Decolonial AI as Disenclosure [0.0]
Machine learning and AI engender 'AI colonialism', a term that conceptually overlaps with 'data colonialism', as a form of injustice.
Politically, it enforces digital capitalism's hegemony. Ecologically, it negatively impacts the environment and intensifies the extraction of natural resources and consumption of energy.
arXiv Detail & Related papers (2024-05-23T09:45:37Z)
- Natural Language Processing for Dialects of a Language: A Survey [56.93337350526933]
State-of-the-art natural language processing (NLP) models are trained on massive training corpora, and report a superlative performance on evaluation datasets.
This survey delves into an important attribute of these datasets: the dialect of a language.
Motivated by the performance degradation of NLP models on dialectal datasets and its implications for the equity of language technologies, we survey past research in NLP for dialects in terms of datasets and approaches.
arXiv Detail & Related papers (2024-01-11T03:04:38Z)
- In Consideration of Indigenous Data Sovereignty: Data Mining as a Colonial Practice [0.0]
This research stresses the need for the inclusion of Indigenous Data Sovereignty.
To support this hypothesis and address the problem, the CARE Principles for Indigenous Data Governance are applied.
arXiv Detail & Related papers (2023-09-19T00:00:35Z)
- Decolonial AI Alignment: Openness, Viśeṣa-Dharma, and Including Excluded Knowledges [22.21928139733195]
I argue that colonialism has a history of altering the beliefs and values of colonized peoples.
I suggest that AI alignment be decolonialized using three forms of openness.
One concept used is viśeṣa-dharma, or context-specific notions of right and wrong.
arXiv Detail & Related papers (2023-09-10T14:04:21Z)
- Beyond Good Intentions: Reporting the Research Landscape of NLP for Social Good [115.1507728564964]
We introduce NLP4SG Papers, a scientific dataset with three associated tasks.
These tasks help identify NLP4SG papers and characterize the NLP4SG landscape.
We use state-of-the-art NLP models to address each of these tasks and use them on the entire ACL Anthology.
arXiv Detail & Related papers (2023-05-09T14:16:25Z)
- Geographic Citation Gaps in NLP Research [63.13508571014673]
This work asks a series of questions on the relationship between geographical location and publication success.
We first created a dataset of 70,000 papers from the ACL Anthology, extracted their meta-information, and generated their citation network.
We show that not only are there substantial geographical disparities in paper acceptance and citation but also that these disparities persist even when controlling for a number of variables such as venue of publication and sub-field of NLP.
arXiv Detail & Related papers (2022-10-26T02:25:23Z)
- Re-contextualizing Fairness in NLP: The Case of India [9.919007681131804]
We focus on NLP fairness in the context of India.
We build resources for fairness evaluation in the Indian context.
We then delve deeper into social stereotypes for Region and Religion, demonstrating their prevalence in corpora and models.
arXiv Detail & Related papers (2022-09-25T13:56:13Z)
- Grounding 'Grounding' in NLP [59.28887479119075]
As a community, we use the term "grounding" broadly to reference any linking of text to data or non-textual modality.
Cognitive Science more formally defines "grounding" as the process of establishing what mutual information is required for successful communication.
arXiv Detail & Related papers (2021-06-04T00:40:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.