Global Inequalities in the Production of Artificial Intelligence: A Four-Country Study on Data Work
- URL: http://arxiv.org/abs/2410.14230v1
- Date: Fri, 18 Oct 2024 07:23:17 GMT
- Title: Global Inequalities in the Production of Artificial Intelligence: A Four-Country Study on Data Work
- Authors: Antonio A. Casilli, Paola Tubaro, Maxime Cornet, Clément Le Ludec, Juana Torres-Cierpe, Matheus Viana Braz,
- Abstract summary: Labor plays a major, albeit largely unrecognized role in the development of artificial intelligence.
Online platforms and networks of subcontractors recruit data workers to execute tasks in the shadow of AI production.
This study unveils the resulting complexities by comparing the working conditions and the profiles of data workers in Venezuela, Brazil, Madagascar, and as an example of a richer country, France.
- Score: 0.0
- License:
- Abstract: Labor plays a major, albeit largely unrecognized role in the development of artificial intelligence. Machine learning algorithms are predicated on data-intensive processes that rely on humans to execute repetitive and difficult-to-automate, but no less essential, tasks such as labeling images, sorting items in lists, recording voice samples, and transcribing audio files. Online platforms and networks of subcontractors recruit data workers to execute such tasks in the shadow of AI production, often in lower-income countries with long-standing traditions of informality and lessregulated labor markets. This study unveils the resulting complexities by comparing the working conditions and the profiles of data workers in Venezuela, Brazil, Madagascar, and as an example of a richer country, France. By leveraging original data collected over the years 2018-2023 via a mixed-method design, we highlight how the cross-country supply chains that link data workers to core AI production sites are reminiscent of colonial relationships, maintain historical economic dependencies, and generate inequalities that compound with those inherited from the past. The results also point to the importance of less-researched, non-English speaking countries to understand key features of the production of AI solutions at planetary scale.
Related papers
- Towards the Terminator Economy: Assessing Job Exposure to AI through LLMs [10.844598404826355]
One-third of U.S. employment is highly exposed to AI, primarily in high-skill jobs.
This exposure correlates positively with employment and wage growth from 2019 to 2023.
arXiv Detail & Related papers (2024-07-27T08:14:18Z) - Leveraging Data Augmentation for Process Information Extraction [0.0]
We investigate the application of data augmentation for natural language text data.
Data augmentation is an important component in enabling machine learning methods for the task of business process model generation from natural language text.
arXiv Detail & Related papers (2024-04-11T06:32:03Z) - STAR: Boosting Low-Resource Information Extraction by Structure-to-Text
Data Generation with Large Language Models [56.27786433792638]
STAR is a data generation method that leverages Large Language Models (LLMs) to synthesize data instances.
We design fine-grained step-by-step instructions to obtain the initial data instances.
Our experiments show that the data generated by STAR significantly improve the performance of low-resource event extraction and relation extraction tasks.
arXiv Detail & Related papers (2023-05-24T12:15:19Z) - The Dimensions of Data Labor: A Road Map for Researchers, Activists, and
Policymakers to Empower Data Producers [14.392208044851976]
Data producers have little say in what data is captured, how it is used, or who it benefits.
Organizations with the ability to access and process this data, e.g. OpenAI and Google, possess immense power in shaping the technology landscape.
By synthesizing related literature that reconceptualizes the production of data for computing as data labor'', we outline opportunities for researchers, policymakers, and activists to empower data producers.
arXiv Detail & Related papers (2023-05-22T17:11:22Z) - Why is AI not a Panacea for Data Workers? An Interview Study on Human-AI
Collaboration in Data Storytelling [59.08591308749448]
We interviewed eighteen data workers from both industry and academia to learn where and how they would like to collaborate with AI.
Surprisingly, though the participants showed excitement about collaborating with AI, many of them also expressed reluctance and pointed out nuanced reasons.
arXiv Detail & Related papers (2023-04-17T15:30:05Z) - Human-Centric Multimodal Machine Learning: Recent Advances and Testbed
on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z) - Data-centric AI: Perspectives and Challenges [51.70828802140165]
Data-centric AI (DCAI) advocates a fundamental shift from model advancements to ensuring data quality and reliability.
We bring together three general missions: training data development, inference data development, and data maintenance.
arXiv Detail & Related papers (2023-01-12T05:28:59Z) - Privacy Adhering Machine Un-learning in NLP [66.17039929803933]
In real world industry use Machine Learning to build models on user data.
Such mandates require effort both in terms of data as well as model retraining.
continuous removal of data and model retraining steps do not scale.
We propose textitMachine Unlearning to tackle this challenge.
arXiv Detail & Related papers (2022-12-19T16:06:45Z) - Deep Learning based pipeline for anomaly detection and quality
enhancement in industrial binder jetting processes [68.8204255655161]
Anomaly detection describes methods of finding abnormal states, instances or data points that differ from a normal value space.
This paper contributes to a data-centric way of approaching artificial intelligence in industrial production.
arXiv Detail & Related papers (2022-09-21T08:14:34Z) - The Data-Production Dispositif [0.0]
This paper investigates outsourced machine learning data work in Latin America by studying three platforms in Venezuela and a BPO in Argentina.
We lean on the Foucauldian notion of dispositif to define the data-production dispositif as an ensemble of discourses, actions, and objects strategically disposed to (re)produce power/knowledge relations in data and labor.
We conclude by stressing the importance of counteracting the data-production dispositif by fighting alienation and precarization, and empowering data workers to become assets in the quest for high-quality data.
arXiv Detail & Related papers (2022-05-24T10:51:05Z) - DataOps for Societal Intelligence: a Data Pipeline for Labor Market
Skills Extraction and Matching [5.842787579447653]
We formulate and solve this problem using DataOps models.
We then focus on the critical task of skills extraction from resumes.
We showcase preliminary results with applied machine learning on real data.
arXiv Detail & Related papers (2021-04-05T15:37:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.