A methodology to characterize bias and harmful stereotypes in natural
language processing in Latin America
- URL: http://arxiv.org/abs/2207.06591v3
- Date: Tue, 28 Mar 2023 21:22:17 GMT
- Title: A methodology to characterize bias and harmful stereotypes in natural
language processing in Latin America
- Authors: Laura Alonso Alemany, Luciana Benotti, Hernán Maina, Lucía
González, Mariela Rajngewerc, Lautaro Martínez, Jorge Sánchez, Mauro
Schilman, Guido Ivetta, Alexia Halvorsen, Amanda Mata Rojo, Matías Bordone,
Beatriz Busaniche
- Abstract summary: We show how social scientists, domain experts, and machine learning experts can collaboratively explore biases and harmful stereotypes in word embeddings and large language models.
Our methodology is based on the following principles.
- Score: 2.05094736006609
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Automated decision-making systems, especially those based on natural language
processing, are pervasive in our lives. They are not only behind the internet
search engines we use daily, but also take more critical roles: selecting
candidates for a job, determining suspects of a crime, diagnosing autism and
more. Such automated systems make errors, which may be harmful in many ways, be
it because of the severity of the consequences (as in health issues) or because
of the sheer number of people they affect. When errors made by an automated
system affect one population more than others, we call the system
"biased".
Most modern natural language technologies are based on artifacts obtained
from enormous volumes of text using machine learning, namely language models
and word embeddings. Since they are created by applying subsymbolic machine
learning, mostly artificial neural networks, they are opaque and practically
uninterpretable by direct inspection, thus making it very difficult to audit
them.
In this paper, we present a methodology that spells out how social
scientists, domain experts, and machine learning experts can collaboratively
explore biases and harmful stereotypes in word embeddings and large language
models. Our methodology is based on the following principles:
* focus on the linguistic manifestations of discrimination in word embeddings and language models, not on the mathematical properties of the models
* reduce the technical barrier for discrimination experts, be they social scientists, domain experts, or others
* characterize bias through a qualitative, exploratory process in addition to a metric-based approach
* address mitigation as part of the training process, not as an afterthought
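As a rough, hypothetical illustration of the kind of exploration the methodology targets (relating identity terms to stereotyped attribute terms in an embedding space), the sketch below computes a simple WEAT-style association score from cosine similarities. The word lists, toy vectors, and the bias_score helper are invented for this example; they are not the authors' tool, which centers collaborative, qualitative exploration by discrimination experts rather than any single metric.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def mean_similarity(word, attributes, vectors):
    """Average cosine similarity between `word` and a list of attribute words."""
    return float(np.mean([cosine(vectors[word], vectors[a]) for a in attributes]))

def bias_score(target, pleasant, unpleasant, vectors):
    """Hypothetical WEAT-style association: positive means `target` sits closer
    to the pleasant attribute list than to the unpleasant one."""
    return mean_similarity(target, pleasant, vectors) - mean_similarity(target, unpleasant, vectors)

# Toy random vectors for illustration only; in practice these would come from a
# pretrained embedding model for the language and region under study.
rng = np.random.default_rng(0)
vocab = ["inmigrante", "ciudadano", "honesto", "trabajador", "criminal", "peligroso"]
vectors = {w: rng.normal(size=50) for w in vocab}

pleasant = ["honesto", "trabajador"]
unpleasant = ["criminal", "peligroso"]
for target in ["inmigrante", "ciudadano"]:
    print(target, round(bias_score(target, pleasant, unpleasant, vectors), 3))
```

With random toy vectors the printed scores are meaningless; the point is only the shape of the probe: word lists chosen by the people studying or affected by the discrimination, a simple association score, and qualitative inspection of the results side by side.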
Related papers
- Combatting Human Trafficking in the Cyberspace: A Natural Language
Processing-Based Methodology to Analyze the Language in Online Advertisements [55.2480439325792]
This project tackles the pressing issue of human trafficking in online C2C marketplaces through advanced Natural Language Processing (NLP) techniques.
We introduce a novel methodology for generating pseudo-labeled datasets with minimal supervision, serving as a rich resource for training state-of-the-art NLP models.
A key contribution is the implementation of an interpretability framework using Integrated Gradients, providing explainable insights crucial for law enforcement (a generic sketch of that attribution method appears after this list).
arXiv Detail & Related papers (2023-11-22T02:45:01Z) - Towards Bridging the Digital Language Divide [4.234367850767171]
Multilingual language processing systems often exhibit a hardwired, yet usually involuntary and hidden, representational preference towards certain languages.
We show that biased technology is often the result of research and development methodologies that do not do justice to the complexity of the languages being represented.
We present a new initiative that aims at reducing linguistic bias through both technological design and methodology.
arXiv Detail & Related papers (2023-07-25T10:53:20Z) - National Origin Discrimination in Deep-learning-powered Automated Resume
Screening [3.251347385432286]
Many companies and organizations have started to use some form of AIenabled auto mated tools to assist in their hiring process.
There are increasing concerns on unfair treatment to candidates, caused by underlying bias in AI systems.
This study examined deep learning methods, a recent technology breakthrough, with focus on their application to automated resume screening.
arXiv Detail & Related papers (2023-07-13T01:35:29Z) - Ecosystem-level Analysis of Deployed Machine Learning Reveals Homogeneous Outcomes [72.13373216644021]
We study the societal impact of machine learning by considering the collection of models that are deployed in a given context.
We find deployed machine learning is prone to systemic failure, meaning some users are exclusively misclassified by all models available.
These examples demonstrate ecosystem-level analysis has unique strengths for characterizing the societal impact of machine learning.
arXiv Detail & Related papers (2023-07-12T01:11:52Z) - Language-Driven Representation Learning for Robotics [115.93273609767145]
Recent work in visual representation learning for robotics demonstrates the viability of learning from large video datasets of humans performing everyday tasks.
We introduce a framework for language-driven representation learning from human videos and captions.
We find that Voltron's language-driven learning outperforms the prior state of the art, especially on targeted problems requiring higher-level control.
arXiv Detail & Related papers (2023-02-24T17:29:31Z) - Language technology practitioners as language managers: arbitrating data
bias and predictive bias in ASR [0.0]
We use the lens of language policy to analyse how current practices in training and testing ASR systems in industry lead to the data bias giving rise to these systematic error differences.
We propose a re-framing of language resources as (public) infrastructure which should not solely be designed for markets, but for, and with meaningful cooperation of, speech communities.
arXiv Detail & Related papers (2022-02-25T10:37:52Z) - Capturing Failures of Large Language Models via Human Cognitive Biases [18.397404180932373]
We show that OpenAI's Codex errs based on how the input prompt is framed, adjusts outputs towards anchors, and is biased towards outputs that mimic frequent training examples.
Our experiments suggest that cognitive science can be a useful jumping-off point to better understand how contemporary machine learning systems behave.
arXiv Detail & Related papers (2022-02-24T18:58:52Z) - My Teacher Thinks The World Is Flat! Interpreting Automatic Essay
Scoring Mechanism [71.34160809068996]
Recent work shows that automated scoring systems are prone to even common-sense adversarial samples.
We utilize recent advances in interpretability to find the extent to which features such as coherence, content and relevance are important for automated scoring mechanisms.
We also find that since the models are not semantically grounded with world knowledge and common sense, adding false facts such as "the world is flat" actually increases the score instead of decreasing it.
arXiv Detail & Related papers (2020-12-27T06:19:20Z) - Curious Case of Language Generation Evaluation Metrics: A Cautionary
Tale [52.663117551150954]
A few popular metrics remain as the de facto metrics to evaluate tasks such as image captioning and machine translation.
This is partly due to ease of use, and partly because researchers expect to see them and know how to interpret them.
In this paper, we urge the community for more careful consideration of how they automatically evaluate their models.
arXiv Detail & Related papers (2020-10-26T13:57:20Z) - Mechanisms for Handling Nested Dependencies in Neural-Network Language
Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z)