Assessing the Influence of Toxic and Gender Discriminatory Communication on Perceptible Diversity in OSS Projects
- URL: http://arxiv.org/abs/2403.08113v2
- Date: Thu, 14 Mar 2024 22:07:48 GMT
- Title: Assessing the Influence of Toxic and Gender Discriminatory Communication on Perceptible Diversity in OSS Projects
- Authors: Sayma Sultana, Gias Uddin, Amiangshu Bosu
- Abstract summary: The presence of toxic and gender-identity derogatory language in open-source software (OSS) communities has recently become a focal point for researchers.
This study aims to investigate how such content influences the gender, ethnicity, and tenure diversity of open-source software development teams.
- Score: 2.526146573337397
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The presence of toxic and gender-identity derogatory language in open-source software (OSS) communities has recently become a focal point for researchers. Such comments not only lead to frustration and disengagement among developers but may also influence their decision to leave OSS projects. Despite ample evidence suggesting that diverse teams enhance productivity, the existence of toxic or gender-identity discriminatory communications poses a significant threat to the participation of individuals from marginalized groups and, as such, may act as a barrier to fostering diversity and inclusion in OSS projects. However, there is a notable lack of research dedicated to exploring the association between gender-based toxic and derogatory language and the perceptible diversity of open-source software teams. Consequently, this study aims to investigate how such content influences the gender, ethnicity, and tenure diversity of open-source software development teams. To achieve this, we extract data from active GitHub projects, assess various project characteristics, and identify instances of toxic and gender-discriminatory language within issue/pull request comments. Using these attributes, we construct a regression model to explore how they associate with the perceptible diversity of those projects.
Related papers
- The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models [58.130894823145205]
We center transgender, nonbinary, and other gender-diverse identities to investigate how alignment procedures interact with pre-existing gender-diverse bias.
Our findings reveal that DPO-aligned models are particularly sensitive to supervised finetuning.
We conclude with recommendations tailored to DPO and broader alignment practices.
arXiv Detail & Related papers (2024-11-06T06:50:50Z) - Investigating the Impact of Interpersonal Challenges on Feeling Welcome in OSS [20.055675387635212]
Interpersonal challenges can inhibit a feeling of welcomeness among contributors, particularly from underrepresented groups.
Here, we investigate the effects of interpersonal challenges on the sense of welcomeness among diverse populations within OSS.
We found that different challenges have unique impacts on how people feel welcomed, with variations across gender, race, and disability groups.
arXiv Detail & Related papers (2024-11-03T15:11:50Z) - Assessing the Level of Toxicity Against Distinct Groups in Bangla Social Media Comments: A Comprehensive Investigation [0.0]
This study focuses on identifying toxic comments in the Bengali language targeting three specific groups: transgender people, indigenous people, and migrant people.
The methodology involves creating a dataset, manual annotation, and employing pre-trained transformer models like Bangla-BERT, bangla-bert-base, distil-BERT, and Bert-base-multilingual-cased for classification.
The experimental findings reveal that Bangla-BERT surpasses alternative models, achieving an F1-score of 0.8903.
arXiv Detail & Related papers (2024-09-25T17:48:59Z) - The Unseen Targets of Hate -- A Systematic Review of Hateful Communication Datasets [15.593796580973937]
Machine learning tools can only be as capable as the quality of the data they are trained on allows them.
We present a systematic review of the datasets for the automated detection of hateful communication introduced over the past decade.
We find, overall, a skewed representation of selected target identities and mismatches between the targets that research conceptualizes and ultimately includes in datasets.
arXiv Detail & Related papers (2024-05-14T12:50:33Z) - Greater than the sum of its parts: The role of minority and majority status in collaborative problem-solving communication [0.0]
Collaborative problem-solving (CPS) is a vital skill used both in the workplace and in educational environments.
Women and underrepresented minorities (URMs) often face obstacles during collaborative interactions that hinder their key participation in these problem-solving conversations.
Here, we explored the communication patterns of minority and non-minority individuals working together in a CPS task.
arXiv Detail & Related papers (2024-03-07T17:17:20Z) - Unveiling Diversity: Empowering OSS Project Leaders with Community Diversity and Turnover Dashboards [51.67585198094836]
CommunityTapestry is a dynamic real-time community dashboard.
It presents key diversity and turnover signals that we identified from the literature.
It helped project leaders identify areas of improvement and gave them actionable information.
arXiv Detail & Related papers (2023-12-13T22:12:57Z) - Quantifying the Dialect Gap and its Correlates Across Languages [69.18461982439031]
This work will lay the foundation for furthering the field of dialectal NLP by laying out evident disparities and identifying possible pathways for addressing them through mindful data collection.
arXiv Detail & Related papers (2023-10-23T17:42:01Z) - "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation [69.25368160338043]
Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life.
We assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation.
We introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community.
arXiv Detail & Related papers (2023-05-17T04:21:45Z) - Rumor Detection with Self-supervised Learning on Texts and Social Graph [101.94546286960642]
We propose contrastive self-supervised learning on heterogeneous information sources, so as to reveal their relations and characterize rumors better.
We term this framework Self-supervised Rumor Detection (SRD).
Extensive experiments on three real-world datasets validate the effectiveness of SRD for automatic rumor detection on social media.
arXiv Detail & Related papers (2022-04-19T12:10:03Z) - Fragments of the Past: Curating Peer Support with Perpetrators of Domestic Violence [88.37416552778178]
We report on a ten-month study where we worked with six support workers and eighteen perpetrators in the design and deployment of Fragments of the Past.
We share how crafting digitally-augmented artefacts - 'fragments' - of experiences of desisting from violence can translate messages for motivation and rapport between peers.
These insights provide the basis for practical considerations for future network design with challenging populations.
arXiv Detail & Related papers (2021-07-09T22:57:43Z) - Including Everyone, Everywhere: Understanding Opportunities and Challenges of Geographic Gender-Inclusion in OSS [15.757897147034873]
This study presents a multi-region geographical analysis of gender inclusion on GitHub.
Gender diversity is low across all parts of the world, with no substantial difference across regions.
There has been statistically significant improvement in diversity worldwide since 2014, with certain regions such as Africa improving at a faster pace.
arXiv Detail & Related papers (2020-10-02T07:40:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.