Assessing Group-level Gender Bias in Professional Evaluations: The Case
of Medical Student End-of-Shift Feedback
- URL: http://arxiv.org/abs/2206.00234v1
- Date: Wed, 1 Jun 2022 05:01:36 GMT
- Authors: Emmy Liu, Michael Henry Tessler, Nicole Dubosh, Katherine Mosher
Hiller, Roger Levy
- Abstract summary: Female physicians tend to be underrepresented in senior positions, make less money than their male counterparts and receive fewer promotions.
Prior work on gender bias in medical evaluations mainly searched for specific words using fixed dictionaries such as LIWC and focused on recommendation letters.
We use a dataset of written and quantitative assessments of medical student performance on individual shifts of work, collected across multiple institutions, to investigate the extent to which gender bias exists in a day-to-day context for medical students.
- Score: 14.065979111248497
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although approximately 50% of medical school graduates today are women,
female physicians tend to be underrepresented in senior positions, make less
money than their male counterparts and receive fewer promotions. There is a
growing body of literature demonstrating gender bias in various forms of
evaluation in medicine, but this work was mainly conducted by looking for
specific words using fixed dictionaries such as LIWC and focused on
recommendation letters. We use a dataset of written and quantitative
assessments of medical student performance on individual shifts of work,
collected across multiple institutions, to investigate the extent to which
gender bias exists in a day-to-day context for medical students. We investigate
differences in the narrative comments given to male and female students by both
male and female faculty assessors, using a fine-tuned BERT model. This allows us
to examine whether groups are written about in systematically different ways,
without relying on hand-crafted wordlists or topic models. We compare these
results to results from the traditional LIWC method and find that, although we
find no evidence of group-level gender bias in this dataset, terms related to
family and children are used more in feedback given to women.
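The dictionary-based comparison mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `FAMILY_TERMS` word list is a hypothetical stand-in (it is not LIWC's actual dictionary), the `category_rate` helper is invented for this example, and the comments are made-up toy data.

```python
import re

# Hypothetical word list for illustration only; NOT LIWC's actual "family" category.
FAMILY_TERMS = {"family", "child", "children", "mother", "father", "parent", "parents"}


def category_rate(comments, terms):
    """Return the fraction of tokens across all comments that appear in `terms`."""
    total = 0
    hits = 0
    for text in comments:
        # Lowercase and split into simple alphabetic tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        total += len(tokens)
        hits += sum(1 for tok in tokens if tok in terms)
    return hits / total if total else 0.0


# Made-up toy comments, grouped by the gender of the student being assessed.
comments_by_group = {
    "female": ["Great with the family of the patient", "Strong clinical reasoning"],
    "male": ["Strong clinical reasoning", "Efficient and thorough"],
}

# Per-group rate of dictionary terms; a real analysis would also test significance.
rates = {group: category_rate(texts, FAMILY_TERMS)
         for group, texts in comments_by_group.items()}
print(rates)
```

A fixed word list like this can only detect the categories someone thought to enumerate in advance, which is the limitation the abstract contrasts with the fine-tuned BERT approach.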
Related papers
- Evaluating Gender Bias in Large Language Models [0.8636148452563583]
The study examines the extent to which Large Language Models (LLMs) exhibit gender bias in pronoun selection in occupational contexts.
The jobs considered include a range of occupations, from those with a significant male presence to those with a notable female concentration.
The results show a positive correlation between the models' pronoun choices and the gender distribution present in U.S. labor force data.
arXiv Detail & Related papers (2024-11-14T22:23:13Z)
- Unveiling Gender Bias in Large Language Models: Using Teacher's Evaluation in Higher Education As an Example [0.0]
This paper investigates gender bias in Large Language Model (LLM)-generated teacher evaluations in higher education settings.
It applies a comprehensive analytical framework that includes Odds Ratio (OR) analysis, Word Embedding Association Test (WEAT), sentiment analysis, and contextual analysis.
Specifically, words related to approachability and support were used more frequently for female instructors, while words related to entertainment were predominantly used for male instructors.
arXiv Detail & Related papers (2024-09-15T07:50:33Z)
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark, AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-VL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models.
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z)
- Detecting Gender Bias in Course Evaluations [0.0]
We use different methods to examine and explore the data and find differences in what students write about courses depending on gender of the examiner.
Data from English and Swedish courses are evaluated and compared, in order to capture more nuance in the gender bias that might be found.
arXiv Detail & Related papers (2024-04-02T11:35:05Z)
- Gender Inflected or Bias Inflicted: On Using Grammatical Gender Cues for Bias Evaluation in Machine Translation [0.0]
We use Hindi as the source language and construct two sets of gender-specific sentences to evaluate different Hindi-English (HI-EN) NMT systems.
Our work highlights the importance of considering the nature of language when designing such extrinsic bias evaluation datasets.
arXiv Detail & Related papers (2023-11-07T07:09:59Z)
- The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z)
- VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
- Towards Understanding Gender-Seniority Compound Bias in Natural Language Generation [64.65911758042914]
We investigate how seniority impacts the degree of gender bias exhibited in pretrained neural generation models.
Our results show that GPT-2 amplifies bias by considering women as junior and men as senior more often than the ground truth in both domains.
These results suggest that NLP applications built using GPT-2 may harm women in professional capacities.
arXiv Detail & Related papers (2022-05-19T20:05:02Z)
- A Survey on Gender Bias in Natural Language Processing [22.91475787277623]
We present a survey of 304 papers on gender bias in natural language processing.
We compare and contrast approaches to detecting and mitigating gender bias.
We find that research on gender bias suffers from four core limitations.
arXiv Detail & Related papers (2021-12-28T14:54:18Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.