Related papers: Twitter-Based Gender Recognition Using Transformers

URL: http://arxiv.org/abs/2205.06801v1
Date: Sun, 24 Apr 2022 19:58:42 GMT
Title: Twitter-Based Gender Recognition Using Transformers
Authors: Zahra Movahedi Nia, Ali Ahmadi, Bruce Mellado, Jianhong Wu, James Orbinski, Ali Agary, Jude Dzevela Kong
Abstract summary: We propose a model based on transformers to predict the user's gender from their images and tweets. We fine-tune one model based on Vision Transformers (ViT) to classify the user's images and another based on Bidirectional Encoder Representations from Transformers (BERT) to recognize the user's gender by their tweets. The combination model improves the accuracy of image and text classification models by 6.98% and 4.43%, respectively.
Score: 2.539920413471809
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Social media contains useful information about people and the society that could help advance research in many different areas (e.g. by applying opinion mining, emotion/sentiment analysis, and statistical analysis) such as business and finance, health, socio-economic inequality and gender vulnerability. User demographics provide rich information that could help study the subject further. However, user demographics such as gender are considered private and are not freely available. In this study, we propose a model based on transformers to predict the user's gender from their images and tweets. We fine-tune a model based on Vision Transformers (ViT) to stratify female and male images. Next, we fine-tune another model based on Bidirectional Encoder Representations from Transformers (BERT) to recognize the user's gender by their tweets. This is highly beneficial, because not all users provide an image that indicates their gender. The gender of such users could be detected from their tweets. The combination model improves the accuracy of image and text classification models by 6.98% and 4.43%, respectively. This shows that the image and text classification models are capable of complementing each other by providing additional information to one another. We apply our method to the PAN-2018 dataset, and obtain an accuracy of 85.52%.
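Illustrative sketch: the abstract does not say how the image and text models are combined, so the following is only one plausible reading, not the authors' method. It assumes late fusion by a weighted average of the two models' class probabilities, with a fallback to the text model alone when a user has no usable image. The function names, the label order, and the `image_weight` parameter are all hypothetical.

```python
from typing import List, Optional, Sequence

# Hypothetical label order; the paper's actual encoding is not given in the abstract.
LABELS = ("female", "male")


def fuse_gender_probs(
    text_probs: Sequence[float],
    image_probs: Optional[Sequence[float]] = None,
    image_weight: float = 0.5,
) -> List[float]:
    """Combine per-class probabilities from a text model and an image model.

    Assumption: a simple weighted average (late fusion). If no image
    prediction is available, the text prediction is returned unchanged.
    """
    if image_probs is None:
        return list(text_probs)
    if len(text_probs) != len(image_probs):
        raise ValueError("text and image predictions must cover the same classes")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")

    fused = [
        image_weight * p_img + (1.0 - image_weight) * p_txt
        for p_img, p_txt in zip(image_probs, text_probs)
    ]
    total = sum(fused)
    # Renormalize so the fused scores sum to 1 even if the inputs were slightly off.
    return [p / total for p in fused]


def predict_label(probs: Sequence[float]) -> str:
    """Return the label with the highest probability."""
    best_index = max(range(len(probs)), key=lambda i: probs[i])
    return LABELS[best_index]


if __name__ == "__main__":
    # A user whose tweets lean "male" but whose image leans strongly "female".
    print(predict_label(fuse_gender_probs([0.4, 0.6], [0.9, 0.1])))
    # A user with no usable image: only the text model contributes.
    print(predict_label(fuse_gender_probs([0.4, 0.6])))
```

In practice the weight would be tuned on held-out data, and the inputs would be the softmax outputs of the fine-tuned ViT and BERT classifiers.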

Related papers

A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models [45.55471356313678]
This paper presents the first large-scale study on gender bias in text-to-image (T2I) models. We create a dataset of 3,217 gender-neutral prompts and generate 200 images per prompt from five leading T2I models. We automatically detect the perceived gender of people in the generated images and filter out images with no person or multiple people of different genders.
arXiv Detail & Related papers (2025-03-30T11:11:51Z)
The Male CEO and the Female Assistant: Evaluation and Mitigation of Gender Biases in Text-To-Image Generation of Dual Subjects [58.27353205269664]
We propose the Paired Stereotype Test (PST) framework, which queries T2I models to depict two individuals assigned with male-stereotyped and female-stereotyped social identities. Using PST, we evaluate two aspects of gender biases -- the well-known bias in gendered occupation and a novel aspect: bias in organizational power.
arXiv Detail & Related papers (2024-02-16T21:32:27Z)
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models. We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas. We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
Stereotypes and Smut: The (Mis)representation of Non-cisgender Identities by Text-to-Image Models [6.92043136971035]
We investigate how multimodal models handle diverse gender identities. We find certain non-cisgender identities are consistently (mis)represented as less human, more stereotyped and more sexualised. These improvements could pave the way for a future where change is led by the affected community.
arXiv Detail & Related papers (2023-05-26T16:28:49Z)
Auditing Gender Presentation Differences in Text-to-Image Models [54.16959473093973]
We study how gender is presented differently in text-to-image models. By probing gender indicators in the input text, we quantify the frequency differences of presentation-centric attributes. We propose an automatic method to estimate such differences.
arXiv Detail & Related papers (2023-02-07T18:52:22Z)
Gender Artifacts in Visual Datasets [34.74191865400569]
We investigate what $\textit{gender artifacts}$ exist within large-scale visual datasets. We find that gender artifacts are ubiquitous in the COCO and OpenImages datasets. We claim that attempts to remove gender artifacts from such datasets are largely infeasible.
arXiv Detail & Related papers (2022-06-18T12:09:19Z)
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models [73.12069620086311]
We investigate the visual reasoning capabilities and social biases of text-to-image models. First, we measure three visual reasoning skills: object recognition, object counting, and spatial relation understanding. Second, we assess the gender and skin tone biases by measuring the gender/skin tone distribution of generated images.
arXiv Detail & Related papers (2022-02-08T18:36:52Z)
Gender prediction using limited Twitter Data [0.0]
This paper explores the usability of BERT (a Transformer model for word embedding) for gender prediction on social media. A Dutch BERT model is fine-tuned on different samples of a Dutch Twitter dataset labeled for gender, varying in the number of tweets used per person. Results show that even with relatively small amounts of data, BERT can be fine-tuned to accurately help predict the gender of Twitter users.
arXiv Detail & Related papers (2020-09-29T11:46:07Z)
Mitigating Gender Bias in Captioning Systems [56.25457065032423]
Most captioning models learn gender bias, leading to high gender prediction errors, especially for women. We propose a new Guided Attention Image Captioning model (GAIC) which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence.
arXiv Detail & Related papers (2020-06-15T12:16:19Z)
Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text. We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions. Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
Large-scale Gender/Age Prediction of Tumblr Users [5.063421139422184]
We propose graph-based and deep learning models for age and gender prediction. For graph-based models, we come up with two approaches, network embedding and label propagation, to generate connection features. For deep learning models, we leverage a convolutional neural network (CNN) and a multilayer perceptron (MLP) to predict users' age and gender.
arXiv Detail & Related papers (2020-01-02T19:01:45Z)