raceBERT -- A Transformer-based Model for Predicting Race and Ethnicity
from Names
- URL: http://arxiv.org/abs/2112.03807v3
- Date: Thu, 9 Dec 2021 05:09:26 GMT
- Title: raceBERT -- A Transformer-based Model for Predicting Race and Ethnicity
from Names
- Authors: Prasanna Parasurama
- Abstract summary: raceBERT is a transformer-based model for predicting race and ethnicity from character sequences in names.
It achieves state-of-the-art results in race prediction using names, with an average f1-score of 0.86 -- a 4.1% improvement over the previous state-of-the-art, and improvements between 15-17% for non-white names.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents raceBERT -- a transformer-based model for predicting race
and ethnicity from character sequences in names, and an accompanying python
package. Using a transformer-based model trained on a U.S. Florida voter
registration dataset, the model predicts the likelihood of a name belonging to
5 U.S. census race categories (White, Black, Hispanic, Asian & Pacific
Islander, American Indian & Alaskan Native). I build on Sood and Laohaprapanon
(2018) by replacing their LSTM model with transformer-based models (pre-trained
BERT model, and a roBERTa model trained from scratch), and compare the results.
To the best of my knowledge, raceBERT achieves state-of-the-art results in race
prediction using names, with an average f1-score of 0.86 -- a 4.1% improvement
over the previous state-of-the-art, and improvements between 15-17% for
non-white names.
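A minimal sketch of the kind of model the abstract describes, assuming the Hugging Face transformers library: a pre-trained BERT checkpoint set up for 5-way sequence classification over names treated as character sequences. The checkpoint name, label order, and character-spacing preprocessing are illustrative assumptions, not the paper's exact pipeline or the accompanying python package's API.

```python
# Illustrative sketch only -- not the official raceBERT package or training pipeline.
# Fine-tune a pre-trained BERT for 5-way classification of names as character sequences.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed label order for the 5 U.S. census race categories used in the paper.
LABELS = ["white", "black", "hispanic", "api", "aian"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)

def encode(name: str):
    # Space-separate the characters so the model sees the name as a character sequence.
    chars = " ".join(name.strip().lower())
    return tokenizer(chars, return_tensors="pt", truncation=True, max_length=64)

# Inference after fine-tuning on labeled voter-registration names (training loop omitted).
with torch.no_grad():
    logits = model(**encode("jane doe")).logits
    probs = torch.softmax(logits, dim=-1).squeeze()

for label, p in zip(LABELS, probs.tolist()):
    print(f"{label}: {p:.3f}")
```

With the untrained classification head above, the probabilities are near-uniform; the paper's reported results come from fine-tuning on labeled Florida voter registration records.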
Related papers
- Multicultural Name Recognition For Previously Unseen Names [65.268245109828]
This paper attempts to improve recognition of person names, a diverse category that can grow any time someone is born or changes their name.
I look at names from 103 countries to compare how well the model performs on names from different cultures.
I find that a model with combined character and word input outperforms word-only models and may improve accuracy compared to classical NER models.
arXiv Detail & Related papers (2024-01-23T17:58:38Z) - Can We Trust Race Prediction? [0.0]
I train a Bidirectional Long Short-Term Memory (BiLSTM) model on a novel dataset of voter registration data from all 50 US states.
I construct the most comprehensive database of first and surname distributions in the US.
I provide the first high-quality benchmark dataset in order to fairly compare existing models and aid future model developers.
arXiv Detail & Related papers (2023-07-17T13:59:07Z) - TEDB System Description to a Shared Task on Euphemism Detection 2022 [0.0]
We considered Transformer-based models, which are the current state-of-the-art methods for text classification.
Our best result, an F1-score of 0.816, was obtained with a TimeLMs-pretrained RoBERTa model fine-tuned for euphemism detection and used as a feature extractor.
arXiv Detail & Related papers (2023-01-16T20:37:56Z) - Boosted Dynamic Neural Networks [53.559833501288146]
A typical EDNN has multiple prediction heads at different layers of the network backbone.
To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.
Treating inputs differently at the training and testing phases causes a mismatch between the training and testing data distributions.
We formulate an EDNN as an additive model inspired by gradient boosting, and propose multiple training techniques to optimize the model effectively.
arXiv Detail & Related papers (2022-11-30T04:23:12Z) - Enhancing Self-Consistency and Performance of Pre-Trained Language
Models through Natural Language Inference [72.61732440246954]
Large pre-trained language models often lack logical consistency across test inputs.
We propose a framework, ConCoRD, for boosting the consistency and accuracy of pre-trained NLP models.
We show that ConCoRD consistently boosts accuracy and consistency of off-the-shelf closed-book QA and VQA models.
arXiv Detail & Related papers (2022-11-21T21:58:30Z) - Predicting Issue Types with seBERT [85.74803351913695]
seBERT is a model that was developed based on the BERT architecture, but trained from scratch with software engineering data.
We fine-tuned this model for the NLBSE challenge for the task of issue type prediction.
Our model outperforms the fastText baseline for all three issue types in both recall and precision, achieving an overall F1-score of 85.7%.
arXiv Detail & Related papers (2022-05-03T06:47:13Z) - Rethnicity: Predicting Ethnicity from Names [0.0]
I use the Bidirectional LSTM as the model and Florida Voter Registration as training data.
Special care is given to the accuracy for minority groups by adjusting for the imbalance in the dataset.
arXiv Detail & Related papers (2021-09-19T21:30:22Z) - BERT Fine-Tuning for Sentiment Analysis on Indonesian Mobile Apps
Reviews [1.5749416770494706]
This study examines the effectiveness of fine-tuning BERT for sentiment analysis using two different pre-trained models.
The dataset used consists of Indonesian user reviews of the ten best apps of 2020 on the Google Play site.
Two training-data labeling approaches, score-based and lexicon-based, were also tested to determine the effectiveness of the model.
arXiv Detail & Related papers (2021-07-14T16:00:15Z) - Is BERT a Cross-Disciplinary Knowledge Learner? A Surprising Finding of
Pre-trained Models' Transferability [74.11825654535895]
We investigate whether the power of the models pre-trained on text data, such as BERT, can be transferred to general token sequence classification applications.
We find that even on non-text data, the models pre-trained on text converge faster than randomly initialized models.
arXiv Detail & Related papers (2021-03-12T09:19:14Z) - AvgOut: A Simple Output-Probability Measure to Eliminate Dull Responses [97.50616524350123]
We build dialogue models that are dynamically aware of what utterances or tokens are dull without any feature-engineering.
The first model, MinAvgOut, directly maximizes the diversity score through the output distributions of each batch.
The second model, Label Fine-Tuning (LFT), prepends to the source sequence a label continuously scaled by the diversity score to control the diversity level.
The third model, RL, adopts Reinforcement Learning and treats the diversity score as a reward signal.
arXiv Detail & Related papers (2020-01-15T18:32:06Z) - Predicting Race and Ethnicity From the Sequence of Characters in a Name [0.0]
We model the relationship between characters in a name and race and ethnicity using various techniques.
A model using Long Short-Term Memory works best, with an out-of-sample accuracy of .85 (a minimal sketch of such a character-level model follows this list).
The best-performing last-name model achieves an out-of-sample accuracy of .81.
arXiv Detail & Related papers (2018-05-05T20:04:49Z)
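Several of the entries above (the 2018 character-sequence model that raceBERT builds on, and the BiLSTM models in "Can We Trust Race Prediction?" and "Rethnicity") describe recurrent character-level name classifiers. The sketch below shows one way such a BiLSTM classifier could look in PyTorch; the character vocabulary, layer sizes, and five-class output are illustrative assumptions, not any specific paper's architecture.

```python
# Illustrative character-level BiLSTM name classifier (PyTorch); not taken from any
# of the listed papers. Vocabulary, dimensions, and label count are assumptions.
import torch
import torch.nn as nn

CHARS = "abcdefghijklmnopqrstuvwxyz '-"              # assumed character vocabulary
CHAR2IDX = {c: i + 1 for i, c in enumerate(CHARS)}   # index 0 is reserved for padding
NUM_CLASSES = 5                                      # e.g. the 5 U.S. census race categories

class BiLSTMNameClassifier(nn.Module):
    def __init__(self, vocab_size=len(CHARS) + 1, emb_dim=32, hidden=64, n_classes=NUM_CLASSES):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):
        emb = self.embed(x)                          # (batch, seq_len, emb_dim)
        _, (h_n, _) = self.lstm(emb)                 # final hidden states for both directions
        h = torch.cat([h_n[0], h_n[1]], dim=-1)      # concatenate forward and backward states
        return self.fc(h)                            # unnormalized class logits

def encode(name: str, max_len: int = 30) -> torch.Tensor:
    ids = [CHAR2IDX.get(c, 0) for c in name.lower()][:max_len]  # unknown chars map to padding
    ids += [0] * (max_len - len(ids))                # pad to a fixed length
    return torch.tensor([ids])

model = BiLSTMNameClassifier()
logits = model(encode("jane doe"))                   # training on labeled names is omitted
print(torch.softmax(logits, dim=-1))
```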
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.