Can Demographic Factors Improve Text Classification? Revisiting
Demographic Adaptation in the Age of Transformers
- URL: http://arxiv.org/abs/2210.07362v2
- Date: Tue, 9 May 2023 08:56:48 GMT
- Title: Can Demographic Factors Improve Text Classification? Revisiting
Demographic Adaptation in the Age of Transformers
- Authors: Chia-Chien Hung, Anne Lauscher, Dirk Hovy, Simone Paolo Ponzetto,
Goran Glavaš
- Abstract summary: Previous work showed that incorporating demographic factors can consistently improve performance for various NLP tasks with traditional NLP models.
We use three common specialization methods proven effective for incorporating external knowledge into pretrained Transformers.
We adapt the language representations for the demographic dimensions of gender and age, using continuous language modeling and dynamic multi-task learning.
- Score: 34.768337465321395
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Demographic factors (e.g., gender or age) shape our language. Previous work
showed that incorporating demographic factors can consistently improve
performance for various NLP tasks with traditional NLP models. In this work, we
investigate whether these previous findings still hold with state-of-the-art
pretrained Transformer-based language models (PLMs). We use three common
specialization methods proven effective for incorporating external knowledge
into pretrained Transformers (e.g., domain-specific or geographic knowledge).
We adapt the language representations for the demographic dimensions of gender
and age, using continuous language modeling and dynamic multi-task learning for
adaptation, where we couple language modeling objectives with the prediction of
demographic classes. Our results, when employing a multilingual PLM, show
substantial gains in task performance across four languages (English, German,
French, and Danish), which is consistent with the results of previous work.
However, controlling for confounding factors - primarily domain and language
proficiency of Transformer-based PLMs - shows that downstream performance gains
from our demographic adaptation do not actually stem from demographic
knowledge. Our results indicate that demographic specialization of PLMs, while
holding promise for positive societal impact, still represents an unsolved
problem for (modern) NLP.
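The adaptation recipe in the abstract, coupling a language-modeling objective with the prediction of demographic classes under dynamic multi-task weighting, can be sketched as a joint loss. The inverse-loss weighting scheme below is an illustrative assumption for exposition, not the authors' exact algorithm, and `dynamic_weights` / `joint_loss` are hypothetical helper names.

```python
# Minimal sketch of dynamic multi-task weighting for demographic
# adaptation: a masked-language-modeling (MLM) loss is coupled with a
# demographic-class prediction loss. Weighting each task inversely to
# its recent average loss is an illustrative assumption, not the
# paper's exact scheme.

def dynamic_weights(recent_mlm_losses, recent_cls_losses):
    """Weight each task inversely to its recent average loss,
    normalized so the two weights sum to 1."""
    avg_mlm = sum(recent_mlm_losses) / len(recent_mlm_losses)
    avg_cls = sum(recent_cls_losses) / len(recent_cls_losses)
    inv_mlm, inv_cls = 1.0 / avg_mlm, 1.0 / avg_cls
    total = inv_mlm + inv_cls
    return inv_mlm / total, inv_cls / total

def joint_loss(mlm_loss, cls_loss, w_mlm, w_cls):
    """Couple the language-modeling objective with demographic-class
    prediction via a weighted sum."""
    return w_mlm * mlm_loss + w_cls * cls_loss

# Example: the MLM loss has been running higher than the classifier
# loss, so it receives the smaller weight.
w_mlm, w_cls = dynamic_weights([2.0, 2.0], [0.5, 0.5])
loss = joint_loss(2.0, 0.5, w_mlm, w_cls)
```

In a real setup both losses would come from a shared Transformer encoder with an MLM head and a demographic-classification head; the dynamic weighting only balances how strongly each objective drives the shared parameters.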
Related papers
- Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You [64.74707085021858]
We show that multilingual models suffer from significant gender biases just as monolingual models do.
We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models.
Our results show that not only do models exhibit strong gender biases but they also behave differently across languages.
arXiv Detail & Related papers (2024-01-29T12:02:28Z)
- L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models [102.00201523306986]
We present L2CEval, a systematic evaluation of the language-to-code generation capabilities of large language models (LLMs)
We analyze the factors that potentially affect their performance, such as model size, pretraining data, instruction tuning, and different prompting methods.
In addition to assessing model performance, we measure confidence calibration for the models and conduct human evaluations of the output programs.
arXiv Detail & Related papers (2023-09-29T17:57:00Z)
- Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting [64.80538055623842]
Sociodemographic prompting is a technique that steers the output of prompt-based models towards answers that humans with specific sociodemographic profiles would give.
We show that sociodemographic information affects model predictions and can be beneficial for improving zero-shot learning in subjective NLP tasks.
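The sociodemographic prompting described in this entry can be illustrated with a simple template that prepends a persona prefix to the task prompt; the exact wording below is an assumption, and `sociodemographic_prompt` is a hypothetical helper, not an API from the paper.

```python
# Illustrative sketch of sociodemographic prompting: a persona prefix
# steers a prompt-based model toward answers a person with that
# profile might give. The template wording is an assumption; the
# paper does not prescribe one fixed phrasing.

def sociodemographic_prompt(task_instruction, text, profile):
    """Prepend a sociodemographic persona to a task prompt."""
    persona = ", ".join(f"{k}: {v}" for k, v in sorted(profile.items()))
    return (f"Imagine you are a person with the following profile "
            f"({persona}).\n{task_instruction}\nText: {text}\nAnswer:")

prompt = sociodemographic_prompt(
    "Is the following text offensive? Answer yes or no.",
    "What a ridiculous take.",
    {"age": "45-54", "gender": "female"},
)
```

The resulting string would be sent to the model as-is; varying only the `profile` dictionary lets one measure how predictions shift with the stated sociodemographic attributes.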
arXiv Detail & Related papers (2023-09-13T15:42:06Z)
- Benchmarking Large Language Model Capabilities for Conditional Generation [15.437176676169997]
We discuss how to adapt existing application-specific generation benchmarks to PLMs.
We show that PLMs differ in their applicability to different data regimes and their generalization to multiple languages.
arXiv Detail & Related papers (2023-06-29T08:59:40Z)
- HumBEL: A Human-in-the-Loop Approach for Evaluating Demographic Factors of Language Models in Human-Machine Conversations [26.59671463642373]
We consider how demographic factors in LM language skills can be measured to determine compatibility with a target demographic.
We suggest clinical techniques from Speech Language Pathology, which has norms for acquisition of language skills in humans.
We conduct evaluation with a domain expert (i.e., a clinically licensed speech language pathologist) and also propose automated techniques to complement clinical evaluation at scale.
arXiv Detail & Related papers (2023-05-23T16:15:24Z)
- A Survey of Large Language Models [81.06947636926638]
Language modeling has been widely studied for language understanding and generation in the past two decades.
Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora.
To distinguish models by parameter scale, the research community has coined the term large language models (LLMs) for PLMs of significant size.
arXiv Detail & Related papers (2023-03-31T17:28:46Z)
- On the Limitations of Sociodemographic Adaptation with Transformers [34.768337465321395]
Sociodemographic factors (e.g., gender or age) shape our language.
Previous work showed that incorporating specific sociodemographic factors can consistently improve performance for various NLP tasks.
We use three common specialization methods proven effective for incorporating external knowledge into pretrained Transformers.
arXiv Detail & Related papers (2022-08-01T17:58:02Z)
- Pre-Trained Language Models for Interactive Decision-Making [72.77825666035203]
We describe a framework for imitation learning in which goals and observations are represented as a sequence of embeddings.
We demonstrate that this framework enables effective generalization across different environments.
For test tasks involving novel goals or novel scenes, initializing policies with language models improves task completion rates by 43.6%.
arXiv Detail & Related papers (2022-02-03T18:55:52Z)
- Detecting ESG topics using domain-specific language models and data augmentation approaches [3.3332986505989446]
Natural language processing tasks in the financial domain remain challenging due to the paucity of appropriately labelled data.
Here, we investigate two approaches that may help to mitigate these issues.
Firstly, we experiment with further language model pre-training using large amounts of in-domain data from business and financial news.
We then apply augmentation approaches to increase the size of our dataset for model fine-tuning.
arXiv Detail & Related papers (2020-10-16T11:20:07Z)
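The augmentation step described in the ESG entry above, growing a small labelled dataset with perturbed copies before model fine-tuning, can be sketched with generic random word dropout. This particular perturbation and the helper names are illustrative assumptions, not necessarily the methods used in the paper.

```python
import random

# Minimal sketch of a data-augmentation step for a low-resource
# labelled dataset: random word dropout produces perturbed copies of
# each example while preserving its label. This is one generic
# technique; the paper's exact augmentation methods may differ.

def dropout_augment(text, drop_prob=0.1, rng=None):
    """Return a copy of `text` with each word independently dropped
    with probability `drop_prob`."""
    rng = rng or random.Random(0)
    words = text.split()
    kept = [w for w in words if rng.random() >= drop_prob]
    return " ".join(kept) if kept else text

def augment_dataset(examples, n_copies=2, rng=None):
    """Expand a list of (text, label) pairs with perturbed copies."""
    rng = rng or random.Random(0)
    out = list(examples)
    for text, label in examples:
        for _ in range(n_copies):
            out.append((dropout_augment(text, rng=rng), label))
    return out

data = [("the company reported strong ESG performance", "esg")]
augmented = augment_dataset(data, n_copies=2)
```

The augmented set keeps every original example and its label intact, which is what lets the enlarged dataset be fed straight into standard fine-tuning.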
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.