SocioProbe: What, When, and Where Language Models Learn about
Sociodemographics
- URL: http://arxiv.org/abs/2211.04281v1
- Date: Tue, 8 Nov 2022 14:37:45 GMT
- Title: SocioProbe: What, When, and Where Language Models Learn about
Sociodemographics
- Authors: Anne Lauscher, Federico Bianchi, Samuel Bowman, and Dirk Hovy
- Abstract summary: We investigate the sociodemographic knowledge of pre-trained language models (PLMs) on multiple English data sets.
Our results show that PLMs do encode these sociodemographics, and that this knowledge is sometimes spread across the layers of some of the tested PLMs.
Our overall results indicate that sociodemographic knowledge is still a major challenge for NLP.
- Score: 31.040600510190732
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models (PLMs) have outperformed other NLP models on a
wide range of tasks. Aiming for a more thorough understanding of their
capabilities and inner workings, researchers have established the extent to
which they capture lower-level knowledge like grammaticality, and mid-level
semantic knowledge like factual understanding. However, there is still little
understanding of their knowledge of higher-level aspects of language. In
particular, despite the importance of sociodemographic aspects in shaping our
language, the questions of whether, where, and how PLMs encode these aspects,
e.g., gender or age, remain largely unexplored. We address this research gap by
probing the sociodemographic knowledge of different single-GPU PLMs on multiple
English data sets via traditional classifier probing and information-theoretic
minimum description length probing. Our results show that PLMs do encode these
sociodemographics, and that this knowledge is sometimes spread across the
layers of some of the tested PLMs. We further conduct a multilingual analysis
and investigate the effect of supplementary training to further explore to what
extent, where, and with what amount of pre-training data the knowledge is
encoded. Our overall results indicate that sociodemographic knowledge is still
a major challenge for NLP. PLMs require large amounts of pre-training data to
acquire this knowledge, and models that excel in general language understanding
do not seem to possess more knowledge about these aspects.
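To make the probing setup in the abstract more concrete, here is a minimal sketch of layer-wise classifier probing over frozen PLM representations, in the spirit of the traditional classifier probing the paper describes (the paper additionally uses information-theoretic minimum description length probing, not shown here). The model name, placeholder texts, labels, and split are illustrative assumptions, not the paper's actual models or data sets.

```python
# Minimal sketch of layer-wise classifier probing for a sociodemographic
# attribute. Model choice, texts, and labels are illustrative placeholders,
# not the paper's actual experimental setup.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

texts = ["example post by author A", "example post by author B"] * 50  # placeholder corpus
labels = [0, 1] * 50                                                   # placeholder attribute labels

tok = AutoTokenizer.from_pretrained("roberta-base")  # one PLM a probe might target
model = AutoModel.from_pretrained("roberta-base", output_hidden_states=True)
model.eval()

# Encode the corpus once and keep the first-token (<s>/[CLS]) vector from every layer.
with torch.no_grad():
    enc = tok(texts, padding=True, truncation=True, return_tensors="pt")
    hidden_states = model(**enc).hidden_states  # embedding layer + one tensor per transformer layer
    layer_features = [h[:, 0, :].numpy() for h in hidden_states]

# Train a separate linear probe on each layer; higher held-out accuracy suggests the
# attribute is more linearly recoverable from that layer's representations.
for layer_idx, feats in enumerate(layer_features):
    X_tr, X_te, y_tr, y_te = train_test_split(
        feats, labels, test_size=0.3, random_state=0, stratify=labels
    )
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print(f"layer {layer_idx}: probe accuracy = {probe.score(X_te, y_te):.2f}")
```

An MDL-style probe would replace the single train/test accuracy with the online codelength of the probe's predictions, but the per-layer loop structure stays the same.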
Related papers
- FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks.
We present FAC$^2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z) - History, Development, and Principles of Large Language Models-An Introductory Survey [15.875687167037206]
Language models serve as a cornerstone in natural language processing (NLP).
Through decades of extensive research, language modeling has progressed from initial statistical language models (SLMs) to the contemporary landscape of large language models (LLMs).
arXiv Detail & Related papers (2024-02-10T01:18:15Z) - Spoken Language Intelligence of Large Language Models for Language
Learning [3.5924382852350902]
We focus on evaluating the efficacy of large language models (LLMs) in the realm of education.
We introduce a new multiple-choice question dataset to evaluate the effectiveness of LLMs in the aforementioned scenarios.
We also investigate the influence of various prompting techniques, such as zero- and few-shot methods.
We find that models of different sizes have a good understanding of concepts in phonetics, phonology, and second language acquisition, but show limitations in reasoning about real-world problems.
arXiv Detail & Related papers (2023-08-28T12:47:41Z) - Do Large Language Models Know What They Don't Know? [74.65014158544011]
Large language models (LLMs) have a wealth of knowledge that allows them to excel in various Natural Language Processing (NLP) tasks.
Despite their vast knowledge, LLMs are still limited by the amount of information they can accommodate and comprehend.
This study aims to evaluate LLMs' self-knowledge by assessing their ability to identify unanswerable or unknowable questions.
arXiv Detail & Related papers (2023-05-29T15:30:13Z) - Knowledge Rumination for Pre-trained Language Models [77.55888291165462]
We propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from the external corpus.
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z) - A Survey of Knowledge Enhanced Pre-trained Language Models [78.56931125512295]
We present a comprehensive review of Knowledge Enhanced Pre-trained Language Models (KE-PLMs).
For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG), and rule knowledge.
The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods.
arXiv Detail & Related papers (2022-11-11T04:29:02Z) - Adapters for Enhanced Modeling of Multilingual Knowledge and Text [54.02078328453149]
Language models have been extended to multilingual language models (MLLMs).
Knowledge graphs contain facts in an explicit triple format, which require careful curation and are only available in a few high-resource languages.
We propose to enhance MLLMs with knowledge from multilingual knowledge graphs (MLKGs) so as to tackle language and knowledge graph tasks across many languages.
arXiv Detail & Related papers (2022-10-24T21:33:42Z) - Knowledge Enhanced Pretrained Language Models: A Comprehensive Survey [8.427521246916463]
Pretrained Language Models (PLMs) have established a new paradigm through learning informative representations on large-scale text corpora.
This new paradigm has revolutionized the entire field of natural language processing and set new state-of-the-art performance for a wide variety of NLP tasks.
To address the remaining gaps in such models' knowledge, integrating knowledge into PLMs has recently become a very active research area, and a variety of approaches have been developed.
arXiv Detail & Related papers (2021-10-16T03:27:56Z) - CoLAKE: Contextualized Language and Knowledge Embedding [81.90416952762803]
We propose the Contextualized Language and Knowledge Embedding (CoLAKE).
CoLAKE jointly learns contextualized representations for both language and knowledge with an extended masked language modeling objective.
We conduct experiments on knowledge-driven tasks, knowledge probing tasks, and language understanding tasks.
arXiv Detail & Related papers (2020-10-01T11:39:32Z)