Inspecting the concept knowledge graph encoded by modern language models
- URL: http://arxiv.org/abs/2105.13471v1
- Date: Thu, 27 May 2021 22:19:19 GMT
- Title: Inspecting the concept knowledge graph encoded by modern language models
- Authors: Carlos Aspillaga, Marcelo Mendoza, Alvaro Soto
- Abstract summary: We study the underlying knowledge encoded by nine of the most influential language models of the last years.
Our results reveal that all the models encode this knowledge, but suffer from several inaccuracies.
We conduct a systematic evaluation to discover specific factors that explain why some concepts are challenging.
- Score: 5.2117321443066364
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The field of natural language understanding has experienced exponential
progress in the last few years, with impressive results in several tasks. This
success has motivated researchers to study the underlying knowledge encoded by
these models. Despite this, attempts to understand their semantic capabilities
have not been successful, often leading to non-conclusive, or contradictory
conclusions among different works. Via a probing classifier, we extract the
underlying knowledge graph of nine of the most influential language models of
the last years, including word embeddings, text generators, and context
encoders. This probe is based on concept relatedness, grounded on WordNet. Our
results reveal that all the models encode this knowledge, but suffer from
several inaccuracies. Furthermore, we show that the different architectures and
training strategies lead to different model biases. We conduct a systematic
evaluation to discover specific factors that explain why some concepts are
challenging. We hope our insights will motivate the development of models that
capture concepts more precisely.
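As a rough illustration of what a probing classifier over concept pairs can look like, here is a minimal sketch. It is not the authors' probe: the two-dimensional vectors in `EMBEDDINGS`, the labels in `TRAIN_PAIRS`, and all function names are hypothetical stand-ins, and a plain logistic regression replaces whatever classifier the paper actually uses. The idea shown is only the general one: keep the model's representations frozen, train a small classifier to predict relatedness for some concept pairs, then read off a graph from its predictions on all pairs.

```python
import math
from itertools import combinations

# Hypothetical frozen concept vectors standing in for a language model's
# representations (toy values, not taken from any real model).
EMBEDDINGS = {
    "dog": [0.9, 0.1],
    "cat": [0.8, 0.2],
    "wolf": [0.85, 0.05],
    "car": [0.1, 0.9],
    "truck": [0.0, 0.95],
    "bus": [0.1, 0.85],
}

# Hypothetical supervision (1 = related, 0 = unrelated), playing the role
# that WordNet-derived relatedness labels would play in a real probe.
TRAIN_PAIRS = [
    ("dog", "cat", 1), ("car", "truck", 1), ("truck", "bus", 1),
    ("dog", "car", 0), ("cat", "truck", 0), ("wolf", "bus", 0),
]


def pair_features(a, b):
    """Element-wise absolute difference of the two frozen vectors."""
    return [abs(x - y) for x, y in zip(EMBEDDINGS[a], EMBEDDINGS[b])]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relatedness(w, b, a, c):
    """Probe output: estimated probability that concepts a and c are related."""
    x = pair_features(a, c)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_probe(pairs, lr=1.0, epochs=2000):
    """Fit a logistic-regression probe with full-batch gradient descent."""
    dim = len(pair_features(pairs[0][0], pairs[0][1]))
    w, b = [0.0] * dim, 0.0
    n = len(pairs)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for a, c, y in pairs:
            x = pair_features(a, c)
            err = relatedness(w, b, a, c) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def extract_graph(w, b, concepts, threshold=0.5):
    """Return the set of concept pairs the probe predicts as related."""
    return {
        tuple(sorted(pair))
        for pair in combinations(concepts, 2)
        if relatedness(w, b, *pair) > threshold
    }


w, b = train_probe(TRAIN_PAIRS)
graph = extract_graph(w, b, sorted(EMBEDDINGS))
print(sorted(graph))
```

On this toy data the extracted edges connect the animal concepts to each other and the vehicle concepts to each other, including pairs never seen during training. In a real setting, edges that disagree with the reference taxonomy would be the kind of inaccuracy the abstract refers to.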
Related papers
- Compositional Generalization with Grounded Language Models [9.96679221246835]
Grounded language models use external sources of information, such as knowledge graphs, to meet some of the general challenges associated with pre-training.
We develop a procedure for generating natural language questions paired with knowledge graphs that targets different aspects of compositionality.
arXiv Detail & Related papers (2024-06-07T14:56:51Z)
- Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models [51.43538150982291]
We study how to learn human-interpretable concepts from data.
Weaving together ideas from both fields, we show that concepts can be provably recovered from diverse data.
arXiv Detail & Related papers (2024-02-14T15:23:59Z) - CommonsenseVIS: Visualizing and Understanding Commonsense Reasoning
Capabilities of Natural Language Models [30.63276809199399]
We present CommonsenseVIS, a visual explanatory system that utilizes external commonsense knowledge bases to contextualize model behavior for commonsense question-answering.
Our system features multi-level visualization and interactive model probing and editing for different concepts and their underlying relations.
arXiv Detail & Related papers (2023-07-23T17:16:13Z) - Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
arXiv Detail & Related papers (2023-06-04T15:44:51Z) - Discovering Latent Concepts Learned in BERT [21.760620298330235]
We study what latent concepts exist in the pre-trained BERT model.
We also release a novel BERT ConceptNet dataset (BCN) consisting of 174 concept labels and 1M annotated instances.
arXiv Detail & Related papers (2022-05-15T09:45:34Z) - Analyzing the Limits of Self-Supervision in Handling Bias in Language [52.26068057260399]
We evaluate how well language models capture the semantics of four tasks for bias: diagnosis, identification, extraction and rephrasing.
Our analyses indicate that language models are capable of performing these tasks to widely varying degrees across different bias dimensions, such as gender and political affiliation.
arXiv Detail & Related papers (2021-12-16T05:36:08Z) - Interpretable Deep Learning: Interpretations, Interpretability,
Trustworthiness, and Beyond [49.93153180169685]
We introduce and clarify two basic concepts, interpretations and interpretability, that people often confuse.
We elaborate the design of several recent interpretation algorithms, from different perspectives, through proposing a new taxonomy.
We summarize the existing work in evaluating models' interpretability using "trustworthy" interpretation algorithms.
arXiv Detail & Related papers (2021-03-19T08:40:30Z) - Knowledge-driven Data Construction for Zero-shot Evaluation in
Commonsense Question Answering [80.60605604261416]
We propose a novel neuro-symbolic framework for zero-shot question answering across commonsense tasks.
We vary the set of language models, training regimes, knowledge sources, and data generation strategies, and measure their impact across tasks.
We show that, while an individual knowledge graph is better suited for specific tasks, a global knowledge graph brings consistent gains across different tasks.
arXiv Detail & Related papers (2020-11-07T22:52:21Z) - Language Generation with Multi-Hop Reasoning on Commonsense Knowledge
Graph [124.45799297285083]
We argue that exploiting both the structural and semantic information of the knowledge graph facilitates commonsense-aware text generation.
We propose Generation with Multi-Hop Reasoning Flow (GRF) that enables pre-trained models with dynamic multi-hop reasoning on multi-relational paths extracted from the external commonsense knowledge graph.
arXiv Detail & Related papers (2020-09-24T13:55:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.