Rethinking Language Models as Symbolic Knowledge Graphs
- URL: http://arxiv.org/abs/2308.13676v1
- Date: Fri, 25 Aug 2023 21:25:08 GMT
- Title: Rethinking Language Models as Symbolic Knowledge Graphs
- Authors: Vishwas Mruthyunjaya, Pouya Pezeshkpour, Estevam Hruschka, Nikita Bhutani
- Abstract summary: Symbolic knowledge graphs (KGs) play a pivotal role in knowledge-centric applications such as search, question answering and recommendation.
We construct nine qualitative benchmarks that encompass a spectrum of attributes including symmetry, asymmetry, hierarchy, bidirectionality, compositionality, paths, entity-centricity, bias and ambiguity.
- Score: 7.192286645674803
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Symbolic knowledge graphs (KGs) play a pivotal role in knowledge-centric
applications such as search, question answering and recommendation. As
contemporary language models (LMs) trained on extensive textual data have
gained prominence, researchers have extensively explored whether the parametric
knowledge within these models can match up to that present in knowledge graphs.
Various methodologies have indicated that enhancing the size of the model or
the volume of training data enhances its capacity to retrieve symbolic
knowledge, often with minimal or no human supervision. Despite these
advancements, there is a void in comprehensively evaluating whether LMs can
encompass the intricate topological and semantic attributes of KGs, attributes
crucial for reasoning processes. In this work, we provide an exhaustive
evaluation of language models of varying sizes and capabilities. We construct
nine qualitative benchmarks that encompass a spectrum of attributes including
symmetry, asymmetry, hierarchy, bidirectionality, compositionality, paths,
entity-centricity, bias and ambiguity. Additionally, we propose novel
evaluation metrics tailored for each of these attributes. Our extensive
evaluation of various LMs shows that while these models exhibit considerable
potential in recalling factual information, their ability to capture intricate
topological and semantic traits of KGs remains significantly constrained. We
note that our proposed evaluation metrics are more reliable in evaluating these
abilities than the existing metrics. Lastly, some of our benchmarks challenge
the common notion that larger LMs (e.g., GPT-4) universally outshine their
smaller counterparts (e.g., BERT).
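To make one of the listed attributes concrete, here is a minimal sketch of a symmetry check over extracted triples. It is not the metric proposed in the paper, whose definitions are not given in this abstract. The function name `symmetry_consistency`, the relation names, and the example triples are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def symmetry_consistency(triples: Iterable[Triple],
                         symmetric_relations: Set[str]) -> float:
    """Fraction of symmetric-relation triples whose reverse is also present.

    Illustrative assumption only: this is one simple way to quantify
    symmetry, not the paper's evaluation metric.
    """
    triple_set = set(triples)
    # Only triples whose relation is declared symmetric are scored.
    relevant = [t for t in triple_set if t[1] in symmetric_relations]
    if not relevant:
        return 1.0  # nothing to violate, so treat as vacuously consistent
    hits = sum((tail, rel, head) in triple_set for (head, rel, tail) in relevant)
    return hits / len(relevant)


# Hypothetical triples, e.g. facts elicited from a language model.
predicted = [
    ("Alice", "spouse_of", "Bob"),
    ("Bob", "spouse_of", "Alice"),
    ("Carol", "sibling_of", "Dave"),    # reverse triple is missing
    ("Paris", "capital_of", "France"),  # not a symmetric relation, ignored
]

score = symmetry_consistency(predicted, {"spouse_of", "sibling_of"})
print(score)
```

In this toy input, two of the three symmetric-relation triples have their reverse present. A model that recalls a fact in one direction but not the other lowers the score, which is the kind of structural gap the abstract describes.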
Related papers
- Why do you cite? An investigation on citation intents and decision-making classification processes [1.7812428873698407]
This study emphasizes the importance of trustfully classifying citation intents.
We present a study utilizing advanced Ensemble Strategies for Citation Intent Classification (CIC).
One of our models sets a new state-of-the-art (SOTA) with an 89.46% Macro-F1 score on the SciCite benchmark.
arXiv Detail & Related papers (2024-07-18T09:29:33Z)
- MechGPT, a language-based strategy for mechanics and materials modeling that connects knowledge across scales, disciplines and modalities [0.0]
We use a Large Language Model (LLM) to distill question-answer pairs from raw sources followed by fine-tuning.
The resulting MechGPT LLM foundation model is used in a series of computational experiments to explore its capacity for knowledge retrieval, various language tasks, hypothesis generation, and connecting knowledge across disparate areas.
arXiv Detail & Related papers (2023-10-16T14:29:35Z)
- Towards Verifiable Generation: A Benchmark for Knowledge-aware Language Model Attribution [48.86322922826514]
This paper defines a new task of Knowledge-aware Language Model Attribution (KaLMA).
First, we extend attribution source from unstructured texts to Knowledge Graph (KG), whose rich structures benefit both the attribution performance and working scenarios.
Second, we propose a new "Conscious Incompetence" setting considering the incomplete knowledge repository.
Third, we propose a comprehensive automatic evaluation metric encompassing text quality, citation quality, and text citation alignment.
arXiv Detail & Related papers (2023-10-09T11:45:59Z)
- Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision [85.6008224440157]
Multi-modality Large Language Models (MLLMs) have catalyzed a shift in computer vision from specialized models to general-purpose foundation models.
We present Q-Bench, a holistic benchmark crafted to evaluate potential abilities of MLLMs on three realms: low-level visual perception, low-level visual description, and overall visual quality assessment.
arXiv Detail & Related papers (2023-09-25T14:43:43Z)
- KoLA: Carefully Benchmarking World Knowledge of Large Language Models [87.96683299084788]
We construct a Knowledge-oriented LLM Assessment benchmark (KoLA).
We mimic human cognition to form a four-level taxonomy of knowledge-related abilities, covering 19 tasks.
We use both Wikipedia, a corpus on which LLMs are commonly pre-trained, and continuously collected emerging corpora to evaluate the capacity to handle unseen data and evolving knowledge.
arXiv Detail & Related papers (2023-06-15T17:20:46Z)
- Sem@$K$: Is my knowledge graph embedding model semantic-aware? [1.8024397171920883]
We extend our previously introduced metric Sem@K that measures the capability of models to predict valid entities w.r.t. domain and range constraints.
Our experiments show that Sem@K provides a new perspective on KGEM quality.
Some KGEMs are inherently better than others, but this semantic superiority is not indicative of their performance w.r.t. rank-based metrics.
arXiv Detail & Related papers (2023-01-13T15:06:47Z)
- BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models [65.51390418485207]
We propose a new approach of harvesting massive KGs of arbitrary relations from pretrained LMs.
With minimal input of a relation definition, the approach efficiently searches in the vast entity pair space to extract diverse accurate knowledge.
We deploy the approach to harvest KGs of over 400 new relations from different LMs.
arXiv Detail & Related papers (2022-06-28T19:46:29Z)
- An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z)
- CogME: A Cognition-Inspired Multi-Dimensional Evaluation Metric for Story Understanding [19.113385429326808]
We introduce CogME, a cognition-inspired, multi-dimensional evaluation metric designed for AI models focusing on story understanding.
We argue the need for metrics based on understanding the nature of tasks and designed to align closely with human cognitive processes.
This approach provides insights beyond traditional overall scores and paves the way for more sophisticated AI development targeting higher cognitive functions.
arXiv Detail & Related papers (2021-07-21T02:33:37Z)
- COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs [82.8453695903687]
We show that manually constructed commonsense knowledge graphs (CSKGs) will never achieve the coverage necessary to be applicable in all situations encountered by NLP agents.
We propose ATOMIC 2020, a new CSKG of general-purpose commonsense knowledge containing knowledge that is not readily available in pretrained language models.
We evaluate its properties in comparison with other leading CSKGs, performing the first large-scale pairwise study of commonsense knowledge resources.
arXiv Detail & Related papers (2020-10-12T18:27:05Z)