An Empirical Study of Pre-trained Language Models in Simple Knowledge
Graph Question Answering
- URL: http://arxiv.org/abs/2303.10368v1
- Date: Sat, 18 Mar 2023 08:57:09 GMT
- Title: An Empirical Study of Pre-trained Language Models in Simple Knowledge
Graph Question Answering
- Authors: Nan Hu, Yike Wu, Guilin Qi, Dehai Min, Jiaoyan Chen, Jeff Z. Pan and
Zafar Ali
- Abstract summary: Large-scale pre-trained language models (PLMs) have recently achieved great success and become a milestone in natural language processing (NLP).
In recent works on knowledge graph question answering (KGQA), BERT or its variants have become necessary in their KGQA models.
We compare the performance of different PLMs in KGQA and present three benchmarks for larger-scale KGs.
- Score: 28.31377197194905
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale pre-trained language models (PLMs) such as BERT have recently
achieved great success and become a milestone in natural language processing
(NLP). It is now the consensus of the NLP community to adopt PLMs as the
backbone for downstream tasks. In recent works on knowledge graph question
answering (KGQA), BERT or its variants have become necessary in their KGQA
models. However, there is still a lack of comprehensive research and comparison
of the performance of different PLMs in KGQA. To this end, we summarize two
basic KGQA frameworks based on PLMs without additional neural network modules
to compare the performance of nine PLMs in terms of accuracy and efficiency. In
addition, we present three benchmarks for larger-scale KGs based on the popular
SimpleQuestions benchmark to investigate the scalability of PLMs. We carefully
analyze the results of all PLM-based KGQA basic frameworks on these benchmarks
and two other popular datasets, WebQuestionsSP and FreebaseQA, and find that
knowledge distillation techniques and knowledge enhancement methods in PLMs are
promising for KGQA. Furthermore, we test ChatGPT, which has drawn a great deal
of attention in the NLP community, demonstrating its impressive capabilities
and limitations in zero-shot KGQA. We have released the code and benchmarks to
promote the use of PLMs on KGQA.
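The "basic KGQA frameworks without additional neural network modules" in the abstract follow the classic SimpleQuestions-style pipeline: link the question to a head entity, score candidate relations against the question, and read the answer off the matching KG triple. The sketch below is an illustration of that pipeline only, not the paper's released code; the toy KG, the function names, and the token-overlap scorer (which stands in for a fine-tuned PLM encoder such as BERT) are all assumptions made to keep the example self-contained.

```python
# Illustrative sketch of a SimpleQuestions-style KGQA pipeline.
# A real system would replace `score` with a fine-tuned PLM encoder;
# token overlap stands in here so the example runs without any model.

# Toy knowledge graph: (head entity, relation, tail entity) triples.
KG = [
    ("albert_einstein", "place_of_birth", "ulm"),
    ("albert_einstein", "field_of_work", "physics"),
    ("marie_curie", "place_of_birth", "warsaw"),
]

def tokens(text):
    """Lowercase, split identifiers like place_of_birth into word sets."""
    return set(text.lower().replace("_", " ").split())

def score(question, relation):
    """Stand-in for a PLM relation scorer: token overlap with the relation name."""
    q, r = tokens(question), tokens(relation)
    return len(q & r) / len(r)

def answer(question, head_entity):
    """Pick the best-scoring relation for the linked head entity; return the tail."""
    candidates = [(rel, tail) for h, rel, tail in KG if h == head_entity]
    _, best_tail = max(candidates, key=lambda rt: score(question, rt[0]))
    return best_tail

print(answer("what is the place of birth of albert einstein", "albert_einstein"))  # -> ulm
```

Entity linking is assumed to have already produced `head_entity`; in practice it is a separate mention-detection and disambiguation step, and the relation scorer is where the choice of PLM makes the difference the paper measures.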
Related papers
- Benchmarking Post-Training Quantization in LLMs: Comprehensive Taxonomy, Unified Evaluation, and Comparative Analysis [89.60263788590893]
Post-training quantization (PTQ) has been extensively adopted for compressing large language models (LLMs).
Existing algorithms focus primarily on performance, overlooking the trade-off among model size, performance, and quantization bitwidth.
arXiv Detail & Related papers (2025-02-18T07:35:35Z)
- KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions [19.246385485678104]
Large language models (LLMs) are susceptible to being misled by false premise questions (FPQs).
We introduce an automated, scalable pipeline to create FPQs based on knowledge graphs (KGs).
We present a benchmark, the Knowledge Graph-based False Premise Questions (KG-FPQ), which contains approximately 178k FPQs across three knowledge domains, at six levels of confusability, and in two task formats.
arXiv Detail & Related papers (2024-07-08T12:31:03Z)
- Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph [83.90988015005934]
Uncertainty quantification is a key element of machine learning applications.
We introduce a novel benchmark that implements a collection of state-of-the-art UQ baselines.
We conduct a large-scale empirical investigation of UQ and normalization techniques across eleven tasks, identifying the most effective approaches.
arXiv Detail & Related papers (2024-06-21T20:06:31Z)
- Leveraging Large Language Models for Concept Graph Recovery and Question Answering in NLP Education [14.908333207564574]
Large Language Models (LLMs) have demonstrated promise in text-generation tasks.
This study focuses on concept graph recovery and question answering (QA).
In TutorQA tasks, LLMs achieve an improvement of up to 26% in F1 score.
arXiv Detail & Related papers (2024-02-22T05:15:27Z)
- ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained Language Models for Question Answering over Knowledge Graph [142.42275983201978]
We propose a subgraph-aware self-attention mechanism to imitate the GNN for performing structured reasoning.
We also adopt an adaptation tuning strategy to adapt the model parameters using 20,000 subgraphs with synthesized questions.
Experiments show that ReasoningLM surpasses state-of-the-art models by a large margin, even with fewer updated parameters and less training data.
arXiv Detail & Related papers (2023-12-30T07:18:54Z)
- How to Prune Your Language Model: Recovering Accuracy on the "Sparsity May Cry" Benchmark [60.72725673114168]
We revisit the question of accurate BERT-pruning during fine-tuning on downstream datasets.
We propose a set of general guidelines for successful pruning, even on the challenging SMC benchmark.
arXiv Detail & Related papers (2023-12-21T03:11:30Z)
- Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models [76.52997424694767]
Keyphrase Generation (KPG) is a longstanding task in NLP with widespread applications.
Seq2seq pre-trained language models (PLMs) have ushered in a transformative era for KPG, yielding promising performance improvements.
This paper undertakes a systematic analysis of the influence of model selection and decoding strategies on PLM-based KPG.
arXiv Detail & Related papers (2023-10-10T07:34:45Z)
- FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering [16.88132219032486]
We introduce FlexKBQA to mitigate the burden associated with manual annotation.
We leverage Large Language Models (LLMs) as program translators for addressing the challenges inherent in the few-shot KBQA task.
Specifically, FlexKBQA leverages automated algorithms to sample diverse programs, such as SPARQL queries, from the knowledge base.
We observe that under few-shot and even the more challenging zero-shot scenarios, FlexKBQA achieves impressive results with only a few annotations.
arXiv Detail & Related papers (2023-08-23T11:00:36Z)
- Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study [76.52997424694767]
We present an in-depth empirical study of keyphrase extraction and keyphrase generation using pre-trained language models.
We show that PLMs have competitive high-resource performance and state-of-the-art low-resource performance.
Further results show that in-domain BERT-like PLMs can be used to build strong and data-efficient keyphrase generation models.
arXiv Detail & Related papers (2022-12-20T13:20:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.