BanglaAutoKG: Automatic Bangla Knowledge Graph Construction with Semantic Neural Graph Filtering
- URL: http://arxiv.org/abs/2404.03528v3
- Date: Wed, 5 Jun 2024 13:39:56 GMT
- Title: BanglaAutoKG: Automatic Bangla Knowledge Graph Construction with Semantic Neural Graph Filtering
- Authors: Azmine Toushik Wasi, Taki Hasan Rafi, Raima Islam, Dong-Kyu Chae,
- Abstract summary: Knowledge Graphs (KGs) have proven essential in information processing and reasoning applications.
Despite being widely used globally, Bangla is relatively underrepresented in KGs due to a lack of comprehensive datasets.
We propose BanglaAutoKG, a pioneering framework that is able to automatically construct Bengali KGs from any Bangla text.
- Score: 6.05977559550463
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge Graphs (KGs) have proven essential in information processing and reasoning applications because they link related entities and give context-rich information, supporting efficient information retrieval and knowledge discovery; presenting information flow in a very effective manner. Despite being widely used globally, Bangla is relatively underrepresented in KGs due to a lack of comprehensive datasets, encoders, NER (named entity recognition) models, POS (part-of-speech) taggers, and lemmatizers, hindering efficient information processing and reasoning applications in the language. Addressing the KG scarcity in Bengali, we propose BanglaAutoKG, a pioneering framework that is able to automatically construct Bengali KGs from any Bangla text. We utilize multilingual LLMs to understand various languages and correlate entities and relations universally. By employing a translation dictionary to identify English equivalents and extracting word features from pre-trained BERT models, we construct the foundational KG. To reduce noise and align word embeddings with our goal, we employ graph-based polynomial filters. Lastly, we implement a GNN-based semantic filter, which elevates contextual understanding and trims unnecessary edges, culminating in the formation of the definitive KG. Empirical findings and case studies demonstrate the universal effectiveness of our model, capable of autonomously constructing semantically enriched KGs from any text.
Related papers
- Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering [87.67177556994525]
We propose a training-free method called Generate-on-Graph (GoG) to generate new factual triples while exploring Knowledge Graphs (KGs)
GoG performs reasoning through a Thinking-Searching-Generating framework, which treats LLM as both Agent and KG in IKGQA.
arXiv Detail & Related papers (2024-04-23T04:47:22Z) - FedMKGC: Privacy-Preserving Federated Multilingual Knowledge Graph
Completion [21.4302940596294]
Knowledge graph completion (KGC) aims to predict missing facts in knowledge graphs (KGs)
Previous methods that rely on transferring raw data among KGs raise privacy concerns.
We propose a new federated learning framework that implicitly aggregates knowledge from multiple KGs without demanding raw data exchange and entity alignment.
arXiv Detail & Related papers (2023-12-17T08:09:27Z) - KG-GPT: A General Framework for Reasoning on Knowledge Graphs Using
Large Language Models [18.20425100517317]
We propose KG-GPT, a framework leveraging large language models for tasks employing knowledge graphs.
KG-GPT comprises three steps: Sentence, Graph Retrieval, and Inference, each aimed at partitioning sentences, retrieving relevant graph components, and deriving logical conclusions.
We evaluate KG-GPT using KG-based fact verification and KGQA benchmarks, with the model showing competitive and robust performance, even outperforming several fully-supervised models.
arXiv Detail & Related papers (2023-10-17T12:51:35Z) - Knowledge Graph-Augmented Language Models for Knowledge-Grounded
Dialogue Generation [58.65698688443091]
We propose SUbgraph Retrieval-augmented GEneration (SURGE), a framework for generating context-relevant and knowledge-grounded dialogues with Knowledge Graphs (KGs)
Our framework first retrieves the relevant subgraph from the KG, and then enforces consistency across facts by perturbing their word embeddings conditioned by the retrieved subgraph.
We validate our SURGE framework on OpendialKG and KOMODIS datasets, showing that it generates high-quality dialogues that faithfully reflect the knowledge from KG.
arXiv Detail & Related papers (2023-05-30T08:36:45Z) - Deep Bidirectional Language-Knowledge Graph Pretraining [159.9645181522436]
DRAGON is a self-supervised approach to pretraining a deeply joint language-knowledge foundation model from text and KG at scale.
Our model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities.
arXiv Detail & Related papers (2022-10-17T18:02:52Z) - BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from
Pretrained Language Models [65.51390418485207]
We propose a new approach of harvesting massive KGs of arbitrary relations from pretrained LMs.
With minimal input of a relation definition, the approach efficiently searches in the vast entity pair space to extract diverse accurate knowledge.
We deploy the approach to harvest KGs of over 400 new relations from different LMs.
arXiv Detail & Related papers (2022-06-28T19:46:29Z) - Endowing Language Models with Multimodal Knowledge Graph Representations [47.22480859519051]
We use the recently released VisualSem KG as our external knowledge repository.
We retrieve entities from the KG and use their multimodal representations to improve downstream task performance.
arXiv Detail & Related papers (2022-06-27T10:10:42Z) - KELM: Knowledge Enhanced Pre-Trained Language Representations with
Message Passing on Hierarchical Relational Graphs [26.557447199727758]
We propose a novel knowledge-aware language model framework based on fine-tuning process.
Our model can efficiently incorporate world knowledge from KGs into existing language models such as BERT.
arXiv Detail & Related papers (2021-09-09T12:39:17Z) - Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced
Language Model Pre-training [22.534866015730664]
We verbalize the entire English Wikidata KG.
We show that verbalizing a comprehensive, encyclopedic KG like Wikidata can be used to integrate structured KGs and natural language corpora.
arXiv Detail & Related papers (2020-10-23T22:14:50Z) - JAKET: Joint Pre-training of Knowledge Graph and Language Understanding [73.43768772121985]
We propose a novel joint pre-training framework, JAKET, to model both the knowledge graph and language.
The knowledge module and language module provide essential information to mutually assist each other.
Our design enables the pre-trained model to easily adapt to unseen knowledge graphs in new domains.
arXiv Detail & Related papers (2020-10-02T05:53:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.