Semantic Annotation and Querying Framework based on Semi-structured
Ayurvedic Text
- URL: http://arxiv.org/abs/2202.00216v1
- Date: Tue, 1 Feb 2022 04:33:13 GMT
- Title: Semantic Annotation and Querying Framework based on Semi-structured
Ayurvedic Text
- Authors: Hrishikesh Terdalkar, Arnab Bhattacharya, Madhulika Dubey, Ramamurthy
S, Bhavna Naneria Singh
- Abstract summary: We describe our efforts on manual annotation of Sanskrit text for the purpose of knowledge graph (KG) creation.
The constructed knowledge graph contains 410 entities and 764 relationships.
The entire system including the dataset is available from https://sanskrit.iitk.ac.in/ayurveda/.
- Score: 4.154846138501937
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge bases (KB) are an important resource in a number of natural
language processing (NLP) and information retrieval (IR) tasks, such as
semantic search and automated question answering. They are also useful for
researchers trying to gain information from a text. Unfortunately, the
state of the art in Sanskrit NLP does not yet allow automated construction of
knowledge bases, because the required tools and methods are either unavailable
or insufficiently accurate. Thus, in this work, we describe our efforts on manual annotation
of Sanskrit text for the purpose of knowledge graph (KG) creation. We choose
the chapter Dhanyavarga from Bhavaprakashanighantu of the Ayurvedic text
Bhavaprakasha for annotation. The constructed knowledge graph contains 410
entities and 764 relationships. Since Bhavaprakashanighantu is a technical
glossary text that describes various properties of different substances, we
develop an elaborate ontology to capture the semantics of the entity and
relationship types present in the text. To query the knowledge graph, we design
31 query templates that cover most of the common question patterns. For both
manual annotation and querying, we customize the Sangrahaka framework
previously developed by us. The entire system including the dataset is
available from https://sanskrit.iitk.ac.in/ayurveda/ . We hope that the
knowledge graph that we have created through manual annotation and subsequent
curation will help in the development and testing of NLP tools in the future,
as well as in the study of the Bhavaprakashanighantu text.
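The abstract describes a knowledge graph of entities and relationships that is queried through fixed templates. The following is a minimal sketch of that idea in plain Python, not the authors' implementation or the Sangrahaka API. The triples, the relation names (`has_property`, `is_type_of`), and the helper `run_template` are hypothetical placeholders and are not taken from the dataset.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One relationship in the graph: subject --relation--> obj."""
    subject: str
    relation: str
    obj: str


# Hypothetical placeholder triples; these are NOT entries from the real dataset.
GRAPH = [
    Triple("substance_a", "has_property", "property_x"),
    Triple("substance_a", "is_type_of", "category_1"),
    Triple("substance_b", "has_property", "property_x"),
    Triple("substance_b", "has_property", "property_y"),
]


def entities(graph):
    """Collect every distinct entity that appears as a subject or an object."""
    return {t.subject for t in graph} | {t.obj for t in graph}


def run_template(graph, relation, obj):
    """Fill the template 'Which entities are linked by <relation> to <obj>?'."""
    return sorted(
        t.subject for t in graph if t.relation == relation and t.obj == obj
    )


if __name__ == "__main__":
    print(len(entities(GRAPH)), "entities,", len(GRAPH), "relationships")
    print(run_template(GRAPH, "has_property", "property_x"))
```

A template here is just a question pattern with slots. Filling the slots with a relation and a target entity yields one concrete query over the triples.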
Related papers
- Building Tamil Treebanks [0.0]
Treebanks are important linguistic resources, which are structured and annotated corpora with rich linguistic annotations.
This paper discusses the creation of Tamil treebanks using three distinct approaches: manual annotation, computational grammars, and machine learning techniques.
(2024-09-23T01:58:50Z)
- Sanskrit Knowledge-based Systems: Annotation and Computational Tools [0.12086712057375555]
We address the challenges and opportunities in the development of knowledge systems for Sanskrit.
This research contributes to the preservation, understanding, and utilization of the rich linguistic information embodied in Sanskrit texts.
(2024-06-26T12:00:10Z)
- Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering [50.52792174648067]
This initiative seeks to bridge the gap between textual and visual comprehension.
We propose a new multi-task Urdu scene text dataset comprising over 1000 natural scene images.
We provide fine-grained annotations for text instances, addressing the limitations of previous datasets.
(2024-05-21T06:48:26Z)
- Harnessing Explanations: LLM-to-LM Interpreter for Enhanced
Text-Attributed Graph Representation Learning [51.90524745663737]
A key innovation is our use of explanations as features, which can be used to boost GNN performance on downstream tasks.
Our method achieves state-of-the-art results on well-established TAG datasets.
Our method significantly speeds up training, achieving a 2.88 times improvement over the closest baseline on ogbn-arxiv.
(2023-05-31T03:18:03Z)
- SanskritShala: A Neural Sanskrit NLP Toolkit with Web-Based Interface
for Pedagogical and Annotation Purposes [13.585440544031584]
We present a neural Sanskrit Natural Language Processing (NLP) toolkit named SanskritShala.
Our systems report state-of-the-art performance on available benchmark datasets for all tasks.
SanskritShala is deployed as a web-based application, which allows a user to get real-time analysis for the given input.
(2023-02-19T09:58:55Z)
- Deep Bidirectional Language-Knowledge Graph Pretraining [159.9645181522436]
DRAGON is a self-supervised approach to pretraining a deeply joint language-knowledge foundation model from text and KG at scale.
Our model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities.
(2022-10-17T18:02:52Z)
- Joint Language Semantic and Structure Embedding for Knowledge Graph
Completion [66.15933600765835]
We propose to jointly embed the semantics in the natural language description of the knowledge triplets with their structure information.
Our method embeds knowledge graphs for the completion task via fine-tuning pre-trained language models.
Our experiments on a variety of knowledge graph benchmarks have demonstrated the state-of-the-art performance of our method.
(2022-09-19T02:41:02Z)
- FabKG: A Knowledge graph of Manufacturing Science domain utilizing
structured and unconventional unstructured knowledge source [1.2597961235465307]
We develop knowledge graphs based upon entity and relation data for both commercial and educational uses.
We propose a novel crowdsourcing method for KG creation by leveraging student notes.
We have created a knowledge graph containing 65000+ triples using all data sources.
(2022-05-24T02:32:04Z)
- Taxonomy Enrichment with Text and Graph Vector Representations [61.814256012166794]
We address the problem of taxonomy enrichment which aims at adding new words to the existing taxonomy.
We present a new method that allows achieving high results on this task with little effort.
We achieve state-of-the-art results across different datasets and provide an in-depth error analysis of mistakes.
(2022-01-21T09:01:12Z)
- ENT-DESC: Entity Description Generation by Exploring Knowledge Graph [53.03778194567752]
In practice, the input knowledge could be more than enough, since the output description may only cover the most significant knowledge.
We introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text.
We propose a multi-graph structure that is able to represent the original graph information more comprehensively.
(2020-04-30T14:16:19Z)
- Dependently Typed Knowledge Graphs [4.157595789003928]
We show how standardized semantic web technologies (RDF and its query language SPARQL) can be reproduced in a unified manner with dependent type theory.
In addition to providing the basic functionalities of knowledge graphs, dependent types add expressiveness in encoding both entities and queries.
(2020-03-08T14:04:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.