Related papers: A Document-based Knowledge Discovery with Microservices Architecture

A Document-based Knowledge Discovery with Microservices Architecture

URL: http://arxiv.org/abs/2407.00053v1
Date: Thu, 13 Jun 2024 09:28:31 GMT
Title: A Document-based Knowledge Discovery with Microservices Architecture
Authors: Habtom Kahsay Gidey, Mario Kesseler, Patrick Stangl, Peter Hillmann, Andreas Karcher,
Abstract summary: We point out the key challenges in the context of knowledge discovery and present an approach to addressing these using a database architecture. Our solution led to a conceptual design focusing on keyword extraction, calculation of documents, similarity in natural language, and programming language independent provision of the extracted information.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The first step towards digitalization within organizations lies in digitization - the conversion of analog data into digitally stored data. This basic step is the prerequisite for all following activities like the digitalization of processes or the servitization of products or offerings. However, digitization itself often leads to 'data-rich' but 'knowledge-poor' material. Knowledge discovery and knowledge extraction as approaches try to increase the usefulness of digitized data. In this paper, we point out the key challenges in the context of knowledge discovery and present an approach to addressing these using a microservices architecture. Our solution led to a conceptual design focusing on keyword extraction, similarity calculation of documents, database queries in natural language, and programming language independent provision of the extracted information. In addition, the conceptual design provides referential design guidelines for integrating processes and applications for semi-automatic learning, editing, and visualization of ontologies. The concept also uses a microservices architecture to address non-functional requirements, such as scalability and resilience. The evaluation of the specified requirements is performed using a demonstrator that implements the concept. Furthermore, this modern approach is used in the German patent office in an extended version.

Related papers

Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web Agents [49.88380945341337]
We decompose a web agent's capabilities into two essential stages: knowledge content learning and cognitive processes.<n>To facilitate knowledge acquisition, we construct the Web-CogDataset, a structured resource curated from 14 real-world websites.<n>Building on this foundation, we operationalize these processes through a novel knowledge-driven Chain-of-Thought (CoT) reasoning framework.
arXiv Detail & Related papers (2025-08-03T17:17:52Z)
Semi-Automated Design of Data-Intensive Architectures [49.1574468325115]
This paper introduces a development methodology for data-intensive architectures. It guides architects in (i) designing a suitable architecture for their specific application scenario, and (ii) selecting an appropriate set of concrete systems to implement the application. We show that the description languages we adopt can capture the key aspects of data-intensive architectures proposed by researchers and practitioners.
arXiv Detail & Related papers (2025-03-21T16:01:11Z)
Customized Information and Domain-centric Knowledge Graph Construction with Large Language Models [0.0]
We propose a novel approach based on knowledge graphs to provide timely access to structured information. Our framework encompasses a text mining process, which includes information retrieval, keyphrase extraction, semantic network creation, and topic map visualization. We apply our methodology to the domain of automotive electrical systems to demonstrate the approach, which is scalable.
arXiv Detail & Related papers (2024-09-30T07:08:28Z)
Procedure Model for Building Knowledge Graphs for Industry Applications [0.0]
The graph-based integration of previously unconnected information with domain knowledge provides new insights. This paper presents a practical step-by-step procedure model for building an RDF knowledge graph.
arXiv Detail & Related papers (2024-09-20T11:46:37Z)
On the Element-Wise Representation and Reasoning in Zero-Shot Image Recognition: A Systematic Survey [82.49623756124357]
Zero-shot image recognition (ZSIR) aims to recognize and reason in unseen domains by learning generalized knowledge from limited data. This paper thoroughly investigates recent advances in element-wise ZSIR and provides a basis for its future development.
arXiv Detail & Related papers (2024-08-09T05:49:21Z)
Information Extraction from Unstructured data using Augmented-AI and Computer Vision [0.0]
This paper presents a framework for information extraction that combines Augmented Intelligence (A2I) with computer vision and natural language processing techniques.<n>Our approach addresses the limitations of conventional methods by leveraging deep learning architectures for object detection.<n>The proposed methodology demonstrates improved accuracy and efficiency in extracting structured information from diverse document formats.
arXiv Detail & Related papers (2023-12-15T15:27:41Z)
Data Efficient Training of a U-Net Based Architecture for Structured Documents Localization [0.0]
We propose SDL-Net: a novel U-Net like encoder-decoder architecture for the localization of structured documents. Our approach allows pre-training the encoder of SDL-Net on a generic dataset containing samples of various document classes.
arXiv Detail & Related papers (2023-10-02T07:05:19Z)
Model-Driven Engineering Method to Support the Formalization of Machine Learning using SysML [0.0]
This work introduces a method supporting the collaborative definition of machine learning tasks by leveraging model-based engineering. The method supports the identification and integration of various data sources, the required definition of semantic connections between data attributes, and the definition of data processing steps.
arXiv Detail & Related papers (2023-07-10T11:33:46Z)
Enterprise Model Library for Business-IT-Alignment [0.0]
This work is the reference architecture of a repository for models with function of reuse. It includes the design of the data structure for filing, the processes for administration and possibilities for usage.
arXiv Detail & Related papers (2022-11-21T11:36:54Z)
What and How of Machine Learning Transparency: Building Bespoke Explainability Tools with Interoperable Algorithmic Components [77.87794937143511]
This paper introduces a collection of hands-on training materials for explaining data-driven predictive models. These resources cover the three core building blocks of this technique: interpretable representation composition, data sampling and explanation generation.
arXiv Detail & Related papers (2022-09-08T13:33:25Z)
Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration [75.47708732473586]
We propose a layout-aware document-level Information Extraction dataset, LIE, to facilitate the study of extracting both structural and semantic knowledge from visually rich documents. LIE contains 62k annotations of three extraction tasks from 4,061 pages in product and official documents. Empirical results show that layout is critical for VRD-based extraction, and system demonstration also verifies that the extracted knowledge can help locate the answers that users care about.
arXiv Detail & Related papers (2022-07-14T07:59:45Z)
Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information theoretic framework for learning-motivated methods aimed at odometry estimation. The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z)
SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines. This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
CateCom: a practical data-centric approach to categorization of computational models [77.34726150561087]
We present an effort aimed at organizing the landscape of physics-based and data-driven computational models. We apply object-oriented design concepts and outline the foundations of an open-source collaborative framework.
arXiv Detail & Related papers (2021-09-28T02:59:40Z)
A Privacy-Preserving Distributed Architecture for Deep-Learning-as-a-Service [68.84245063902908]
This paper introduces a novel distributed architecture for deep-learning-as-a-service. It is able to preserve the user sensitive data while providing Cloud-based machine and deep learning services.
arXiv Detail & Related papers (2020-03-30T15:12:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.