Related papers: Data Mesh: a Systematic Gray Literature Review

Data Mesh: a Systematic Gray Literature Review

URL: http://arxiv.org/abs/2304.01062v3
Date: Wed, 7 Aug 2024 14:20:27 GMT
Title: Data Mesh: a Systematic Gray Literature Review
Authors: Abel Goedegebuure, Indika Kumara, Stefan Driessen, Dario Di Nucci, Geert Monsieur, Willem-jan van den Heuvel, Damian Andrew Tamburri,
Abstract summary: Data mesh is an emerging domain-driven decentralized data architecture that aims to minimize or avoid operational bottlenecks. We systematically collected, analyzed, and synthesized 114 industrial gray literature articles. The review provides insights into practitioners' perspectives on the four key principles of data mesh.
Score: 3.038477115588261
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Data mesh is an emerging domain-driven decentralized data architecture that aims to minimize or avoid operational bottlenecks associated with centralized, monolithic data architectures in enterprises. The topic has picked the practitioners' interest, and there is considerable gray literature on it. At the same time, we observe a lack of academic attempts at defining and building upon the concept. Hence, in this article, we aim to start from the foundations and characterize the data mesh architecture regarding its design principles, architectural components, capabilities, and organizational roles. We systematically collected, analyzed, and synthesized 114 industrial gray literature articles. The review provides insights into practitioners' perspectives on the four key principles of data mesh: data as a product, domain ownership of data, self-serve data platform, and federated computational governance. Moreover, due to the comparability of data mesh and SOA (service-oriented architecture), we mapped the findings from the gray literature into the reference architectures from the SOA academic literature to create the reference architectures for describing three key dimensions of data mesh: organization of capabilities and roles, development, and runtime. Finally, we discuss open research issues in data mesh, partially based on the findings from the gray literature.

Related papers

Semi-Automated Design of Data-Intensive Architectures [49.1574468325115]
This paper introduces a development methodology for data-intensive architectures. It guides architects in (i) designing a suitable architecture for their specific application scenario, and (ii) selecting an appropriate set of concrete systems to implement the application. We show that the description languages we adopt can capture the key aspects of data-intensive architectures proposed by researchers and practitioners.
arXiv Detail & Related papers (2025-03-21T16:01:11Z)
A Survey of Model Architectures in Information Retrieval [64.75808744228067]
We focus on two key aspects: backbone models for feature extraction and end-to-end system architectures for relevance estimation. We trace the development from traditional term-based methods to modern neural approaches, particularly highlighting the impact of transformer-based models and subsequent large language models (LLMs) We conclude by discussing emerging challenges and future directions, including architectural optimizations for performance and scalability, handling of multimodal, multilingual data, and adaptation to novel application domains beyond traditional search paradigms.
arXiv Detail & Related papers (2025-02-20T18:42:58Z)
BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data [61.936320820180875]
Large language models (LLMs) have become increasingly pivotal across various domains. BabelBench is an innovative benchmark framework that evaluates the proficiency of LLMs in managing multimodal multistructured data with code execution. Our experimental findings on BabelBench indicate that even cutting-edge models like ChatGPT 4 exhibit substantial room for improvement.
arXiv Detail & Related papers (2024-10-01T15:11:24Z)
Redefining Data-Centric Design: A New Approach with a Domain Model and Core Data Ontology for Computational Systems [2.872069347343959]
This paper presents an innovative data-centric paradigm for designing computational systems by introducing a new informatics domain model. The proposed model moves away from the conventional node-centric framework and focuses on data-centric categorization, using a multimodal approach that incorporates objects, events, concepts, and actions.
arXiv Detail & Related papers (2024-09-01T22:34:12Z)
Empowering Data Mesh with Federated Learning [5.087058648342379]
New paradigm, Data Mesh, treats domains as a first-class concern by distributing the data ownership from the central team to each data domain. Many multi-million dollar organizations like Paypal, Netflix, and Zalando have already transformed their data analysis pipelines based on this new architecture. We introduce a pioneering approach that incorporates Federated Learning into Data Mesh.
arXiv Detail & Related papers (2024-03-26T17:10:15Z)
Unsupervised Graph Neural Architecture Search with Disentangled Self-supervision [51.88848982611515]
Unsupervised graph neural architecture search remains unexplored in the literature. We propose a novel Disentangled Self-supervised Graph Neural Architecture Search model. Our model is able to achieve state-of-the-art performance against several baseline methods in an unsupervised manner.
arXiv Detail & Related papers (2024-03-08T05:23:55Z)
Incremental hierarchical text clustering methods: a review [49.32130498861987]
This study aims to analyze various hierarchical and incremental clustering techniques. The main contribution of this research is the organization and comparison of the techniques used by studies published between 2010 and 2018 that aimed to texts documents clustering.
arXiv Detail & Related papers (2023-12-12T22:27:29Z)
Serving Deep Learning Model in Relational Databases [70.53282490832189]
Serving deep learning (DL) models on relational data has become a critical requirement across diverse commercial and scientific domains. We highlight three pivotal paradigms: The state-of-the-art DL-centric architecture offloads DL computations to dedicated DL frameworks. The potential UDF-centric architecture encapsulates one or more tensor computations into User Defined Functions (UDFs) within the relational database management system (RDBMS)
arXiv Detail & Related papers (2023-10-07T06:01:35Z)
Geometric Deep Learning for Structure-Based Drug Design: A Survey [83.87489798671155]
Structure-based drug design (SBDD) leverages the three-dimensional geometry of proteins to identify potential drug candidates. Recent advancements in geometric deep learning, which effectively integrate and process 3D geometric data, have significantly propelled the field forward.
arXiv Detail & Related papers (2023-06-20T14:21:58Z)
CateCom: a practical data-centric approach to categorization of computational models [77.34726150561087]
We present an effort aimed at organizing the landscape of physics-based and data-driven computational models. We apply object-oriented design concepts and outline the foundations of an open-source collaborative framework.
arXiv Detail & Related papers (2021-09-28T02:59:40Z)
IFCNet: A Benchmark Dataset for IFC Entity Classification [0.0]
This work presents IFCNet, a dataset of single-entity IFC files spanning a broad range of IFC classes containing both geometric and semantic information. Using only the geometric information of objects, the experiments show that three different deep learning models are able to achieve good classification performance.
arXiv Detail & Related papers (2021-06-17T17:59:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.