Data Architecture for Digital Object Space Management Service (DOSM) using DAT
- URL: http://arxiv.org/abs/2306.12909v3
- Date: Sun, 23 Jul 2023 21:08:44 GMT
- Title: Data Architecture for Digital Object Space Management Service (DOSM) using DAT
- Authors: Moamin Abughazala, Henry Muccini
- Abstract summary: This work focuses on describing the movement of data, data formats, data location, data processing (batch or real-time), data storage technologies, and main operations on the data.
Data architecture is a complex task that involves describing the flow of data from its source to its destination.
- Score: 1.8945921149936187
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Internet of Things (IoT) data and social media data are two of the
fastest-growing data segments. Having high-quality data is crucial for making
informed business decisions. The strategic process of leveraging insights from
data is known as data-driven decision-making. To achieve this, it is necessary
to collect, store, analyze, and protect data in the best ways possible. Data
architecture is a complex task that involves describing the flow of data from
its source to its destination and creating a blueprint for managing the data to
meet business needs for information. In this paper, we utilize the Data
Architecture Tool (DAT) to model data for the Digital Object Space Management
Service (DOSM), which was developed as part of the VASARI project. This work focuses on
describing the movement of data, data formats, data location, data processing
(batch or real-time), data storage technologies, and main operations on the
data.
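The aspects the abstract lists (data movement, format, location, batch or real-time processing, storage technology, and operations) can be pictured as fields of a small model. The sketch below is only an illustration under stated assumptions: the class names, fields, and the example pipeline are hypothetical and are not taken from DAT's metamodel or from the DOSM design.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class ProcessingMode(Enum):
    BATCH = "batch"
    REAL_TIME = "real-time"


@dataclass(frozen=True)
class DataNode:
    """One stop on the data's path (hypothetical, not DAT's notation)."""
    name: str
    data_format: str               # e.g. "JSON", "Parquet"
    location: str                  # e.g. "on-premise", "cloud"
    storage: Optional[str] = None  # storage technology, if the node persists data


@dataclass(frozen=True)
class DataFlow:
    """Movement of data between two nodes, with the operation applied."""
    source: DataNode
    target: DataNode
    mode: ProcessingMode
    operation: str


@dataclass
class DataArchitecture:
    flows: List[DataFlow] = field(default_factory=list)

    def path(self, start: str) -> List[str]:
        """Follow flows from `start` to the final destination.

        Assumes each node has at most one outgoing flow; stops on a cycle.
        """
        next_hop: Dict[str, str] = {f.source.name: f.target.name for f in self.flows}
        route = [start]
        while route[-1] in next_hop and next_hop[route[-1]] not in route:
            route.append(next_hop[route[-1]])
        return route


# Illustrative pipeline; node names and technologies are made up.
sensor = DataNode("sensor", "JSON", "on-premise")
stream = DataNode("stream", "JSON", "cloud", storage="message queue")
lake = DataNode("lake", "Parquet", "cloud", storage="object store")

arch = DataArchitecture([
    DataFlow(sensor, stream, ProcessingMode.REAL_TIME, "ingest"),
    DataFlow(stream, lake, ProcessingMode.BATCH, "aggregate"),
])
route = arch.path("sensor")
```

Here `route` lists the node names from source to destination, which is the "flow of data from its source to its destination" that the abstract describes.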
Related papers
- Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models [79.65071553905021]
We propose Data Advisor, a method for generating data that takes into account the characteristics of the desired dataset.
Data Advisor monitors the status of the generated data, identifies weaknesses in the current dataset, and advises the next iteration of data generation.
arXiv Detail & Related papers (2024-10-07T17:59:58Z)
- OpenDataLab: Empowering General Artificial Intelligence with Open Datasets [53.22840149601411]
This paper introduces OpenDataLab, a platform designed to bridge the gap between diverse data sources and the need for unified data processing.
OpenDataLab integrates a wide range of open-source AI datasets and enhances data acquisition efficiency through intelligent querying and high-speed downloading services.
We anticipate that OpenDataLab will significantly boost artificial general intelligence (AGI) research and facilitate advancements in related AI fields.
arXiv Detail & Related papers (2024-06-04T10:42:01Z)
- Architectural Design Decisions for Self-Serve Data Platforms in Data Meshes [3.627365672061558]
Data mesh is an emerging decentralized approach to managing and generating value from analytical enterprise data at scale.
It shifts the ownership of the data to the business domains closest to the data, promotes sharing and managing data as autonomous products, and uses a federated and automated data governance model.
The data mesh relies on a managed data platform that offers services to domain and governance teams to build, share, and manage data products efficiently.
arXiv Detail & Related papers (2024-02-07T09:13:26Z)
- Architecting Data-Intensive Applications: From Data Architecture Design to Its Quality Assurance [0.0]
Data Architecture is crucial in describing, collecting, storing, processing, and analyzing data to meet business needs.
We have evaluated the DAT on more than five cases within various industry domains, demonstrating its exceptional adaptability and effectiveness.
arXiv Detail & Related papers (2024-01-22T14:58:54Z)
- Data Acquisition: A New Frontier in Data-centric AI [65.90972015426274]
We first present an investigation of current data marketplaces, revealing a lack of platforms offering detailed information about datasets.
We then introduce the DAM challenge, a benchmark to model the interaction between the data providers and acquirers.
Our evaluation of the submitted strategies underlines the need for effective data acquisition strategies in Machine Learning.
arXiv Detail & Related papers (2023-11-22T22:15:17Z)
- Modeling Data Analytics Architecture for Smart Cities Data-Driven Applications using DAT [1.8945921149936187]
This article shares our experiences in developing a Data Analytics Architecture (DAA) using model-driven engineering for Data-Driven Smart Cities applications utilizing DAT.
arXiv Detail & Related papers (2023-07-17T21:52:57Z)
- DAT: Data Architecture Modeling Tool for Data-Driven Applications [1.6037279419318131]
Data Architecture (DA) focuses on describing, collecting, storing, processing, and analyzing the data to meet business needs.
We present the DAT, a model-driven engineering tool enabling data architects, data engineers, and other stakeholders to describe how data flows through the system.
arXiv Detail & Related papers (2023-06-21T11:24:59Z)
- A Comprehensive Survey of Dataset Distillation [73.15482472726555]
It has become challenging to handle the unlimited growth of data with limited computing power.
Deep learning technology has developed unprecedentedly in the last decade.
This paper provides a holistic understanding of dataset distillation from multiple aspects.
arXiv Detail & Related papers (2023-01-13T15:11:38Z)
- A Big Data Lake for Multilevel Streaming Analytics [0.4640835690336652]
This paper focuses on storing high-volume, high-velocity, and high-variety data in its raw formats in a data storage architecture called a data lake.
We discuss and compare different open source and commercial platforms that can be used to develop a data lake.
We present a real-world data lake development use case for data stream ingestion, staging, and multilevel streaming analytics.
arXiv Detail & Related papers (2020-09-25T19:57:21Z)
- Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data [78.74367441804183]
We introduce Neural Data Server (NDS), a large-scale search engine for finding the transfer learning data most useful to the target domain.
NDS consists of a dataserver which indexes several large popular image datasets, and aims to recommend data to a client.
We show the effectiveness of NDS in various transfer learning scenarios, demonstrating state-of-the-art performance on several target datasets.
arXiv Detail & Related papers (2020-01-09T01:21:30Z)
- DeGAN: Data-Enriching GAN for Retrieving Representative Samples from a Trained Classifier [58.979104709647295]
We bridge the gap between the abundance of available data and the lack of relevant data for the future learning tasks of a trained network.
We use the available data, which may be an imbalanced subset of the original training dataset or a related domain dataset, to retrieve representative samples.
We demonstrate that data from a related domain can be leveraged to achieve state-of-the-art performance.
arXiv Detail & Related papers (2019-12-27T02:05:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.