Scaling Data Science Solutions with Semantics and Machine Learning:
Bosch Case
- URL: http://arxiv.org/abs/2308.01094v1
- Date: Wed, 2 Aug 2023 11:58:30 GMT
- Title: Scaling Data Science Solutions with Semantics and Machine Learning:
Bosch Case
- Authors: Baifan Zhou, Nikolay Nikolov, Zhuoxun Zheng, Xianghui Luo, Ognjen
Savkovic, Dumitru Roman, Ahmet Soylu, Evgeny Kharlamov
- Abstract summary: SemCloud is a semantics-enhanced cloud system with semantic technologies and machine learning.
The system has been evaluated in an industrial use case with millions of data points, thousands of repeated runs, and domain users, showing promising results.
- Score: 8.445414390004636
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Industry 4.0 and Internet of Things (IoT) technologies unlock an unprecedented
amount of data from factory production, posing big data challenges in volume
and variety. In that context, distributed computing solutions such as cloud
systems are leveraged to parallelise the data processing and reduce computation
time. As cloud systems become increasingly popular, there is increasing
demand for users who were not originally cloud experts (such as data
scientists and domain experts) to deploy their solutions on cloud systems.
However, it is non-trivial to address both the high demand for cloud system
users and the excessive time required to train them. To this end, we propose
SemCloud, a semantics-enhanced cloud system that couples a cloud system with
semantic technologies and machine learning. SemCloud relies on domain
ontologies and mappings for data integration, and parallelises the semantic
data integration and data analysis on distributed computing nodes. Furthermore,
SemCloud adopts adaptive Datalog rules and machine learning for automated
resource configuration, allowing non-cloud experts to use the cloud system. The
system has been evaluated in an industrial use case with millions of data points,
thousands of repeated runs, and domain users, showing promising results.
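The two ideas named in the abstract, mapping-based data integration run in parallel and rule-driven resource configuration, can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in rather than SemCloud's actual implementation: the `MAPPING` table, the ontology term names, the `choose_workers` rule (a plain Python function, not a Datalog rule), and the thread pool standing in for distributed computing nodes are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical mapping from source-specific column names to shared ontology terms.
MAPPING = {
    "curr": "hasCurrent",
    "I": "hasCurrent",
    "volt": "hasVoltage",
    "U": "hasVoltage",
}


def integrate(record):
    """Rename the keys of one raw record to ontology terms; drop unmapped keys."""
    return {MAPPING[key]: value for key, value in record.items() if key in MAPPING}


def choose_workers(n_records, records_per_worker=2, max_workers=4):
    """Toy stand-in for a configuration rule: scale workers with data size."""
    needed = -(-n_records // records_per_worker)  # ceiling division
    return max(1, min(max_workers, needed))


def integrate_parallel(records):
    """Integrate all records on a pool of workers, preserving input order."""
    workers = choose_workers(len(records))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(integrate, records))


raw = [
    {"curr": 7.1, "volt": 1.2},      # naming used by one hypothetical source
    {"I": 6.9, "U": 1.3, "id": 5},   # naming used by another; "id" is unmapped
]
unified = integrate_parallel(raw)
```

After integration, both records use the same ontology terms regardless of which source naming they started with, so downstream analysis can be written once against the shared vocabulary.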
Related papers
- Integrating Homomorphic Encryption and Trusted Execution Technology for
Autonomous and Confidential Model Refining in Cloud [4.21388107490327]
Homomorphic encryption and trusted execution environment technology can protect confidentiality for autonomous computation.
We propose to integrate these two techniques in the design of the model refining scheme.
arXiv Detail & Related papers (2023-08-02T06:31:41Z)
- CWD: A Machine Learning based Approach to Detect Unknown Cloud Workloads [3.523208537466129]
We develop a machine learning based technique to characterize, profile and predict workloads running in the cloud environment.
We also develop techniques to analyze the performance of the model in a standalone manner.
arXiv Detail & Related papers (2022-11-28T19:41:56Z)
- Kubric: A scalable dataset generator [73.78485189435729]
Kubric is a Python framework that interfaces with PyBullet and Blender to generate photo-realistic scenes, with rich annotations, and seamlessly scales to large jobs distributed over thousands of machines.
We demonstrate the effectiveness of Kubric by presenting a series of 13 different generated datasets for tasks ranging from studying 3D NeRF models to optical flow estimation.
arXiv Detail & Related papers (2022-03-07T18:13:59Z)
- Unsupervised Point Cloud Representation Learning with Deep Neural
Networks: A Survey [104.71816962689296]
Unsupervised point cloud representation learning has attracted increasing attention due to the constraint in large-scale point cloud labelling.
This paper provides a comprehensive review of unsupervised point cloud representation learning using deep neural networks.
arXiv Detail & Related papers (2022-02-28T07:46:05Z)
- Edge-Cloud Polarization and Collaboration: A Comprehensive Survey [61.05059817550049]
We conduct a systematic review for both cloud and edge AI.
We are the first to set up the collaborative learning mechanism for cloud and edge modeling.
We discuss potentials and practical experiences of some on-going advanced edge AI topics.
arXiv Detail & Related papers (2021-11-11T05:58:23Z)
- Auto-Split: A General Framework of Collaborative Edge-Cloud AI [49.750972428032355]
This paper describes the techniques and engineering practice behind Auto-Split, an edge-cloud collaborative prototype of Huawei Cloud.
To the best of our knowledge, there is no existing industry product that provides the capability of Deep Neural Network (DNN) splitting.
arXiv Detail & Related papers (2021-08-30T08:03:29Z)
- Cloud Computing Concept and Roots [0.0]
Cloud computing is a particular implementation of distributed computing.
It inherited many properties of distributed computing such as scalability, reliability and distribution transparency.
New processing and storage resources can be added into the Cloud resource pool seamlessly.
arXiv Detail & Related papers (2021-01-28T17:42:46Z)
- Sampling Training Data for Continual Learning Between Robots and the
Cloud [26.116999231118793]
We introduce HarvestNet, an intelligent sampling algorithm that resides on-board a robot and reduces system bottlenecks.
It significantly improves the accuracy of machine-learning models on our novel dataset of road construction sites, field testing of self-driving cars, and streaming face recognition.
It is 1.05x to 2.58x more accurate than baseline algorithms and runs scalably on embedded deep learning hardware.
arXiv Detail & Related papers (2020-12-12T05:52:33Z)
- Synthetic Data: Opening the data floodgates to enable faster, more
directed development of machine learning methods [96.92041573661407]
Many ground-breaking advancements in machine learning can be attributed to the availability of a large volume of rich data.
Many large-scale datasets are highly sensitive, such as healthcare data, and are not widely available to the machine learning community.
Generating synthetic data with privacy guarantees provides one such solution.
arXiv Detail & Related papers (2020-12-08T17:26:10Z)
- Anomaly Detection in a Large-scale Cloud Platform [9.283888139549067]
Cloud computing is ubiquitous: more and more companies are moving their workloads into the Cloud.
Service providers need to monitor the quality of their ever-growing offerings effectively.
We designed and implemented an automated monitoring system for the IBM Cloud Platform.
arXiv Detail & Related papers (2020-10-21T12:58:36Z)
- A Privacy-Preserving Distributed Architecture for
Deep-Learning-as-a-Service [68.84245063902908]
This paper introduces a novel distributed architecture for deep-learning-as-a-service.
It is able to protect users' sensitive data while providing Cloud-based machine and deep learning services.
arXiv Detail & Related papers (2020-03-30T15:12:03Z)