Integrating pre-processing pipelines in ODC based framework
- URL: http://arxiv.org/abs/2210.01528v1
- Date: Tue, 4 Oct 2022 11:12:09 GMT
- Title: Integrating pre-processing pipelines in ODC based framework
- Authors: U. Otamendi (1), I. Azpiroz (1), M. Quartulli (1), I. Olaizola (1) ((1) Vicomtech Foundation)
- Abstract summary: This paper proposes a method to integrate virtual products built on open-source processing pipelines.
To validate and evaluate this approach, we have integrated it into a geo-imagery management framework.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Using on-demand processing pipelines to generate virtual geospatial products
helps optimize resource management and reduces processing requirements and
data storage space. Additionally, pre-processed products improve data quality
for data-driven analytical algorithms, such as machine learning or deep
learning models. This paper proposes a method to integrate virtual products
built on open-source processing pipelines. To validate and evaluate this
approach, we have integrated it into a geo-imagery management framework based
on Open Data Cube (ODC). To validate the methodology, we have performed three
experiments that develop on-demand processing pipelines using multi-sensor
remote sensing data, such as Sentinel-1 and Sentinel-2. These pipelines are
integrated using open-source processing frameworks.
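As a rough illustration of the on-demand idea described in the abstract, the sketch below stores only a recipe (a loader plus a list of processing steps) and computes the derived product when it is requested, so no pre-processed copy has to be kept on disk. It is not the authors' implementation and does not use the ODC API: the names `VirtualProduct`, `mask_nodata`, `add_ndvi`, and `load_toy_scene`, the band names, and the pixel values are all hypothetical and chosen only for this example.

```python
from typing import Callable, Dict, List

Band = List[float]               # one flattened raster band
Scene = Dict[str, Band]          # band name -> pixel values
Step = Callable[[Scene], Scene]  # one pipeline stage


class VirtualProduct:
    """Hypothetical on-demand product: keeps a recipe, not the pixels."""

    def __init__(self, loader: Callable[[], Scene], steps: List[Step]):
        self.loader = loader
        self.steps = steps

    def compute(self) -> Scene:
        # Nothing is loaded or processed until compute() is called.
        scene = self.loader()
        for step in self.steps:
            scene = step(scene)
        return scene


def mask_nodata(scene: Scene, nodata: float = 0.0) -> Scene:
    """Drop every pixel position where any band equals the nodata value."""
    names = list(scene)
    size = len(scene[names[0]])
    keep = [i for i in range(size) if all(scene[n][i] != nodata for n in names)]
    return {n: [scene[n][i] for i in keep] for n in names}


def add_ndvi(scene: Scene) -> Scene:
    """Append NDVI = (nir - red) / (nir + red) as a derived band."""
    ndvi = [
        (n - r) / (n + r) if (n + r) != 0 else float("nan")
        for n, r in zip(scene["nir"], scene["red"])
    ]
    return {**scene, "ndvi": ndvi}


def load_toy_scene() -> Scene:
    # Stand-in for a real data-cube query; the values are made up.
    return {"red": [0.1, 0.0, 0.2], "nir": [0.5, 0.4, 0.6]}


product = VirtualProduct(load_toy_scene, [mask_nodata, add_ndvi])
result = product.compute()  # the pipeline runs only here
```

In this sketch the second pixel is discarded by `mask_nodata` because its red value equals the assumed nodata value, and `add_ndvi` then runs on the two remaining pixels. Swapping or reordering entries in the step list changes the product without touching stored data, which is the storage trade-off the abstract points to.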
Related papers
- Control and Automation for Industrial Production Storage Zone: Generation of Optimal Route Using Image Processing [49.1574468325115]
This article focuses on developing an industrial automation method for a zone of a production line model using digital image processing (DIP).
The neo-cascade methodology employed allowed for defining each of the stages in an adequate way, ensuring the inclusion of the relevant methods for its development.
The system was based on the OpenCV library, a tool focused on artificial vision, and was implemented on an object-oriented programming (OOP) platform based on the Java language.
arXiv Detail & Related papers (2024-03-15T06:50:19Z)
- An Integrated Data Processing Framework for Pretraining Foundation Models [57.47845148721817]
Researchers and practitioners often have to manually curate datasets from different sources.
We propose a data processing framework that integrates a Processing Module and an Analyzing Module.
The proposed framework is easy to use and highly flexible.
arXiv Detail & Related papers (2024-02-26T07:22:51Z)
- Trusted Provenance of Automated, Collaborative and Adaptive Data Processing Pipelines [2.186901738997927]
We provide a solution architecture and a proof of concept implementation of a service, called Provenance Holder.
Provenance Holder enables provenance of collaborative, adaptive data processing pipelines in a trusted manner.
arXiv Detail & Related papers (2023-10-17T17:52:27Z)
- Deep Pipeline Embeddings for AutoML [11.168121941015015]
AutoML is a promising direction for democratizing AI by automatically deploying Machine Learning systems with minimal human expertise.
Existing Pipeline Optimization techniques fail to explore deep interactions between pipeline stages/components.
This paper proposes a novel neural architecture that captures the deep interaction between the components of a Machine Learning pipeline.
arXiv Detail & Related papers (2023-05-23T12:40:38Z)
- Nemo: Guiding and Contextualizing Weak Supervision for Interactive Data Programming [77.38174112525168]
We present Nemo, an end-to-end interactive Supervision system that improves the overall productivity of the WS learning pipeline by an average of 20% (and up to 47% in one task) compared to the prevailing WS supervision approach.
arXiv Detail & Related papers (2022-03-02T19:57:32Z)
- Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines [77.45213180689952]
Preprocessing pipelines in deep learning aim to provide sufficient data throughput to keep the training processes busy.
We introduce a new perspective on efficiently preparing datasets for end-to-end deep learning pipelines.
We obtain an increased throughput of 3x to 13x compared to an untuned system.
arXiv Detail & Related papers (2022-02-17T14:31:58Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper, we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using a basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Automated Evolutionary Approach for the Design of Composite Machine Learning Pipelines [48.7576911714538]
The proposed approach aims to automate the design of composite machine learning pipelines.
It designs the pipelines with a customizable graph-based structure, analyzes the obtained results, and reproduces them.
The software implementation of this approach is presented as an open-source framework.
arXiv Detail & Related papers (2021-06-26T23:19:06Z)
- MLCask: Efficient Management of Component Evolution in Collaborative Data Analytics Pipelines [29.999324319722508]
We tackle two main challenges that arise during the deployment of machine learning pipelines through the design of versioning for MLCask, an end-to-end analytics system.
We define and accelerate the metric-driven merge operation by pruning the pipeline search tree using reusable history records and pipeline compatibility information.
The effectiveness of MLCask is evaluated through an extensive study over several real-world deployment cases.
arXiv Detail & Related papers (2020-10-17T13:34:48Z)