MLExchange: A web-based platform enabling exchangeable machine learning
workflows
- URL: http://arxiv.org/abs/2208.09751v2
- Date: Tue, 23 Aug 2022 07:14:51 GMT
- Title: MLExchange: A web-based platform enabling exchangeable machine learning
workflows
- Authors: Zhuowen Zhao, Tanny Chavez, Elizabeth Holman, Guanhua Hao, Adam Green,
Harinarayan Krishnan, Dylan McReynolds, Ronald Pandolfi, Eric J. Roberts,
Petrus H. Zwart, Howard Yanxon, Nicholas Schwarz, Subramanian
Sankaranarayanan, Sergei V. Kalinin, Apurva Mehta, Stuart Campbell, Alexander
Hexemer
- Abstract summary: The MLExchange project aims to build a collaborative platform equipped with tools that allow scientists and facility users who do not have a profound ML background to use ML and computational resources in scientific discovery.
The whole platform or its individual service(s) can be easily deployed at servers of different scales, ranging from a laptop (usually a single user) to high performance clusters accessed (simultaneously) by many users.
- Score: 41.066688323596374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning (ML) algorithms are increasingly helping
scientific communities across different disciplines and institutions address
large and diverse data problems. However, many available ML tools are
programmatically demanding and computationally costly. The MLExchange project
aims to build a collaborative platform equipped with enabling tools that allow
scientists and facility users who do not have a profound ML background to use
ML and computational resources in scientific discovery. At a high level, we
are targeting a full user experience where managing and exchanging ML
algorithms, workflows, and data are readily available through web applications.
So far, we have built four major components, i.e., the central job manager, the
centralized content registry, the user portal, and the search engine, and
successfully deployed these components on a testing server.
Since each component is an independent container, the whole platform or its
individual service(s) can be easily deployed at servers of different scales,
ranging from a laptop (usually a single user) to high performance clusters
(HPC) accessed (simultaneously) by many users. Thus, MLExchange supports
flexible usage scenarios: users can either access the services and
resources from a remote server or run the whole platform or its individual
service(s) within their local network.
Related papers
- Towards Human-Guided, Data-Centric LLM Co-Pilots [53.35493881390917]
CliMB-DC is a human-guided, data-centric framework for machine learning co-pilots.
It combines advanced data-centric tools with LLM-driven reasoning to enable robust, context-aware data processing.
We show how CliMB-DC can transform uncurated datasets into ML-ready formats.
arXiv Detail & Related papers (2025-01-17T17:51:22Z)
- Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? [73.81908518992161]
We introduce Spider2-V, the first multimodal agent benchmark focusing on professional data science and engineering.
Spider2-V features real-world tasks in authentic computer environments and incorporates 20 enterprise-level professional applications.
These tasks evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems.
arXiv Detail & Related papers (2024-07-15T17:54:37Z)
- VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding? [115.60866817774641]
Multimodal Large Language models (MLLMs) have shown promise in web-related tasks.
Evaluating their performance in the web domain remains a challenge due to the lack of comprehensive benchmarks.
VisualWebBench is a multimodal benchmark designed to assess the capabilities of MLLMs across a variety of web tasks.
arXiv Detail & Related papers (2024-04-09T02:29:39Z)
- CRAFT: Customizing LLMs by Creating and Retrieving from Specialized
Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs).
It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks.
Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z)
- MLOps: A Step Forward to Enterprise Machine Learning [0.0]
This research presents a detailed review of MLOps, its benefits, difficulties, evolutions, and important underlying technologies.
The MLOps workflow is explained in detail along with the various tools necessary for both model and data exploration and deployment.
This article also sheds light on the end-to-end production of ML projects using various maturity levels of automated pipelines.
arXiv Detail & Related papers (2023-05-27T20:44:14Z)
- Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and back propagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
arXiv Detail & Related papers (2022-11-20T10:49:22Z)
- Federated Learning and Meta Learning: Approaches, Applications, and
Directions [94.68423258028285]
In this tutorial, we present a comprehensive review of FL, meta learning, and federated meta learning (FedMeta).
Unlike other tutorial papers, our objective is to explore how FL, meta learning, and FedMeta methodologies can be designed, optimized, and evolved, and their applications over wireless networks.
arXiv Detail & Related papers (2022-10-24T10:59:29Z)
- BPMN4sML: A BPMN Extension for Serverless Machine Learning. Technology
Independent and Interoperable Modeling of Machine Learning Workflows and
their Serverless Deployment Orchestration [0.0]
Machine learning (ML) continues to permeate all layers of academia, industry and society.
Business Process Model and Notation (BPMN) is widely accepted and applied.
BPMN is short of specific support to represent machine learning.
We introduce BPMN4sML (BPMN for serverless machine learning).
arXiv Detail & Related papers (2022-08-02T10:36:00Z)
- Walle: An End-to-End, General-Purpose, and Large-Scale Production System
for Device-Cloud Collaborative Machine Learning [40.09527159285327]
We build the first end-to-end and general-purpose system, called Walle, for device-cloud collaborative machine learning (ML).
Walle consists of a deployment platform, distributing ML tasks to billion-scale devices in time; a data pipeline, efficiently preparing task input; and a compute container, providing a cross-platform and high-performance execution environment.
We evaluate Walle in practical e-commerce application scenarios to demonstrate its effectiveness, efficiency, and scalability.
arXiv Detail & Related papers (2022-05-30T03:43:35Z)
- Widening Access to Applied Machine Learning with TinyML [1.1678513163359947]
We describe our pedagogical approach to increasing access to applied machine learning (ML) through a massive open online course (MOOC) on Tiny Machine Learning (TinyML).
To this end, a collaboration between academia (Harvard University) and industry (Google) produced a four-part MOOC that provides application-oriented instruction on how to develop solutions using TinyML.
The series is openly available on the edX MOOC platform, has no prerequisites beyond basic programming, and is designed for learners from a global variety of backgrounds.
arXiv Detail & Related papers (2021-06-07T23:31:47Z)
- MLModelCI: An Automatic Cloud Platform for Efficient MLaaS [15.029094196394862]
We release the platform as an open-source project on GitHub under the Apache 2.0 license.
Our system bridges the gap between current ML training and serving systems and thus frees developers from the manual and tedious work often associated with service deployment.
arXiv Detail & Related papers (2020-06-09T07:48:20Z)