Chameleon: A Semi-AutoML framework targeting quick and scalable
development and deployment of production-ready ML systems for SMEs
- URL: http://arxiv.org/abs/2105.03669v1
- Date: Sat, 8 May 2021 10:43:26 GMT
- Authors: Johannes Otterbach, Thomas Wollmann
- Abstract summary: We discuss the implementation and concepts of Chameleon, a semi-AutoML framework.
The goal of Chameleon is fast and scalable development and deployment of production-ready machine learning systems into the workflow of SMEs.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Developing, scaling, and deploying modern Machine Learning solutions remains
challenging for small- and middle-sized enterprises (SMEs). This is due to a
high entry barrier of building and maintaining a dedicated IT team as well as
the difficulties of real-world data (RWD) compared to standard benchmark data.
To address this challenge, we discuss the implementation and concepts of
Chameleon, a semi-AutoML framework. The goal of Chameleon is fast and scalable
development and deployment of production-ready machine learning systems into
the workflow of SMEs. We first discuss the RWD challenges faced by SMEs. Next,
we outline the central part of the framework, a model and loss-function
zoo with RWD-relevant defaults. Subsequently, we present how a templatable
framework can automate the experiment iteration cycle, as
well as close the gap between development and deployment. Finally, we touch on
our testing framework component allowing us to investigate common model failure
modes and support best practices of model deployment governance.
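The model and loss-function zoo described above can be pictured as a registry that pairs each task with defaults tuned for real-world data, which users override only where needed. The following is a minimal illustrative sketch; all names (`ZOO`, `register`, `build`) and the example defaults are assumptions for exposition, not Chameleon's actual API.

```python
# Sketch of a model/loss "zoo" with task-level, RWD-relevant defaults.
# Names and defaults are hypothetical illustrations, not Chameleon's API.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class ZooEntry:
    model_factory: Callable[[Dict[str, Any]], Any]
    loss_factory: Callable[[Dict[str, Any]], Any]
    defaults: Dict[str, Any] = field(default_factory=dict)


ZOO: Dict[str, ZooEntry] = {}


def register(task, model_factory, loss_factory, **defaults):
    """Register a model/loss pair with defaults tuned for real-world data."""
    ZOO[task] = ZooEntry(model_factory, loss_factory, defaults)


def build(task, **overrides):
    """Instantiate model and loss for a task, merging user overrides over defaults."""
    entry = ZOO[task]
    cfg = {**entry.defaults, **overrides}
    return entry.model_factory(cfg), entry.loss_factory(cfg)


# Example: imbalanced tabular classification gets a class-weighted loss by default.
register(
    "tabular_classification",
    model_factory=lambda cfg: {"type": "gbdt", "depth": cfg["depth"]},
    loss_factory=lambda cfg: {"type": "weighted_ce",
                              "class_weights": cfg["class_weights"]},
    depth=6,
    class_weights="balanced",
)

model, loss = build("tabular_classification", depth=4)  # override one default
```

The point of such defaults is that an SME user starts from settings that already handle common RWD issues (class imbalance, noisy labels) instead of a blank configuration.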
Related papers
- Hierarchical and Decoupled BEV Perception Learning Framework for Autonomous Driving [52.808273563372126]
This paper proposes a novel hierarchical Bird's-eye-view (BEV) perception paradigm.
It aims to provide a library of fundamental perception modules and a user-friendly graphical interface.
We conduct the Pretrain-Finetune strategy to effectively utilize large-scale public datasets and streamline development processes.
arXiv Detail & Related papers (2024-07-17T11:17:20Z)
- Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models [106.94827590977337]
We propose a novel world model for Multi-Agent RL (MARL) that learns decentralized local dynamics for scalability.
We also introduce a Perceiver Transformer as an effective solution to enable centralized representation aggregation.
Results on Starcraft Multi-Agent Challenge (SMAC) show that it outperforms strong model-free approaches and existing model-based methods in both sample efficiency and overall performance.
arXiv Detail & Related papers (2024-06-22T12:40:03Z)
- Emerging Platforms Meet Emerging LLMs: A Year-Long Journey of Top-Down Development [20.873143073842705]
We introduce TapML, a top-down approach and tooling designed to streamline the deployment of machine learning systems on diverse platforms.
Unlike traditional bottom-up methods, TapML automates unit testing and adopts a migration-based strategy for gradually offloading model computations.
TapML was developed and applied through a year-long, real-world effort that successfully deployed significant emerging models and platforms.
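The migration-based strategy in the TapML summary can be pictured as running each operator on the target backend once it has been ported and validated, while unported operators fall back to a trusted reference backend. This is a hypothetical sketch of that idea; the backend dictionaries and `run_op` helper are illustrative assumptions, not TapML's actual tooling.

```python
# Illustrative sketch of migration-based offloading: ops migrate to the target
# backend one at a time; anything not yet ported runs on the reference backend.
reference_backend = {
    "scale": lambda a, b: a * b,        # trusted reference implementations
    "relu": lambda x: max(x, 0),
}
target_backend = {
    "relu": lambda x: max(x, 0),        # only 'relu' ported to the new platform so far
}


def run_op(name, *args):
    """Prefer the target backend; fall back to the reference for unported ops."""
    impl = target_backend.get(name) or reference_backend[name]
    return impl(*args)


relu_out = run_op("relu", -3)       # served by the target backend
scale_out = run_op("scale", 2, 5)   # not yet migrated; reference fallback
```

Gradually shrinking the set of fallback operators, with unit tests gating each migration, is what lets the whole model keep running end-to-end throughout the port.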
arXiv Detail & Related papers (2024-04-14T06:09:35Z)
- Reusable MLOps: Reusable Deployment, Reusable Infrastructure and Hot-Swappable Machine Learning models and services [0.0]
We introduce a new sustainable concept in the field of AI/ML operations - called Reusable MLOps.
We reuse the existing deployment and infrastructure to serve new models by hot-swapping them without tearing down the infrastructure or the microservice.
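Hot-swapping as described in this summary amounts to replacing the model reference inside a running service atomically, so the serving process never restarts. A minimal sketch, assuming a simple in-process server; the `ModelServer` class and its methods are illustrative, not the paper's actual implementation.

```python
# Minimal sketch of hot-swapping a served model without tearing down the service.
import threading


class ModelServer:
    """Serves predictions while allowing the model to be swapped atomically."""

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()

    def predict(self, x):
        with self._lock:              # readers grab a consistent model reference
            model = self._model
        return model(x)

    def hot_swap(self, new_model):
        with self._lock:              # swap the model; the server keeps running
            self._model = new_model


server = ModelServer(lambda x: x * 2)   # v1 model (a stand-in callable)
v1_out = server.predict(3)
server.hot_swap(lambda x: x + 10)       # deploy v2 into the live service
v2_out = server.predict(3)
```

In a real microservice the same pattern applies behind a request handler: in-flight requests finish against the old model while new requests see the new one, and the surrounding infrastructure is reused unchanged.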
arXiv Detail & Related papers (2024-02-19T23:40:46Z)
- Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror the multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
- Model Share AI: An Integrated Toolkit for Collaborative Machine Learning Model Development, Provenance Tracking, and Deployment in Python [0.0]
We introduce Model Share AI (AIMS), an easy-to-use MLOps platform designed to streamline collaborative model development, model provenance tracking, and model deployment.
AIMS features collaborative project spaces and a standardized model evaluation process that ranks model submissions based on their performance on unseen evaluation data.
AIMS allows users to deploy ML models built in Scikit-Learn, Keras, PyTorch, and ONNX into live REST APIs and automatically generated web apps.
arXiv Detail & Related papers (2023-09-27T15:24:39Z)
- Predicting Resource Consumption of Kubernetes Container Systems using Resource Models [3.138731415322007]
This paper considers how to derive resource models for cloud systems empirically.
We do so based on models of deployed services in a formal language with explicit adherence to CPU and memory resources.
We report on leveraging data collected empirically from small deployments to simulate the execution of higher intensity scenarios on larger deployments.
arXiv Detail & Related papers (2023-05-12T17:59:01Z)
- OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System [85.8338446357469]
We introduce OmniForce, a human-centered AutoML system that yields both human-assisted ML and ML-assisted human techniques.
We show how OmniForce can put an AutoML system into practice and build adaptive AI in open-environment scenarios.
arXiv Detail & Related papers (2023-03-01T13:35:22Z)
- FedNet2Net: Saving Communication and Computations in Federated Learning with Model Growing [0.0]
Federated learning (FL) is a recently developed area of machine learning.
In this paper, a novel scheme based on the notion of "model growing" is proposed.
The proposed approach is tested extensively on three standard benchmarks and is shown to achieve substantial reduction in communication and client computation.
arXiv Detail & Related papers (2022-07-19T21:54:53Z)
- Real-time Neural-MPC: Deep Learning Model Predictive Control for Quadrotors and Agile Robotic Platforms [59.03426963238452]
We present Real-time Neural MPC, a framework to efficiently integrate large, complex neural network architectures as dynamics models within a model-predictive control pipeline.
We show the feasibility of our framework on real-world problems by reducing the positional tracking error by up to 82% when compared to state-of-the-art MPC approaches without neural network dynamics.
arXiv Detail & Related papers (2022-03-15T09:38:15Z)
- Technology Readiness Levels for Machine Learning Systems [107.56979560568232]
Development and deployment of machine learning systems can be executed easily with modern tools, but the process is typically rushed and treated as a means to an end.
We have developed a proven systems engineering approach for machine learning development and deployment.
Our "Machine Learning Technology Readiness Levels" framework defines a principled process to ensure robust, reliable, and responsible systems.
arXiv Detail & Related papers (2021-01-11T15:54:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.