ZeroML: A Next Generation AutoML Language
- URL: http://arxiv.org/abs/2505.18243v1
- Date: Fri, 23 May 2025 16:01:49 GMT
- Title: ZeroML: A Next Generation AutoML Language
- Authors: Monirul Islam Mahmud
- Abstract summary: ZeroML is a new-generation programming language for AutoML that drives the ML pipeline in a compiled, multi-paradigm way. ZeroML brings a microservices-based architecture with modular, reusable pieces such as DataCleaner, FeatureEngineer, and ModelSelector.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: ZeroML is a new-generation programming language for AutoML that drives the ML pipeline in a compiled, multi-paradigm way with a pure functional core. Addressing shortcomings of Python, R, and Julia, such as slow runtimes, brittle pipelines, and high dependency costs, ZeroML brings a microservices-based architecture with modular, reusable pieces such as DataCleaner, FeatureEngineer, and ModelSelector. As a natively multithreaded, memory-aware, search-optimized toolkit with one-command deployability, ZeroML enables both non-coders and ML professionals to create high-accuracy models quickly and reproducibly. The explicitness of the language keeps backend code extremely clear, while the repetition and boilerplate once required on the front end are removed.
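The abstract describes a pipeline built from modular, reusable components. Since the summary does not show ZeroML's actual syntax or API, here is a minimal Python sketch of that microservices-style composition; the component names come from the abstract, but their interfaces are hypothetical.

```python
# Hypothetical sketch of the modular pipeline the abstract describes.
# The stage names (DataCleaner, FeatureEngineer, ModelSelector) are taken
# from the abstract; their interfaces are assumptions for illustration.

class DataCleaner:
    """Drops rows containing missing (None) values."""
    def run(self, rows):
        return [r for r in rows if None not in r.values()]

class FeatureEngineer:
    """Adds a derived feature from existing columns."""
    def run(self, rows):
        for r in rows:
            r["area"] = r["width"] * r["height"]
        return rows

class ModelSelector:
    """Picks the candidate 'model' (here: a summary statistic) with the lowest value."""
    def run(self, rows):
        candidates = {
            "mean": sum(r["area"] for r in rows) / len(rows),
            "max": max(r["area"] for r in rows),
        }
        # Trivial stand-in for a real search over model configurations.
        return min(candidates.items(), key=lambda kv: kv[1])

def pipeline(rows, stages):
    """Chain reusable stages, mirroring ZeroML's composable components."""
    result = rows
    for stage in stages:
        result = stage.run(result)
    return result

if __name__ == "__main__":
    data = [
        {"width": 2, "height": 3},
        {"width": 4, "height": None},  # removed by DataCleaner
        {"width": 5, "height": 2},
    ]
    name, value = pipeline(data, [DataCleaner(), FeatureEngineer(), ModelSelector()])
    print(name, value)
```

Each stage exposes the same `run` interface, so stages can be swapped or reordered independently, which is the reusability property the abstract claims for ZeroML's components.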
Related papers
- ML2B: Multi-Lingual ML Benchmark For AutoML [0.0]
We present ML2B, the first benchmark for evaluating multilingual machine learning code generation. For evaluation, we employ AIDE, an automated framework for end-to-end assessment of data science pipelines. Results reveal a substantial 15-45% performance degradation on non-English tasks.
arXiv Detail & Related papers (2025-09-26T17:20:27Z) - Type-Constrained Code Generation with Language Models [51.03439021895432]
We introduce a type-constrained decoding approach that leverages type systems to guide code generation. For this purpose, we develop novel prefix automata and a search over inhabitable types, forming a sound approach to enforce well-typedness on LLM-generated code. Our approach reduces compilation errors by more than half and significantly increases functional correctness in code synthesis, translation, and repair tasks.
arXiv Detail & Related papers (2025-04-12T15:03:00Z) - Langformers: Unified NLP Pipelines for Language Models [3.690904966341072]
Langformers is an open-source Python library designed to streamline NLP pipelines. It integrates conversational AI, pretraining, text classification, sentence embedding/reranking, data labelling, semantic search, and knowledge distillation into a cohesive API.
arXiv Detail & Related papers (2025-04-12T10:17:49Z) - AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline. Recent works have started exploiting large language models (LLMs) to lessen this burden. This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z) - Verbalized Machine Learning: Revisiting Machine Learning with Language Models [63.10391314749408]
We introduce the framework of verbalized machine learning (VML). VML constrains the parameter space to be human-interpretable natural language. We empirically verify the effectiveness of VML, and hope that VML can serve as a stepping stone toward stronger interpretability.
arXiv Detail & Related papers (2024-06-06T17:59:56Z) - Meaning-Typed Programming: Language Abstraction and Runtime for Model-Integrated Applications [8.007302441327214]
This paper presents the Meaning-Typed Programming (MTP) model, a novel paradigm that abstracts large language model (LLM) integration through intuitive language-level constructs. We implement MTP in Jac, a Python superset language, and find that MTP significantly reduces coding complexity while maintaining accuracy and efficiency. For math problems from the GSM8k dataset, MTP achieves accuracy rates approaching 90%, while reducing token usage in 10 out of 13 benchmarks.
arXiv Detail & Related papers (2024-05-14T21:12:01Z) - Large Language Models Synergize with Automated Machine Learning [12.364087286739647]
This paper explores a novel form of program synthesis, targeting machine learning (ML) programs, by combining large language models (LLMs) and automated machine learning (AutoML).
In experiments, given the textual task description, our method, Text-to-ML, generates the complete and optimized ML program in a fully autonomous process.
arXiv Detail & Related papers (2024-05-06T08:09:46Z) - ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code [76.84199699772903]
ML-Bench is a benchmark rooted in real-world programming applications that leverage existing code repositories to perform tasks.
To evaluate both Large Language Models (LLMs) and AI agents, two setups are employed: ML-LLM-Bench for assessing LLMs' text-to-code conversion within a predefined deployment environment, and ML-Agent-Bench for testing autonomous agents in an end-to-end task execution within a Linux sandbox environment.
arXiv Detail & Related papers (2023-11-16T12:03:21Z) - LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z) - VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space Decomposition [57.06900573003609]
VolcanoML is a framework that decomposes a large AutoML search space into smaller ones.
It supports a Volcano-style execution model, akin to the one supported by modern database systems.
Our evaluation demonstrates that VolcanoML not only raises the level of expressiveness for search space decomposition in AutoML, but also uncovers effective decomposition strategies in practice.
arXiv Detail & Related papers (2021-07-19T13:23:57Z) - From Things' Modeling Language (ThingML) to Things' Machine Learning (ThingML2) [4.014524824655106]
We enhance ThingML to support machine learning on the modeling level.
Our DSL allows one to define things, which are in charge of carrying out data analytics.
Our code generators can automatically produce the complete implementation in Java and Python.
arXiv Detail & Related papers (2020-09-22T15:44:57Z) - DriveML: An R Package for Driverless Machine Learning [7.004573941239386]
DriveML helps in implementing some of the pillars of an automated machine learning pipeline.
The main benefits of DriveML are development time savings, fewer developer errors, and optimal tuning of machine learning models.
arXiv Detail & Related papers (2020-05-01T16:40:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.