Grammar-based evolutionary approach for automated workflow composition
with domain-specific operators and ensemble diversity
- URL: http://arxiv.org/abs/2402.02124v1
- Date: Sat, 3 Feb 2024 11:29:14 GMT
- Title: Grammar-based evolutionary approach for automated workflow composition
with domain-specific operators and ensemble diversity
- Authors: Rafael Barbudo and Aurora Ramírez and José Raúl Romero
- Abstract summary: This paper introduces EvoFlow, a grammar-based evolutionary approach for automatic workflow composition (AWC).
EvoFlow enhances the flexibility in designing workflow structures, empowering practitioners to select algorithms that best fit their specific requirements.
Our findings show that EvoFlow's specialised genetic operators and updating mechanism substantially outperform current leading methods.
- Score: 0.36832029288386137
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The process of extracting valuable and novel insights from raw data involves
a series of complex steps. In the realm of Automated Machine Learning (AutoML),
a significant research focus is on automating aspects of this process,
specifically tasks like selecting algorithms and optimising their
hyper-parameters. A particularly challenging task in AutoML is automatic
workflow composition (AWC). AWC aims to identify the most effective sequence of
data preprocessing and ML algorithms, coupled with their best hyper-parameters,
for a specific dataset. However, existing AWC methods are limited in how many
algorithms they can combine within a workflow and in how they can combine them.
Addressing this gap, this paper introduces EvoFlow, a grammar-based
evolutionary approach for AWC. EvoFlow enhances the flexibility in designing
workflow structures, empowering practitioners to select algorithms that best
fit their specific requirements. EvoFlow stands out by integrating two
innovative features. First, it employs a suite of genetic operators, designed
specifically for AWC, to optimise both the structure of workflows and their
hyper-parameters. Second, it implements a novel updating mechanism that
enriches the variety of predictions made by different workflows. Promoting this
diversity helps prevent the algorithm from overfitting. With this aim, EvoFlow
builds an ensemble whose workflows differ in their misclassified instances.
To evaluate EvoFlow's effectiveness, we carried out empirical validation
using a set of classification benchmarks. We begin with an ablation study to
demonstrate the enhanced performance attributable to EvoFlow's unique
components. Then, we compare EvoFlow with other AWC approaches, encompassing
both evolutionary and non-evolutionary techniques. Our findings show that
EvoFlow's specialised genetic operators and updating mechanism substantially
outperform current leading methods[..]
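The abstract states that EvoFlow builds an ensemble whose workflows differ in their misclassified instances. The following is a minimal sketch of one way such diversity could be quantified. It is not EvoFlow's actual updating mechanism: the Jaccard distance between error sets, the function names, and the toy predictions are all assumptions made for illustration.

```python
from itertools import combinations


def error_set(y_true, y_pred):
    """Return the indices of the instances a workflow misclassifies."""
    return {i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p}


def error_diversity(errors_a, errors_b):
    """Jaccard distance between two error sets (assumed measure).

    1.0 means the two workflows share no mistakes; 0.0 means they make
    exactly the same mistakes (or neither makes any).
    """
    union = errors_a | errors_b
    if not union:
        return 0.0
    return 1.0 - len(errors_a & errors_b) / len(union)


def mean_pairwise_diversity(error_sets):
    """Average error diversity over all pairs of workflows."""
    pairs = list(combinations(error_sets, 2))
    if not pairs:
        return 0.0
    return sum(error_diversity(a, b) for a, b in pairs) / len(pairs)


# Toy data: true labels and the predictions of three hypothetical workflows.
y_true = [0, 1, 1, 0, 1, 0]
predictions = {
    "workflow_a": [0, 1, 1, 0, 0, 0],  # wrong on instance 4
    "workflow_b": [0, 1, 1, 0, 0, 1],  # wrong on instances 4 and 5
    "workflow_c": [1, 1, 1, 0, 1, 0],  # wrong on instance 0
}

errors = {name: error_set(y_true, pred) for name, pred in predictions.items()}
ensemble_diversity = mean_pairwise_diversity(list(errors.values()))

for name, errs in errors.items():
    print(name, sorted(errs))
print("mean pairwise error diversity:", round(ensemble_diversity, 3))
```

Under this assumed measure, an ensemble made of workflow_a and workflow_c would be preferred over one made of workflow_a and workflow_b, because the first pair never fails on the same instance.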
Related papers
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives.
We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
arXiv Detail & Related papers (2024-10-29T14:41:44Z)
- AFlow: Automating Agentic Workflow Generation [36.61172223528231]
Large language models (LLMs) have demonstrated remarkable potential in solving complex tasks across diverse domains.
We introduce AFlow, an automated framework that efficiently explores this space using Monte Carlo Tree Search.
Empirical evaluations across six benchmark datasets demonstrate AFlow's efficacy, yielding a 5.7% average improvement over state-of-the-art baselines.
arXiv Detail & Related papers (2024-10-14T17:40:40Z)
- Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures.
We also present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms.
We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z)
- Evolving machine learning workflows through interactive AutoML [0.36832029288386137]
We present our method, an interactive grammar-guided genetic programming (G3P) algorithm that allows users to prune the search space and focus on their regions of interest.
Our results confirm that the collaboration between our method and humans allows us to find high-performance workflows in terms of accuracy that require less tuning time than those found without human intervention.
arXiv Detail & Related papers (2024-02-28T17:34:21Z)
- AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging).
It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z)
- Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks [73.63892022944198]
We present a generic perception architecture named Uni-Perceiver.
It processes a variety of modalities and tasks with unified modeling and shared parameters.
Results show that our pre-trained model without any tuning can achieve reasonable performance even on novel tasks.
arXiv Detail & Related papers (2021-12-02T18:59:50Z)
- Improving RNA Secondary Structure Design using Deep Reinforcement Learning [69.63971634605797]
We propose a new benchmark of applying reinforcement learning to RNA sequence design, in which the objective function is defined to be the free energy in the sequence's secondary structure.
We report an ablation analysis of these algorithms, along with graphs showing each algorithm's performance across batches.
arXiv Detail & Related papers (2021-11-05T02:54:06Z)
- Automated Evolutionary Approach for the Design of Composite Machine Learning Pipelines [48.7576911714538]
The proposed approach aims to automate the design of composite machine learning pipelines.
It designs the pipelines with a customizable graph-based structure, analyzes the obtained results, and reproduces them.
The software implementation of this approach is presented as an open-source framework.
arXiv Detail & Related papers (2021-06-26T23:19:06Z)
- Evolution of Scikit-Learn Pipelines with Dynamic Structured Grammatical Evolution [1.5224436211478214]
This paper describes a novel grammar-based framework that adapts Dynamic Structured Grammatical Evolution (DSGE) to the evolution of Scikit-Learn classification pipelines.
The experimental results include comparing AutoML-DSGE to another grammar-based AutoML framework, Resilient Classification Pipeline Evolution (RECIPE).
arXiv Detail & Related papers (2020-04-01T09:31:34Z)