Related papers: Compendium Manager: a tool for coordination of workflow management instances for bulk data processing in Python

Related papers

What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensional Benchmark for Essential Virtual Agent Capabilities [56.646832992178105]
We introduce OmniBench, a cross-platform, graph-based benchmark with an automated pipeline for synthesizing tasks of controllable complexity.<n>We present OmniEval, a multidimensional evaluation framework that includes subtask-level evaluation, graph-based metrics, and comprehensive tests across 10 capabilities.<n>Our dataset contains 36k graph-structured tasks across 20 scenarios, achieving a 91% human acceptance rate.
arXiv Detail & Related papers (2025-06-10T15:59:38Z)
Repo2Run: Automated Building Executable Environment for Code Repository at Scale [8.795746370609855]
We introduce Repo2Run, an agent aiming at automating the building of executable test environments for any repositories at scale.<n>Repo2Run iteratively builds the Docker image, runs unit tests based on the feedback of the building, and synthesizes the Dockerfile.<n>The resulting Dockerfile can then be used to create Docker container environments for running code and tests.
arXiv Detail & Related papers (2025-02-19T12:51:35Z)
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? [73.81908518992161]
We introduce Spider2-V, the first multimodal agent benchmark focusing on professional data science and engineering. Spider2-V features real-world tasks in authentic computer environments and incorporating 20 enterprise-level professional applications. These tasks evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems.
arXiv Detail & Related papers (2024-07-15T17:54:37Z)
Instrumentation and Analysis of Native ML Pipelines via Logical Query Plans [3.2362171533623054]
We envision highly-automated software platforms to assist data scientists with developing, validating, monitoring, and analysing their Machine Learning pipelines. We extract "logical query plans" from ML pipeline code relying on popular libraries. Based on these plans, we automatically infer pipeline semantics and instrument and rewrite the ML pipelines to enable diverse use cases without requiring data scientists to manually annotate or rewrite their code.
arXiv Detail & Related papers (2024-07-10T11:35:02Z)
A Survey of Pipeline Tools for Data Engineering [1.4856472820492366]
A variety of pipeline tools are available for use in data engineering. This survey examines the broad categories and examples of pipeline tools based on their design and data engineering intentions. Case studies are presented indicating the usage of pipeline tools for data engineering.
arXiv Detail & Related papers (2024-06-12T15:41:06Z)
TaskBench: Benchmarking Large Language Models for Task Automation [82.2932794189585]
We introduce TaskBench, a framework to evaluate the capability of large language models (LLMs) in task automation. Specifically, task decomposition, tool selection, and parameter prediction are assessed. Our approach combines automated construction with rigorous human verification, ensuring high consistency with human evaluation.
arXiv Detail & Related papers (2023-11-30T18:02:44Z)
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs [104.37772295581088]
Open-source large language models (LLMs), e.g., LLaMA, remain significantly limited in tool-use capabilities. We introduce ToolLLM, a general tool-usetuning encompassing data construction, model training, and evaluation. We first present ToolBench, an instruction-tuning framework for tool use, which is constructed automatically using ChatGPT.
arXiv Detail & Related papers (2023-07-31T15:56:53Z)
Interactive Visualization of Protein RINs using NetworKit in the Cloud [57.780880387925954]
In this paper, we consider an example from protein dynamics, specifically residue interaction networks (RINs) We use NetworKit to build a cloud-based environment that enables domain scientists to run their visualization and analysis on large compute servers. To demonstrate the versatility of this approach, we use it to build a custom Jupyter-based widget for RIN visualization.
arXiv Detail & Related papers (2022-03-02T17:41:45Z)
SapientML: Synthesizing Machine Learning Pipelines by Learning from Human-Written Solutions [28.718446733713183]
We propose an AutoML SapientML that can learn from a corpus of existing datasets and their human-written pipelines. We have created a training corpus of 1094 pipelines spanning 170 datasets, and evaluated SapientML on a set of 41 benchmark datasets. Our evaluation shows that SapientML produces the best or comparable accuracy on 27 of the benchmarks while the second best tool fails to even produce a pipeline on 9 of the instances.
arXiv Detail & Related papers (2022-02-18T20:45:47Z)
Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans [103.92680099373567]
This paper introduces a pipeline to parametrically sample and render multi-task vision datasets from comprehensive 3D scans from the real world. Changing the sampling parameters allows one to "steer" the generated datasets to emphasize specific information. Common architectures trained on a generated starter dataset reached state-of-the-art performance on multiple common vision tasks and benchmarks.
arXiv Detail & Related papers (2021-10-11T04:21:46Z)
AutoPipeline: Synthesize Data Pipelines By-Target Using Reinforcement Learning and Search [19.53147565613595]
We propose to automate complex data pipelines with both string transformations and table-manipulation operators. We propose a novel "by-target" paradigm that allows users to easily specify the desired pipeline. We develop an Auto-Pipeline system that learns to synthesize pipelines using reinforcement learning and search.
arXiv Detail & Related papers (2021-06-25T19:44:01Z)
MLCask: Efficient Management of Component Evolution in Collaborative Data Analytics Pipelines [29.999324319722508]
We address two main challenges that arise during the deployment of machine learning pipelines, and address them with the design of versioning for an end-to-end analytics system MLCask. We define and accelerate the metric-driven merge operation by pruning the pipeline search tree using reusable history records and pipeline compatibility information. The effectiveness of MLCask is evaluated through an extensive study over several real-world deployment cases.
arXiv Detail & Related papers (2020-10-17T13:34:48Z)
TODS: An Automated Time Series Outlier Detection System [70.88663649631857]
TODS is a highly modular system that supports easy pipeline construction.<n>Tods supports 70 primitives, including data processing, time series processing, feature analysis, detection algorithms, and a reinforcement module.
arXiv Detail & Related papers (2020-09-18T15:36:43Z)
PyODDS: An End-to-end Outlier Detection System with Automated Machine Learning [55.32009000204512]
We present PyODDS, an automated end-to-end Python system for Outlier Detection with Database Support. Specifically, we define the search space in the outlier detection pipeline, and produce a search strategy within the given search space. It also provides unified interfaces and visualizations for users with or without data science or machine learning background.
arXiv Detail & Related papers (2020-03-12T03:30:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.