Large Language Models to the Rescue: Reducing the Complexity in
Scientific Workflow Development Using ChatGPT
- URL: http://arxiv.org/abs/2311.01825v2
- Date: Mon, 6 Nov 2023 11:43:33 GMT
- Title: Large Language Models to the Rescue: Reducing the Complexity in
Scientific Workflow Development Using ChatGPT
- Authors: Mario Sänger, Ninon De Mecquenem, Katarzyna Ewa Lewińska, Vasilis
Bountris, Fabian Lehmann, Ulf Leser, Thomas Kosch
- Abstract summary: Scientific workflow systems are increasingly popular for expressing and executing complex data analysis pipelines over large datasets.
However, implementing workflows is difficult due to the involvement of many black-box tools and the deep infrastructure stack necessary for their execution.
We investigate the efficiency of Large Language Models, specifically ChatGPT, to support users when dealing with scientific workflows.
- Score: 11.410608233274942
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scientific workflow systems are increasingly popular for expressing and
executing complex data analysis pipelines over large datasets, as they offer
reproducibility, dependability, and scalability of analyses by automatic
parallelization on large compute clusters. However, implementing workflows is
difficult due to the involvement of many black-box tools and the deep
infrastructure stack necessary for their execution. Simultaneously,
user-supporting tools are rare, and the number of available examples is much
lower than in classical programming languages. To address these challenges, we
investigate the efficiency of Large Language Models (LLMs), specifically
ChatGPT, to support users when dealing with scientific workflows. We performed
three user studies in two scientific domains to evaluate ChatGPT for
comprehending, adapting, and extending workflows. Our results indicate that
LLMs efficiently interpret workflows but achieve lower performance for
exchanging components or purposeful workflow extensions. We characterize their
limitations in these challenging scenarios and suggest future research
directions.
Related papers
- Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks [0.8425561594225592]
This study introduces a novel framework for training smaller language models in function calling.
It focuses on specific logical and mathematical reasoning tasks.
The approach aims to improve the performance of small-scale models on these tasks using function calling.
arXiv Detail & Related papers (2024-10-24T16:27:35Z)
- Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures.
We also present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms.
We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z)
- Mixing It Up: The Cocktail Effect of Multi-Task Fine-Tuning on LLM Performance -- A Case Study in Finance [0.32985979395737774]
We study the application of large language models (LLMs) in domain-specific contexts, including finance.
We find that fine-tuning exclusively on the target task is not always the most effective strategy.
Instead, multi-task fine-tuning can significantly enhance performance.
arXiv Detail & Related papers (2024-10-01T22:35:56Z)
- FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications.
FactorLLM achieves comparable performance to the source model, securing up to 85% of model performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z)
- Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? [54.667202878390526]
Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases.
We introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning.
Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks.
arXiv Detail & Related papers (2024-06-19T00:28:58Z)
- Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
Real-world systems often incorporate a wide array of tools, making it impractical to input all tools into Large Language Models.
Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions.
We propose a novel model-agnostic COllaborative Learning-based Tool Retrieval approach, COLT, which captures not only the semantic similarities between user queries and tool descriptions but also takes into account the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z)
- Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z)
- Reusability Challenges of Scientific Workflows: A Case Study for Galaxy [56.78572674167333]
This study examined the reusability of existing workflows and exposed several challenges.
The challenges preventing reusability include tool upgrading, tool support, design flaws, incomplete workflows, failure to load a workflow, etc.
arXiv Detail & Related papers (2023-09-13T20:17:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.