An Empirical Study of Challenges in Machine Learning Asset Management
- URL: http://arxiv.org/abs/2402.15990v2
- Date: Wed, 28 Feb 2024 05:58:18 GMT
- Title: An Empirical Study of Challenges in Machine Learning Asset Management
- Authors: Zhimin Zhao, Yihao Chen, Abdul Ali Bangash, Bram Adams, Ahmed E. Hassan
- Abstract summary: Despite existing research, a significant knowledge gap remains in operational challenges like model versioning, data traceability, and collaboration.
Our study aims to address this gap by analyzing 15,065 posts from developer forums and platforms.
We uncover 133 topics related to asset management challenges, grouped into 16 macro-topics, with software dependency, model deployment, and model training being the most discussed.
- Score: 15.07444988262748
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In machine learning (ML), efficient asset management, including ML models,
datasets, algorithms, and tools, is vital for resource optimization, consistent
performance, and a streamlined development lifecycle. This enables quicker
iterations, adaptability, reduced development-to-deployment time, and reliable
outputs. Despite existing research, a significant knowledge gap remains in
operational challenges like model versioning, data traceability, and
collaboration, which are crucial for the success of ML projects. Our study aims
to address this gap by analyzing 15,065 posts from developer forums and
platforms, employing a mixed-method approach to classify inquiries, extract
challenges using BERTopic, and identify solutions through open card sorting and
BERTopic clustering. We uncover 133 topics related to asset management
challenges, grouped into 16 macro-topics, with software dependency, model
deployment, and model training being the most discussed. We also find 79
solution topics, categorized under 18 macro-topics, highlighting software
dependency, feature development, and file management as key solutions. This
research underscores the need for further exploration of identified pain points
and the importance of collaborative efforts across academia, industry, and the
research community.
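The pipeline described above (classify inquiries, extract challenge topics with BERTopic, cluster solutions) maps onto the open-source bertopic package. The sketch below is an illustrative assumption of how such topic extraction might be wired up, not the authors' actual code: the public 20-newsgroups corpus stands in for the study's 15,065 forum posts, and min_topic_size is an arbitrary choice.

```python
# Minimal sketch of BERTopic-based challenge-topic extraction.
# Assumptions: the open-source `bertopic` package; a public corpus
# (20 newsgroups) standing in for the study's 15,065 forum posts.
from sklearn.datasets import fetch_20newsgroups
from bertopic import BERTopic

# Stand-in corpus; the study mined developer forums and platforms instead.
docs = fetch_20newsgroups(subset="all",
                          remove=("headers", "footers", "quotes")).data

# Fit the topic model; min_topic_size is an illustrative choice.
topic_model = BERTopic(min_topic_size=15, verbose=False)
topics, probs = topic_model.fit_transform(docs)

# Inspect the discovered topics; the study grouped its 133 challenge
# topics into 16 macro-topics (software dependency, model deployment, ...).
print(topic_model.get_topic_info().head(10))

# BERTopic can also merge fine-grained topics into coarser clusters,
# loosely analogous to the paper's macro-topic grouping step.
hierarchy = topic_model.hierarchical_topics(docs)
```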
Related papers
- BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games [44.16513620589459]
We introduce BALROG, a novel benchmark to assess the agentic capabilities of Large Language Models (LLMs) and Vision Language Models (VLMs)
Our benchmark incorporates existing reinforcement learning environments of varying difficulty, ranging from tasks that non-expert humans can solve in seconds to extremely challenging ones that may take years to master.
Our findings indicate that while current models achieve partial success in the easier games, they struggle significantly with more challenging tasks.
arXiv Detail & Related papers (2024-11-20T18:54:32Z)
- An Empirical Investigation on the Challenges in Scientific Workflow Systems Development [2.704899832646869]
This study examines interactions between developers and researchers on Stack Overflow (SO) and GitHub.
By analyzing issues, we identified 13 topics (e.g., Errors and Bug Fixing, Documentation, Dependencies) and found that data structures and operations is the most difficult topic.
We also found common topics between SO and GitHub, such as data structures and operations, task management, and workflow scheduling.
arXiv Detail & Related papers (2024-11-16T21:14:11Z)
- DiscoveryBench: Towards Data-Driven Discovery with Large Language Models [50.36636396660163]
We present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery.
Our benchmark contains 264 tasks collected across 6 diverse domains, such as sociology and engineering.
Our benchmark thus illustrates the challenges of autonomous data-driven discovery and serves as a valuable resource for the community to make progress.
arXiv Detail & Related papers (2024-07-01T18:58:22Z)
- Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z)
- Exploring Data Management Challenges and Solutions in Agile Software Development: A Literature Review and Practitioner Survey [4.45543024542181]
Managing data related to a software product and its development poses significant challenges for software projects and agile development teams.
Challenges include integrating data from diverse sources and ensuring data quality in light of continuous change and adaptation.
arXiv Detail & Related papers (2024-02-01T10:07:12Z)
- The Efficiency Spectrum of Large Language Models: An Algorithmic Survey [54.19942426544731]
The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains.
This paper examines the multi-faceted dimensions of efficiency essential for the end-to-end algorithmic development of LLMs.
arXiv Detail & Related papers (2023-12-01T16:00:25Z)
- GLUECons: A Generic Benchmark for Learning Under Constraints [102.78051169725455]
In this work, we create a benchmark that is a collection of nine tasks in the domains of natural language processing and computer vision.
We model external knowledge as constraints, specify the sources of the constraints for each task, and implement various models that use these constraints.
arXiv Detail & Related papers (2023-02-16T16:45:36Z)
- NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research [96.53307645791179]
We introduce the Never-Ending VIsual-classification Stream (NEVIS'22), a benchmark consisting of a stream of over 100 visual classification tasks.
Despite being limited to classification, the resulting stream has a rich diversity of tasks from OCR, to texture analysis, scene recognition, and so forth.
Overall, NEVIS'22 poses an unprecedented challenge for current sequential learning approaches due to the scale and diversity of tasks.
arXiv Detail & Related papers (2022-11-15T18:57:46Z)
- What Makes Good Contrastive Learning on Small-Scale Wearable-based Tasks? [59.51457877578138]
We study contrastive learning on the wearable-based activity recognition task.
This paper presents CL-HAR, an open-source PyTorch library that can serve as a practical tool for researchers.
arXiv Detail & Related papers (2022-02-12T06:10:15Z)
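As a generic illustration of the contrastive setup the CL-HAR entry studies (not the CL-HAR library's actual API, which this list does not describe), a SimCLR-style NT-Xent loss over two augmented views of wearable sensor windows might look like the following; the tensor shapes, jitter augmentation, and toy encoder are all assumptions.

```python
# SimCLR-style NT-Xent contrastive loss for wearable sensor windows.
# Generic illustration of the technique, not CL-HAR's actual API;
# shapes, augmentation, and encoder are illustrative assumptions.
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor,
            temperature: float = 0.5) -> torch.Tensor:
    """z1, z2: (batch, dim) embeddings of two views of the same windows."""
    batch = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2B, D), unit rows
    sim = z @ z.T / temperature                         # pairwise cosine sims
    mask = torch.eye(2 * batch, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))          # drop self-similarity
    # The positive for row i is its other view, at index (i + B) mod 2B.
    targets = (torch.arange(2 * batch, device=z.device) + batch) % (2 * batch)
    return F.cross_entropy(sim, targets)

# Toy usage: 3-axis accelerometer windows of 128 samples; jitter augmentation.
x = torch.randn(32, 3, 128)                             # (batch, channels, time)
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 128, 64))
z1 = encoder(x + 0.01 * torch.randn_like(x))            # augmented view 1
z2 = encoder(x + 0.01 * torch.randn_like(x))            # augmented view 2
loss = nt_xent(z1, z2)
loss.backward()
```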
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.