Metadata Representations for Queryable ML Model Zoos
- URL: http://arxiv.org/abs/2207.09315v1
- Date: Tue, 19 Jul 2022 15:04:14 GMT
- Title: Metadata Representations for Queryable ML Model Zoos
- Authors: Ziyu Li, Rihan Hai, Alessandro Bozzon and Asterios Katsifodimos
- Abstract summary: Machine learning (ML) practitioners and organizations are building model zoos of pre-trained models, containing metadata describing properties of the models.
The metadata is currently not standardized; its expressivity is limited; and there is no way to store and query it.
In this paper, we advocate for standardized ML model metadata representation and management, proposing a toolkit to support practitioners in managing and querying that metadata.
- Score: 73.24799582702326
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning (ML) practitioners and organizations are building model zoos
of pre-trained models, containing metadata describing properties of the ML
models and datasets that are useful for reporting, auditing, reproducibility,
and interpretability purposes. The metadata is currently not standardized; its
expressivity is limited; and there is no interoperable way to store and query
it. Consequently, model search, reuse, comparison, and composition are
hindered. In this paper, we advocate for standardized ML model metadata
representation and management, proposing a toolkit to support
practitioners in managing and querying that metadata.
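To make the idea concrete, here is a minimal sketch of a queryable model-zoo metadata store. The schema, model names, and field values are hypothetical illustrations, not the paper's actual design; SQLite stands in for whatever storage backend the proposed toolkit uses.

```python
import sqlite3

# Hypothetical model-zoo metadata schema; fields are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE model_metadata (
        name TEXT, task TEXT, dataset TEXT,
        accuracy REAL, license TEXT
    )
""")
conn.executemany(
    "INSERT INTO model_metadata VALUES (?, ?, ?, ?, ?)",
    [
        ("resnet50-a", "image-classification", "imagenet", 0.76, "mit"),
        ("bert-base-x", "text-classification", "sst2", 0.92, "apache-2.0"),
        ("vit-b16-y", "image-classification", "imagenet", 0.81, "apache-2.0"),
    ],
)

# Model search: find image classifiers above an accuracy threshold,
# ranked for reuse and comparison.
rows = conn.execute(
    """SELECT name, accuracy FROM model_metadata
       WHERE task = 'image-classification' AND accuracy > 0.75
       ORDER BY accuracy DESC"""
).fetchall()
print(rows)  # -> [('vit-b16-y', 0.81), ('resnet50-a', 0.76)]
```

With metadata in a standard, structured form, search queries like the one above replace manual inspection of model cards.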
Related papers
- Augmented Knowledge Graph Querying leveraging LLMs [2.5311562666866494]
We introduce SparqLLM, a framework that enhances the querying of Knowledge Graphs (KGs).
SparqLLM executes the Extract, Transform, and Load (ETL) pipeline to construct KGs from raw data.
It also features a natural language interface powered by Large Language Models (LLMs) to enable automatic SPARQL query generation.
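The natural-language-to-SPARQL step can be illustrated with a toy sketch. Note this is not SparqLLM's actual implementation, which prompts an LLM; here a fixed query template stands in for the LLM call, and the ontology terms are invented for illustration.

```python
# Illustrative stand-in for LLM-based SPARQL generation: a fixed
# template is filled from the user's question. A real system would
# prompt an LLM with the question and the KG schema instead.
SPARQL_TEMPLATE = """SELECT ?model WHERE {{
  ?model a :MLModel ;
         :trainedOn :{dataset} .
}}"""

def question_to_sparql(dataset: str) -> str:
    # Toy mapping: the extracted dataset name fills the template slot.
    return SPARQL_TEMPLATE.format(dataset=dataset)

query = question_to_sparql("ImageNet")
print(query)
```

The value of the LLM in SparqLLM is precisely that it removes the need for hand-written templates like this one.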
arXiv Detail & Related papers (2025-02-03T12:18:39Z)
- Harmonizing Metadata of Language Resources for Enhanced Querying and Accessibility [0.0]
This paper addresses the harmonization of metadata from diverse repositories of language resources (LRs).
Our methodology supports text-based search, faceted browsing, and advanced SPARQL queries through Linghub, a newly developed portal.
The study highlights significant metadata issues and advocates for adherence to open vocabularies and standards to enhance metadata harmonization.
arXiv Detail & Related papers (2025-01-09T22:48:43Z)
- Towards Agentic Schema Refinement [3.7173623393215287]
We propose a semantic layer in-between the database and the user as a set of small and easy-to-interpret database views.
Our approach paves the way for LLM-powered exploration of unwieldy databases.
arXiv Detail & Related papers (2024-11-25T19:57:16Z)
- Matchmaker: Self-Improving Large Language Model Programs for Schema Matching [60.23571456538149]
We propose a compositional language model program for schema matching, comprised of candidate generation, refinement and confidence scoring.
Matchmaker self-improves in a zero-shot manner without the need for labeled demonstrations.
Empirically, we demonstrate on real-world medical schema matching benchmarks that Matchmaker outperforms previous ML-based approaches.
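The candidate-generation, refinement, and confidence-scoring composition can be sketched as a pipeline of small functions. This is a hypothetical illustration, not Matchmaker's implementation: simple token-overlap similarity stands in for the LLM calls the paper describes, and all column names are invented.

```python
# Compositional schema-matching sketch in the spirit of Matchmaker:
# candidate generation -> refinement -> confidence scoring.

def similarity(a: str, b: str) -> float:
    # Jaccard overlap of underscore-separated name tokens
    # (a crude stand-in for an LLM's judgment).
    ta, tb = set(a.split("_")), set(b.split("_"))
    return len(ta & tb) / len(ta | tb)

def generate_candidates(source_col, target_cols, k=3):
    # Propose the top-k target columns by name similarity.
    scored = [(t, similarity(source_col, t)) for t in target_cols]
    return sorted(scored, key=lambda x: x[1], reverse=True)[:k]

def refine(candidates, min_score=0.3):
    # Filter out implausible candidates.
    return [(t, s) for t, s in candidates if s >= min_score]

def score_confidence(candidates):
    # Return the best surviving match with its confidence.
    return max(candidates, key=lambda x: x[1]) if candidates else (None, 0.0)

targets = ["patient_id", "birth_date", "diagnosis_code"]
best, conf = score_confidence(refine(generate_candidates("date_of_birth", targets)))
print(best, round(conf, 2))  # -> birth_date 0.67
```

Composing the stages this way lets each one be improved (or self-improved, as in Matchmaker) independently.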
arXiv Detail & Related papers (2024-10-31T16:34:03Z)
- GSAP-NER: A Novel Task, Corpus, and Baseline for Scholarly Entity Extraction Focused on Machine Learning Models and Datasets [3.9169112083667073]
In academic writing, references to machine learning models and datasets are fundamental components.
Existing ground truth datasets do not treat fine-grained types like ML model and model architecture as separate entity types.
We release a corpus of 100 manually annotated full-text scientific publications and a first baseline model for 10 entity types centered around ML models and datasets.
arXiv Detail & Related papers (2023-11-16T12:43:02Z)
- Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning [79.53130089003986]
Large Language Models (LLMs) have become a feasible solution for handling tasks in various domains.
In this paper, we describe how to fine-tune an LLM that can be privately deployed for content moderation.
arXiv Detail & Related papers (2023-10-05T09:09:44Z)
- Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models [59.89454513692417]
Tabular data is often hidden in text, particularly in medical diagnostic reports.
We propose a novel, simple, and effective methodology for extracting structured tabular data from textual medical reports, called TEMED-LLM.
We demonstrate that our approach significantly outperforms state-of-the-art text classification models in medical diagnostics.
arXiv Detail & Related papers (2023-06-08T09:12:28Z)
- DAC-MR: Data Augmentation Consistency Based Meta-Regularization for Meta-Learning [55.733193075728096]
We propose a meta-knowledge informed meta-learning (MKIML) framework to improve meta-learning.
We preliminarily integrate meta-knowledge into meta-objective via using an appropriate meta-regularization (MR) objective.
The proposed DAC-MR is expected to learn well-performing meta-models from training tasks with noisy, sparse, or unavailable meta-data.
arXiv Detail & Related papers (2023-05-13T11:01:47Z)
- Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation [87.98063273826702]
We propose a memory imitation meta-learning (MemIML) method that enhances the model's reliance on support sets for task adaptation.
A theoretical analysis is provided to prove the effectiveness of our method.
arXiv Detail & Related papers (2022-03-22T12:41:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.