FAIRification of MLC data
- URL: http://arxiv.org/abs/2211.12757v1
- Date: Wed, 23 Nov 2022 07:53:17 GMT
- Title: FAIRification of MLC data
- Authors: Ana Kostovska, Jasmin Bogatinovski, Andrej Treven, Sašo
Džeroski, Dragi Kocev, Panče Panov
- Abstract summary: We introduce an online catalogue of MLC datasets that follow the FAIR (Findable, Accessible, Interoperable, and Reusable) and TRUST (Transparency, Responsibility, User focus, Sustainability, and Technology) principles.
The catalogue extensively describes many MLC datasets with comprehensible meta-features, MLC-specific semantic descriptions, and different data provenance information.
- Score: 5.803041363561935
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The multi-label classification (MLC) task has been receiving
increasing interest from the machine learning (ML) community, as evidenced by the
growing number of papers and methods appearing in the literature. Hence, ensuring
proper, correct, robust, and trustworthy benchmarking is of utmost importance
for the further development of the field. We believe that this can be achieved
by adhering to the recently emerged data management standards, such as the FAIR
(Findable, Accessible, Interoperable, and Reusable) and TRUST (Transparency,
Responsibility, User focus, Sustainability, and Technology) principles. To
FAIRify the MLC datasets, we introduce an ontology-based online catalogue of
MLC datasets that follow these principles. The catalogue extensively describes
many MLC datasets with comprehensible meta-features, MLC-specific semantic
descriptions, and different data provenance information. The MLC data catalogue
is extensively described in our recent publication in Nature Scientific
Reports, Kostovska & Bogatinovski et al., and available at:
http://semantichub.ijs.si/MLCdatasets. In addition, we provide an
ontology-based system for easy access and querying of performance/benchmark
data obtained from a comprehensive MLC benchmark study. The system is available
at: http://semantichub.ijs.si/MLCbenchmark.
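The catalogue's core idea, describing MLC datasets with structured meta-features so they can be found and compared, can be sketched in plain Python. This is an illustrative sketch only: the field names are hypothetical and do not reflect the catalogue's actual ontology or schema, and the dataset figures are approximate values for well-known MLC benchmarks, included purely for illustration.

```python
# Minimal sketch of meta-feature-based dataset lookup, in the spirit of an
# MLC data catalogue. Field names are hypothetical; the numeric values are
# approximate figures for well-known MLC benchmarks, used only as examples.
from dataclasses import dataclass


@dataclass
class MLCDataset:
    name: str
    n_instances: int
    n_labels: int
    label_cardinality: float  # mean number of labels per instance


CATALOGUE = [
    MLCDataset("emotions", 593, 6, 1.87),
    MLCDataset("scene", 2407, 6, 1.07),
    MLCDataset("yeast", 2417, 14, 4.24),
]


def find_datasets(catalogue, min_labels=0, max_cardinality=float("inf")):
    """Return datasets whose meta-features satisfy the given constraints."""
    return [
        d for d in catalogue
        if d.n_labels >= min_labels and d.label_cardinality <= max_cardinality
    ]


# Example: look up datasets with at least 10 labels.
hits = find_datasets(CATALOGUE, min_labels=10)
print([d.name for d in hits])  # → ['yeast']
```

In the actual catalogue such queries are answered over an ontology-backed store rather than in-memory Python objects, but the principle is the same: rich, machine-readable meta-features make datasets findable by their properties rather than by name alone.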
Related papers
- Exploring LLM Capabilities in Extracting DCAT-Compatible Metadata for Data Cataloging [0.1424853531377145]
Data catalogs can support and accelerate data exploration by using metadata to answer user queries.
This study investigates whether LLMs can automate metadata maintenance of text-based data and generate high-quality DCAT-compatible metadata.
Our results show that LLMs can generate metadata comparable to human-created content, particularly on tasks that require advanced semantic understanding.
arXiv Detail & Related papers (2025-07-04T10:49:37Z)
- MOLE: Metadata Extraction and Validation in Scientific Papers Using LLMs [54.5729817345543]
MOLE is a framework that automatically extracts metadata attributes from scientific papers covering datasets of languages other than Arabic.
Our methodology processes entire documents across multiple input formats and incorporates robust validation mechanisms for consistent output.
arXiv Detail & Related papers (2025-05-26T10:31:26Z)
- DocMMIR: A Framework for Document Multi-modal Information Retrieval [21.919132888183622]
We introduce DocMMIR, a novel multi-modal document retrieval framework.
We construct a large-scale cross-domain multimodal benchmark, comprising 450K samples.
Results show a +31% improvement in MRR@10 compared to the zero-shot baseline.
arXiv Detail & Related papers (2025-05-25T20:58:58Z)
- Leveraging Retrieval Augmented Generative LLMs For Automated Metadata Description Generation to Enhance Data Catalogs [1.1957520154275776]
Data catalogs serve as repositories for organizing and accessing a diverse collection of data assets.
Many data catalogs within organizations suffer from limited searchability due to inadequate metadata like asset descriptions.
This paper explores the challenges associated with metadata creation and proposes a unique prompt enrichment idea of leveraging existing metadata content.
arXiv Detail & Related papers (2025-03-12T02:33:33Z)
- LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs -- No Silver Bullet for LC or RAG Routing [70.35888047551643]
We present LaRA, a novel benchmark specifically designed to rigorously compare RAG and LC LLMs.
LaRA encompasses 2326 test cases across four practical QA task categories and three types of naturally occurring long texts.
We find that the optimal choice between RAG and LC depends on a complex interplay of factors, including the model's parameter size, long-text capabilities, context length, task type, and the characteristics of the retrieved chunks.
arXiv Detail & Related papers (2025-02-14T08:04:22Z)
- Recent Advances of Multimodal Continual Learning: A Comprehensive Survey [64.82070119713207]
We present the first comprehensive survey on multimodal continual learning methods.
We categorize existing MMCL methods into four categories, i.e., regularization-based, architecture-based, replay-based, and prompt-based.
We discuss several promising future directions for investigation and development.
arXiv Detail & Related papers (2024-10-07T13:10:40Z)
- LLMJudge: LLMs for Relevance Judgments [37.103230004631996]
The challenge is organized as part of the LLM4Eval workshop at SIGIR 2024.
Recent studies have shown that LLMs can generate reliable relevance judgments for search systems.
The collected data will be released as a package to support automatic relevance judgment research.
arXiv Detail & Related papers (2024-08-09T23:15:41Z)
- DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems [99.17123445211115]
We introduce DocBench, a benchmark to evaluate large language model (LLM)-based document reading systems.
Our benchmark involves the recruitment of human annotators and the generation of synthetic questions.
It includes 229 real documents and 1,102 questions, spanning across five different domains and four major types of questions.
arXiv Detail & Related papers (2024-07-15T13:17:42Z)
- DCA-Bench: A Benchmark for Dataset Curation Agents [9.60250892491588]
We propose a dataset curation agent benchmark, DCA-Bench, to measure large language models' capability of detecting hidden dataset quality issues.
Specifically, we collect diverse real-world dataset quality issues from eight open dataset platforms as a testbed.
The proposed benchmark can also serve as a testbed for measuring the capability of LLMs in problem discovery rather than just problem-solving.
arXiv Detail & Related papers (2024-06-11T14:02:23Z)
- Zero-Shot Topic Classification of Column Headers: Leveraging LLMs for Metadata Enrichment [0.0]
We propose a method to support metadata enrichment using topic annotations generated by three Large Language Models (LLMs): ChatGPT-3.5, Google Bard, and Google Gemini.
We evaluate the impact of contextual information (i.e., dataset description) on the classification outcomes.
arXiv Detail & Related papers (2024-03-01T10:01:36Z)
- MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization [86.61052121715689]
MatPlotAgent is a model-agnostic framework designed to automate scientific data visualization tasks.
MatPlotBench is a high-quality benchmark consisting of 100 human-verified test cases.
arXiv Detail & Related papers (2024-02-18T04:28:28Z)
- Utilising a Large Language Model to Annotate Subject Metadata: A Case
Study in an Australian National Research Data Catalogue [18.325675189960833]
In support of open and reproducible research, there has been a rapidly increasing number of datasets made available for research.
As the availability of datasets increases, it becomes more important to have quality metadata for discovering and reusing them.
This paper proposes to leverage large language models (LLMs) for cost-effective annotation of subject metadata through LLM-based in-context learning.
arXiv Detail & Related papers (2023-10-17T14:52:33Z)
- From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning [52.257422715393574]
We introduce a self-guided methodology for Large Language Models (LLMs) to autonomously discern and select cherry samples from open-source datasets.
Our key innovation, the Instruction-Following Difficulty (IFD) metric, emerges as a pivotal metric to identify discrepancies between a model's expected responses and its intrinsic generation capability.
arXiv Detail & Related papers (2023-08-23T09:45:29Z)
- MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models [73.86954509967416]
Multimodal Large Language Model (MLLM) relies on the powerful LLM to perform multimodal tasks.
This paper presents the first comprehensive MLLM Evaluation benchmark MME.
It measures both perception and cognition abilities on a total of 14 subtasks.
arXiv Detail & Related papers (2023-06-23T09:22:36Z)
- Synergistic Interplay between Search and Large Language Models for
Information Retrieval [141.18083677333848]
InteR allows RMs to expand knowledge in queries using LLM-generated knowledge collections.
InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z)
- Explaining the Performance of Multi-label Classification Methods with
Data Set Properties [1.1278903078792917]
We present a comprehensive meta-learning study of data sets and methods for multi-label classification (MLC).
Here, we analyze 40 MLC data sets by using 50 meta features describing different properties of the data.
The most prominent meta features that describe the space of MLC data sets are the ones assessing different aspects of the label space.
arXiv Detail & Related papers (2021-06-28T11:00:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents (including all of the above) and is not responsible for any consequences arising from its use.