A Generative AI-driven Metadata Modelling Approach
- URL: http://arxiv.org/abs/2501.04008v1
- Date: Fri, 13 Dec 2024 09:26:04 GMT
- Title: A Generative AI-driven Metadata Modelling Approach
- Authors: Mayukh Bagchi,
- Abstract summary: This paper proposes a Generative AI-driven Human-Large Language Model (LLM) collaboration based metadata modelling approach to disentangle the entanglement inherent in each representation level leading to the generation of a conceptually disentangled metadata model.
- Score: 1.450405446885067
- License:
- Abstract: Since decades, the modelling of metadata has been core to the functioning of any academic library. Its importance has only enhanced with the increasing pervasiveness of Generative Artificial Intelligence (AI)-driven information activities and services which constitute a library's outreach. However, with the rising importance of metadata, there arose several outstanding problems with the process of designing a library metadata model impacting its reusability, crosswalk and interoperability with other metadata models. This paper posits that the above problems stem from an underlying thesis that there should only be a few core metadata models which would be necessary and sufficient for any information service using them, irrespective of the heterogeneity of intra-domain or inter-domain settings. To that end, this paper advances a contrary view of the above thesis and substantiates its argument in three key steps. First, it introduces a novel way of thinking about a library metadata model as an ontology-driven composition of five functionally interlinked representation levels from perception to its intensional definition via properties. Second, it introduces the representational manifoldness implicit in each of the five levels which cumulatively contributes to a conceptually entangled library metadata model. Finally, and most importantly, it proposes a Generative AI-driven Human-Large Language Model (LLM) collaboration based metadata modelling approach to disentangle the entanglement inherent in each representation level leading to the generation of a conceptually disentangled metadata model. Throughout the paper, the arguments are exemplified by motivating scenarios and examples from representative libraries handling cancer information.
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z) - A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap [50.079224604394]
We present a novel model-agnostic framework called textbfContext-textbfEnhanced textbfFeature textbfAment (CEFA)
CEFA consists of a feature alignment module and a context enhancement module.
Our method can serve as a plug-and-play module to improve the detection performance of HOI models on rare categories.
arXiv Detail & Related papers (2024-07-31T08:42:48Z) - Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z) - Has Your Pretrained Model Improved? A Multi-head Posterior Based
Approach [25.927323251675386]
We leverage the meta-features associated with each entity as a source of worldly knowledge and employ entity representations from the models.
We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models.
Our method's effectiveness is demonstrated across various domains, including models with relational datasets, large language models and image models.
arXiv Detail & Related papers (2024-01-02T17:08:26Z) - Zero-shot Composed Text-Image Retrieval [72.43790281036584]
We consider the problem of composed image retrieval (CIR)
It aims to train a model that can fuse multi-modal information, e.g., text and images, to accurately retrieve images that match the query, extending the user's expression ability.
arXiv Detail & Related papers (2023-06-12T17:56:01Z) - Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product
Retrieval [152.3504607706575]
This research aims to conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories.
We first contribute the Product1M datasets, and define two real practical instance-level retrieval tasks.
We exploit to train a more effective cross-modal model which is adaptively capable of incorporating key concept information from the multi-modal data.
arXiv Detail & Related papers (2022-06-17T15:40:45Z) - T-METASET: Task-Aware Generation of Metamaterial Datasets by
Diversity-Based Active Learning [14.668178146934588]
We propose t-METASET: an intelligent data acquisition framework for task-aware dataset generation.
We validate the proposed framework in three hypothetical deployment scenarios, which encompass general use, task-aware use, and tailorable use.
arXiv Detail & Related papers (2022-02-21T22:46:49Z) - Mapping the Internet: Modelling Entity Interactions in Complex
Heterogeneous Networks [0.0]
We propose a versatile, unified framework called HMill' for sample representation, model definition and training.
We show an extension of the universal approximation theorem to the set of all functions realized by models implemented in the framework.
We solve three different problems from the cybersecurity domain using the framework.
arXiv Detail & Related papers (2021-04-19T21:32:44Z) - Learning Abstract Task Representations [0.6690874707758511]
We propose a method to induce new abstract meta-features as latent variables in a deep neural network.
We demonstrate our methodology using a deep neural network as a feature extractor.
arXiv Detail & Related papers (2021-01-19T20:31:02Z) - Learning to Generalize Unseen Domains via Memory-based Multi-Source
Meta-Learning for Person Re-Identification [59.326456778057384]
We propose the Memory-based Multi-Source Meta-Learning framework to train a generalizable model for unseen domains.
We also present a meta batch normalization layer (MetaBN) to diversify meta-test features.
Experiments demonstrate that our M$3$L can effectively enhance the generalization ability of the model for unseen domains.
arXiv Detail & Related papers (2020-12-01T11:38:16Z) - Unsupervised Meta-Learning through Latent-Space Interpolation in
Generative Models [11.943374020641214]
We describe an approach that generates meta-tasks using generative models.
We find that the proposed approach, LAtent Space Interpolation Unsupervised Meta-learning (LASIUM), outperforms or is competitive with current unsupervised learning baselines.
arXiv Detail & Related papers (2020-06-18T02:10:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.