Provenance Analysis of Archaeological Artifacts via Multimodal RAG Systems
- URL: http://arxiv.org/abs/2509.20769v1
- Date: Thu, 25 Sep 2025 05:52:13 GMT
- Title: Provenance Analysis of Archaeological Artifacts via Multimodal RAG Systems
- Authors: Tuo Zhang, Yuechun Sun, Ruiliang Liu
- Abstract summary: We present a retrieval-augmented generation (RAG)-based system for provenance analysis of archaeological artifacts. The system constructs a dual-modal knowledge base from reference texts and images, enabling raw visual, edge-enhanced, and semantic retrieval. We evaluate the system on a set of Eastern Eurasian Bronze Age artifacts from the British Museum.
- Score: 10.02915777208789
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we present a retrieval-augmented generation (RAG)-based system for provenance analysis of archaeological artifacts, designed to support expert reasoning by integrating multimodal retrieval and large vision-language models (VLMs). The system constructs a dual-modal knowledge base from reference texts and images, enabling raw visual, edge-enhanced, and semantic retrieval to identify stylistically similar objects. Retrieved candidates are synthesized by the VLM to generate structured inferences, including chronological, geographical, and cultural attributions, alongside interpretive justifications. We evaluate the system on a set of Eastern Eurasian Bronze Age artifacts from the British Museum. Expert evaluation demonstrates that the system produces meaningful and interpretable outputs, offering scholars concrete starting points for analysis and significantly alleviating the cognitive burden of navigating vast comparative corpora.
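The abstract's retrieval step — scoring reference objects against a query artifact across raw-visual, edge-enhanced, and semantic (text) modalities — can be sketched as a weighted similarity search. This is a minimal illustration, not the paper's implementation: the embeddings, object names, and modality weights below are invented placeholders standing in for real model outputs (e.g., image and text encoder features).

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

# Toy dual-modal knowledge base: each reference object carries a raw-visual
# embedding, an edge-map embedding, and a text (semantic) embedding.
# All values are illustrative placeholders.
KNOWLEDGE_BASE = {
    "ordos_dagger": {
        "visual": [0.9, 0.1, 0.2], "edge": [0.8, 0.2, 0.1], "text": [0.7, 0.3, 0.1],
    },
    "karasuk_knife": {
        "visual": [0.2, 0.9, 0.1], "edge": [0.1, 0.8, 0.3], "text": [0.2, 0.7, 0.4],
    },
}

def retrieve(query, weights=None, k=1):
    """Rank reference objects by a weighted sum of per-modality similarities."""
    if weights is None:
        weights = {"visual": 0.4, "edge": 0.3, "text": 0.3}  # assumed weights
    scored = []
    for name, emb in KNOWLEDGE_BASE.items():
        score = sum(w * cosine(query[m], emb[m]) for m, w in weights.items())
        scored.append((score, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]
```

In the full system, the top-k candidates returned by such a retriever would then be passed to the VLM, which synthesizes them into structured chronological, geographical, and cultural attributions.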
Related papers
- Retrieval Augmented Generation of Literature-derived Polymer Knowledge: The Example of a Biodegradable Polymer Expert System [4.222675210976564]
Polymer literature contains a large and growing body of experimental knowledge. Much of it is buried in unstructured text and inconsistent terminology. Existing tools typically extract narrow, study-specific facts in isolation.
arXiv Detail & Related papers (2026-02-18T17:46:09Z) - Explain Before You Answer: A Survey on Compositional Visual Reasoning [74.27548620675748]
Compositional visual reasoning has emerged as a key research frontier in multimodal AI. This survey systematically reviews 260+ papers from top venues (CVPR, ICCV, NeurIPS, ICML, ACL, etc.). We then catalog 60+ benchmarks and corresponding metrics that probe compositional visual reasoning along dimensions such as grounding accuracy, chain-of-thought faithfulness, and high-resolution perception.
arXiv Detail & Related papers (2025-08-24T11:01:51Z) - Product of Experts for Visual Generation [60.91134809173301]
We propose a Product of Experts (PoE) framework that performs inference-time knowledge composition from heterogeneous models. Our framework shows practical benefits in image and video synthesis tasks, yielding better controllability than monolithic methods.
arXiv Detail & Related papers (2025-06-10T15:21:14Z) - Federated Retrieval-Augmented Generation: A Systematic Mapping Study [11.931695766266879]
Federated Retrieval-Augmented Generation (Federated RAG) combines Federated Learning (FL) with Retrieval-Augmented Generation (RAG). RAG improves the factual accuracy of language models by grounding outputs in external knowledge. This paper presents the first systematic mapping study of Federated RAG, covering literature published between 2020 and 2025.
arXiv Detail & Related papers (2025-05-24T23:45:12Z) - Advancing AI Research Assistants with Expert-Involved Learning [84.30323604785646]
Large language models (LLMs) and large multimodal models (LMMs) promise to accelerate biomedical discovery, yet their reliability remains unclear. We introduce ARIEL (AI Research Assistant for Expert-in-the-Loop Learning), an open-source evaluation and optimization framework. We find that state-of-the-art models generate fluent but incomplete summaries, whereas LMMs struggle with detailed visual reasoning.
arXiv Detail & Related papers (2025-05-03T14:21:48Z) - Improving Multilingual Retrieval-Augmented Language Models through Dialectic Reasoning Argumentations [65.11348389219887]
We introduce Dialectic-RAG (DRAG), a modular approach that evaluates retrieved information by comparing, contrasting, and resolving conflicting perspectives. We show the impact of our framework both as an in-context learning strategy and for constructing demonstrations to instruct smaller models.
arXiv Detail & Related papers (2025-04-07T06:55:15Z) - A Survey of Model Architectures in Information Retrieval [59.61734783818073]
The period from 2019 to the present has represented one of the biggest paradigm shifts in information retrieval (IR) and natural language processing (NLP). We trace the development from traditional term-based methods to modern neural approaches, particularly highlighting the impact of transformer-based models and subsequent large language models (LLMs). We conclude with a forward-looking discussion of emerging challenges and future directions.
arXiv Detail & Related papers (2025-02-20T18:42:58Z) - OracleSage: Towards Unified Visual-Linguistic Understanding of Oracle Bone Scripts through Cross-Modal Knowledge Fusion [19.788896054132053]
Oracle bone script (OBS), China's earliest mature writing system, presents significant challenges for automatic recognition. We introduce OracleSage, a novel cross-modal framework that integrates hierarchical visual understanding with graph-based semantic reasoning.
arXiv Detail & Related papers (2024-11-26T19:26:06Z) - Multi-Dialectal Representation Learning of Sinitic Phonology [0.0]
In Sinitic historical phonology, notable tasks that could benefit from machine learning include the comparison of dialects and the reconstruction of proto-language systems.
Motivated by this, the paper provides an approach for obtaining multi-dialectal representations of Sinitic syllables.
arXiv Detail & Related papers (2023-06-30T02:37:25Z) - Multimodal Relation Extraction with Cross-Modal Retrieval and Synthesis [89.04041100520881]
This research proposes to retrieve textual and visual evidence based on the object, sentence, and whole image.
We develop a novel approach to synthesize the object-level, image-level, and sentence-level information for better reasoning between the same and different modalities.
arXiv Detail & Related papers (2023-05-25T15:26:13Z) - Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis [96.53859361560505]
We propose a knowledge graph augmented network (KGAN) to incorporate external knowledge with explicitly syntactic and contextual information.
KGAN captures the sentiment feature representations from multiple perspectives, i.e., context-, syntax- and knowledge-based.
Experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN.
arXiv Detail & Related papers (2022-01-13T08:25:53Z) - The computerization of archaeology: survey on AI techniques [6.985152632198481]
This paper analyses the application of artificial intelligence techniques to various areas of archaeology, specifically: a) the use of software tools as a creative stimulus for the organization of exhibitions; b) the classification of fragments found in archaeological excavations and the reconstruction of ceramics; c) the cataloguing and study of human remains to understand their social and historical context; d) the design of a study for the exploration of marine archaeological sites, located at depths that humans cannot reach, through the construction of a freely explorable 3D version.
arXiv Detail & Related papers (2020-05-05T17:09:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.