Uncovering Latent Connections in Indigenous Heritage: Semantic Pipelines for Cultural Preservation in Brazil
- URL: http://arxiv.org/abs/2508.10911v1
- Date: Thu, 31 Jul 2025 21:09:36 GMT
- Title: Uncovering Latent Connections in Indigenous Heritage: Semantic Pipelines for Cultural Preservation in Brazil
- Authors: Luis Vitor Zerkowski, Nina S. T. Hirata
- Abstract summary: In Brazil, the Museu Nacional dos Povos Indigenas hosts the country's largest online collection of Indigenous objects and iconographies. We present a data-driven initiative that applies artificial intelligence to enhance accessibility, interpretation, and exploration.
- Score: 0.36832029288386137
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Indigenous communities face ongoing challenges in preserving their cultural heritage, particularly in the face of systemic marginalization and urban development. In Brazil, the Museu Nacional dos Povos Indigenas through the Tainacan platform hosts the country's largest online collection of Indigenous objects and iconographies, providing a critical resource for cultural engagement. Using publicly available data from this repository, we present a data-driven initiative that applies artificial intelligence to enhance accessibility, interpretation, and exploration. We develop two semantic pipelines: a visual pipeline that models image-based similarity and a textual pipeline that captures semantic relationships from item descriptions. These embedding spaces are projected into two dimensions and integrated into an interactive visualization tool we also developed. In addition to similarity-based navigation, users can explore the collection through temporal and geographic lenses, enabling both semantic and contextualized perspectives. The system supports curatorial tasks, aids public engagement, and reveals latent connections within the collection. This work demonstrates how AI can ethically contribute to cultural preservation practices.
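The similarity-based navigation mentioned in the abstract can be illustrated with a minimal sketch. It assumes each collection item already has an embedding vector and ranks neighbours by cosine similarity. The abstract does not say which embedding models or similarity measure the authors use, so the item ids, the vector values, and the helper names `cosine_similarity` and `nearest_items` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest_items(query_id, embeddings, k=3):
    """Return the ids of the k items most similar to query_id, best first."""
    query = embeddings[query_id]
    scored = [
        (cosine_similarity(query, vec), item_id)
        for item_id, vec in embeddings.items()
        if item_id != query_id
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:k]]


# Toy, made-up embeddings (assumption: not taken from the real collection).
toy_embeddings = {
    "basket_a": [0.9, 0.1, 0.0],
    "basket_b": [0.8, 0.2, 0.1],
    "mask_a": [0.1, 0.9, 0.2],
    "mask_b": [0.0, 0.8, 0.3],
}

neighbours = nearest_items("basket_a", toy_embeddings, k=2)
print(neighbours)
```

The same ranking function would apply to either a visual or a textual embedding space, since it only depends on the vectors. The two-dimensional projection described in the abstract is a separate step and is not shown here.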
Related papers
- MINGLE: VLMs for Semantically Complex Region Detection in Urban Scenes [49.89767522399176]
Group-level social interactions in public spaces are crucial for urban planning. We introduce a social group region detection task, which requires inferring and spatially grounding visual regions defined by interpersonal relations. We propose MINGLE, a modular three-stage pipeline that integrates human detection and depth estimation, VLM-based reasoning to classify pairwise social affiliation, and a lightweight spatial aggregation algorithm to localize socially connected groups. We present a new dataset of 100K urban street-view images annotated with bounding boxes and labels for both individuals and socially interacting groups.
arXiv Detail & Related papers (2025-09-16T19:31:40Z) - A Multidimensional AI-powered Framework for Analyzing Tourist Perception in Historic Urban Quarters: A Case Study in Shanghai [5.077286019454655]
This study proposes a multidimensional AI-powered framework for analyzing tourist perception in historic urban quarters. Applied to twelve historic quarters in central Shanghai, the framework integrates focal point extraction, color theme analysis, and sentiment mining.
arXiv Detail & Related papers (2025-09-04T02:35:14Z) - Animer une base de connaissance: des ontologies aux mod{è}les d'I.A. g{é}n{é}rative [0.0]
Article proposes a reading of the hybridization between symbolic AI and neural (or sub-symbolic) AI based on a field of application.<n>We describe the LaCAS ecosystem -- Open Archives in Linguistic and Cultural Studies.<n>We illustrate this approach using the knowledge domain ''Languages of the world'' (540 languages) and the knowledge object ''Quechua (language)''
arXiv Detail & Related papers (2025-09-01T09:40:55Z) - Position Paper: Metadata Enrichment Model: Integrating Neural Networks and Semantic Knowledge Graphs for Cultural Heritage Applications [8.732274235941974]
We present the Metadata Enrichment Model (MEM), a conceptual framework designed to enrich metadata for digitized collections. MEM combines fine-tuned computer vision models, large language models and structured knowledge graphs. We release a dataset of digitized incunabula from the Jagiellonian Digital Library.
arXiv Detail & Related papers (2025-05-29T15:22:18Z) - Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts [65.90535970515266]
TimeTravel is a benchmark of 10,250 expert-verified samples spanning 266 distinct cultures across 10 major historical regions. TimeTravel is designed for AI-driven analysis of manuscripts, artworks, inscriptions, and archaeological discoveries. We evaluate contemporary AI models on TimeTravel, highlighting their strengths and identifying areas for improvement.
arXiv Detail & Related papers (2025-02-20T18:59:51Z) - The Human Labour of Data Work: Capturing Cultural Diversity through World Wide Dishes [3.770155074442168]
We present an example of participatory dataset creation, where community members both guide the design of the research process and contribute to the crowdsourced dataset. We show that our approach can result in curated, high-quality data that supports decentralised contributions from communities. We surface three dimensions of labour performed by participatory mediators that are crucial for participatory dataset construction.
arXiv Detail & Related papers (2025-02-09T17:09:46Z) - Unlocking Comics: The AI4VA Dataset for Visual Understanding [62.345344799258804]
This paper presents a novel dataset comprising Franco-Belgian comics from the 1950s annotated for tasks including depth estimation, semantic segmentation, saliency detection, and character identification.
It consists of two distinct and consistent styles and incorporates object concepts and labels taken from natural images.
By including such diverse information across styles, this dataset not only holds promise for computational creativity but also offers avenues for the digitization of art and storytelling innovation.
arXiv Detail & Related papers (2024-10-27T14:27:05Z) - KamerRaad: Enhancing Information Retrieval in Belgian National Politics through Hierarchical Summarization and Conversational Interfaces [55.00702535694059]
KamerRaad is an AI tool that leverages large language models to help citizens interactively engage with Belgian political information.
The tool extracts and concisely summarizes key excerpts from parliamentary proceedings, followed by the potential for interaction based on generative AI.
arXiv Detail & Related papers (2024-04-22T15:01:39Z) - Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future [59.78608958395464]
We build a Social AI Data Infrastructure, which consists of a comprehensive social AI taxonomy and a data library of 480 NLP datasets.
Our infrastructure allows us to analyze existing dataset efforts, and also evaluate language models' performance in different social intelligence aspects.
We show there is a need for multifaceted datasets, increased diversity in language and culture, more long-tailed social situations, and more interactive data in future social intelligence data efforts.
arXiv Detail & Related papers (2024-02-28T00:22:42Z) - Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking [48.21982147529661]
This paper introduces a novel approach for massively multicultural knowledge acquisition.
Our method strategically navigates from densely informative Wikipedia documents on cultural topics to an extensive network of linked pages.
Our work marks an important step towards deeper understanding and bridging the gaps of cultural disparities in AI.
arXiv Detail & Related papers (2024-02-14T18:16:54Z) - From Pampas to Pixels: Fine-Tuning Diffusion Models for Gaúcho Heritage [0.0]
This paper addresses the potential of Latent Diffusion Models (LDMs) in representing local cultural concepts, historical figures, and endangered species.
Our objective is to contribute to the broader understanding of how generative models can help to capture and preserve the cultural and historical identity of regions.
arXiv Detail & Related papers (2024-01-10T19:34:52Z) - Heri-Graphs: A Workflow of Creating Datasets for Multi-modal Machine Learning on Graphs of Heritage Values and Attributes with Social Media [7.318997639507268]
Values (why to conserve) and Attributes (what to conserve) are essential concepts of cultural heritage.
Recent studies have used social media to map the values and attributes that the public associates with cultural heritage.
This study presents a methodological workflow for constructing such multi-modal datasets using posts and images on Flickr.
arXiv Detail & Related papers (2022-05-16T09:45:45Z) - Learning Patterns of Tourist Movement and Photography from Geotagged Photos at Archaeological Heritage Sites in Cuzco, Peru [73.52315464582637]
We build upon the current theoretical discourse of anthropology associated with visuality and heritage tourism to identify travel patterns across a known archaeological heritage circuit in Cuzco, Peru.
Our goals are to (1) understand how the intensification of tourism intersects with heritage regulations and social media, aiding in the articulation of travel patterns across Cuzco's heritage landscape; and to (2) assess how aesthetic preferences and visuality become entangled with the rapidly evolving expectations of tourists, whose travel narratives are curated on social media and grounded in historic site representations.
arXiv Detail & Related papers (2020-06-29T22:49:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.