LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images
- URL: http://arxiv.org/abs/2403.13171v2
- Date: Wed, 27 Mar 2024 21:43:37 GMT
- Title: LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images
- Authors: Jing Zhang, Irving Fang, Juexiao Zhang, Hao Wu, Akshat Kaushik, Alice Rodriguez, Hanwen Zhao, Zhuo Zheng, Radu Iovita, Chen Feng
- Abstract summary: Lithic Use-Wear Analysis (LUWA) using microscopic images is an underexplored vision-for-science research area.
It seeks to distinguish the worked material, which is critical for understanding archaeological artifacts, material interactions, tool functionalities, and dental records.
We build the first and largest open-source LUWA dataset, containing 23,130 microscopic images with different magnifications and sensing modalities.
- Score: 10.764141557655442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lithic Use-Wear Analysis (LUWA) using microscopic images is an underexplored vision-for-science research area. It seeks to distinguish the worked material, which is critical for understanding archaeological artifacts, material interactions, tool functionalities, and dental records. However, this challenging task goes beyond the well-studied image classification problem for common objects. It is affected by many confounders owing to the complex wear mechanism and microscopic imaging, which makes it difficult even for human experts to identify the worked material successfully. In this paper, we investigate the following three questions on this unique vision task for the first time: (i) How well can state-of-the-art pre-trained models (like DINOv2) generalize to this rarely seen domain? (ii) How can few-shot learning be exploited for scarce microscopic images? (iii) How do the ambiguous magnification and sensing modality influence the classification accuracy? To study these questions, we collaborated with archaeologists and built the first and largest open-source LUWA dataset, containing 23,130 microscopic images with different magnifications and sensing modalities. Extensive experiments show that existing pre-trained models notably outperform human experts but still leave a large gap for improvement. Most importantly, the LUWA dataset provides an underexplored opportunity for the vision and learning communities and complements existing image classification problems on common objects.
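To make questions (i) and (ii) concrete, the sketch below shows one common way such an evaluation is set up: frozen DINOv2 features compared against few-shot class prototypes (nearest centroid). It is a minimal illustration, not the paper's actual pipeline; the file names, worked-material classes, and shot counts are assumptions for demonstration only.

```python
# Minimal sketch: few-shot worked-material classification with frozen DINOv2 features.
# File names and class labels below are hypothetical, not from the LUWA dataset release.
import torch
import torch.nn.functional as F
from torchvision import transforms
from PIL import Image

# Frozen, pre-trained backbone (question i: generalization to a rarely seen domain).
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),  # 224 is divisible by the ViT patch size of 14
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(paths):
    """Return L2-normalized DINOv2 [CLS] features for a list of image paths."""
    batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths])
    return F.normalize(backbone(batch), dim=-1)

# Question (ii): few-shot classification with per-class prototypes.
# `support` maps each worked-material class to a handful of labeled images (hypothetical files).
support = {"bone": ["bone_01.tif", "bone_02.tif"],
           "wood": ["wood_01.tif", "wood_02.tif"]}
prototypes = {c: embed(paths).mean(dim=0) for c, paths in support.items()}

def predict(query_path):
    q = embed([query_path])[0]
    # Cosine similarity to each prototype; the most similar class wins.
    return max(prototypes, key=lambda c: float(q @ prototypes[c]))

print(predict("unknown_flake.tif"))  # e.g. "bone"
```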
Related papers
- Beyond Human Vision: The Role of Large Vision Language Models in Microscope Image Analysis [12.432542525489236]
Vision language models (VLMs) have recently emerged and gained the spotlight for their ability to comprehend the dual modality of image and textual data.
In this study, we charge ChatGPT, LLaVA, Gemini, and SAM with classification, segmentation, counting, and VQA tasks on a variety of microscopy images.
We observe that ChatGPT and Gemini are impressively able to comprehend the visual features in microscopy images, while SAM is quite capable of isolating artefacts in a general sense.
arXiv Detail & Related papers (2024-05-01T21:35:04Z)
- VisualCritic: Making LMMs Perceive Visual Quality Like Humans [65.59779450136399]
We present VisualCritic, the first LMM for broad-spectrum image subjective quality assessment.
VisualCritic can be used across diverse data right out of the box, without requiring any dataset-specific adaptation.
arXiv Detail & Related papers (2024-03-19T15:07:08Z)
- Multimodal Deep Learning for Scientific Imaging Interpretation [0.0]
This study presents a novel methodology to linguistically emulate and evaluate human-like interactions with Scanning Electron Microscopy (SEM) images.
Our approach distills insights from both textual and visual data harvested from peer-reviewed articles.
Our model (GlassLLaVA) excels in crafting accurate interpretations, identifying key features, and detecting defects in previously unseen SEM images.
arXiv Detail & Related papers (2023-09-21T20:09:22Z)
- Optimizations of Autoencoders for Analysis and Classification of Microscopic In Situ Hybridization Images [68.8204255655161]
We propose a deep-learning framework to detect and classify areas of microscopic images with similar levels of gene expression.
The data we analyze requires an unsupervised learning model, for which we employ deep learning autoencoders, a type of artificial neural network.
arXiv Detail & Related papers (2023-04-19T13:45:28Z)
- Causal Triplet: An Open Challenge for Intervention-centric Causal Representation Learning [98.78136504619539]
Causal Triplet is a causal representation learning benchmark featuring visually more complex scenes.
We show that models built with the knowledge of disentangled or object-centric representations significantly outperform their distributed counterparts.
arXiv Detail & Related papers (2023-01-12T17:43:38Z)
- GraVIS: Grouping Augmented Views from Independent Sources for Dermatology Analysis [52.04899592688968]
We propose GraVIS, which is specifically optimized for learning self-supervised features from dermatology images.
GraVIS significantly outperforms its transfer learning and self-supervised learning counterparts in both lesion segmentation and disease classification tasks.
arXiv Detail & Related papers (2023-01-11T11:38:37Z)
- Towards better understanding and better generalization of few-shot classification in histology images with contrastive learning [7.620702640026243]
Few-shot learning has been an established topic in natural images for years, but little work has attended to histology images.
We propose to incorporate contrastive learning (CL) with latent augmentation (LA) to build a few-shot system.
In experiments, we find that (i) models learned by CL generalize better than supervised models on unseen classes of histology images, and (ii) LA brings consistent gains over baselines.
arXiv Detail & Related papers (2022-02-18T07:48:34Z)
- Affect Analysis in-the-wild: Valence-Arousal, Expressions, Action Units and a Unified Framework [83.21732533130846]
The paper focuses on large in-the-wild databases, i.e., Aff-Wild and Aff-Wild2.
It presents the design of two classes of deep neural networks trained with these databases.
A novel multi-task and holistic framework is presented that jointly learns, generalizes effectively, and performs affect recognition.
arXiv Detail & Related papers (2021-03-29T17:36:20Z)
- Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types [50.1843146606122]
A simple form of transfer learning is common in current state-of-the-art computer vision models.
Previous systematic studies of transfer learning have been limited and the circumstances in which it is expected to work are not fully understood.
In this paper we carry out an extensive experimental exploration of transfer learning across vastly different image domains.
arXiv Detail & Related papers (2021-03-24T16:24:20Z)
- Microscopic fine-grained instance classification through deep attention [7.50282814989294]
Fine-grained classification of microscopic image data with limited samples is an open problem in computer vision and biomedical imaging.
We propose a simple yet effective deep network that performs two tasks simultaneously in an end-to-end manner.
The result is a robust yet lightweight, end-to-end trainable deep network that achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-10-06T15:29:58Z)
- Supervision and Source Domain Impact on Representation Learning: A Histopathology Case Study [6.762603053858596]
In this work, we explored the performance of a deep neural network and triplet loss in the area of representation learning.
We investigated the notion of similarity and dissimilarity in pathology whole-slide images and compared different setups from unsupervised and semi-supervised to supervised learning.
We achieved high accuracy and generalization when the learned representations were applied to two different pathology datasets.
arXiv Detail & Related papers (2020-05-10T21:27:38Z)
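The final related entry above relies on a triplet loss for representation learning on pathology images. The sketch below is a minimal illustration of that objective only; the toy encoder and random tensors are placeholders, not the authors' architecture or data.

```python
# Minimal sketch of a triplet-loss objective for representation learning.
# The encoder here is a toy stand-in for a deep CNN; inputs are random placeholders.
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 64 * 64, 128),  # maps a 3x64x64 patch to a 128-d embedding
)
triplet_loss = nn.TripletMarginLoss(margin=1.0)

# Anchor and positive come from the same class (similar patches); negative does not.
anchor   = encoder(torch.randn(8, 3, 64, 64))
positive = encoder(torch.randn(8, 3, 64, 64))
negative = encoder(torch.randn(8, 3, 64, 64))

# Pulls anchor-positive pairs together and pushes anchor-negative pairs apart by the margin.
loss = triplet_loss(anchor, positive, negative)
loss.backward()
```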