Towards Ancient Plant Seed Classification: A Benchmark Dataset and Baseline Model
- URL: http://arxiv.org/abs/2512.18247v1
- Date: Sat, 20 Dec 2025 07:18:22 GMT
- Title: Towards Ancient Plant Seed Classification: A Benchmark Dataset and Baseline Model
- Authors: Rui Xing, Runmin Cong, Yingying Wu, Can Wang, Zhongming Tang, Fen Wang, Hao Wu, Sam Kwong
- Abstract summary: We construct the first Ancient Plant Seed Image Classification dataset. It contains 8,340 images from 17 genus- or species-level seed categories excavated from 18 archaeological sites across China. In both quantitative and qualitative analyses, our approach surpasses existing state-of-the-art image classification methods, achieving an accuracy of 90.5%.
- Score: 62.98256440452042
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the dietary preferences of ancient societies and their evolution across periods and regions is crucial for revealing human-environment interactions. Seeds, as important archaeological artifacts, represent a fundamental subject of archaeobotanical research. However, traditional studies rely heavily on expert knowledge, which is often time-consuming and inefficient. Intelligent analysis methods have made progress in various fields of archaeology, but there remains a research gap in data and methods in archaeobotany, especially in the classification task of ancient plant seeds. To address this, we construct the first Ancient Plant Seed Image Classification (APS) dataset. It contains 8,340 images from 17 genus- or species-level seed categories excavated from 18 archaeological sites across China. In addition, we design a framework specifically for the ancient plant seed classification task (APSNet), which introduces the scale feature (size) of seeds based on learning fine-grained information to guide the network in discovering key "evidence" for sufficient classification. Specifically, we design a Size Perception and Embedding (SPE) module in the encoder part to explicitly extract size information for the purpose of complementing fine-grained information. We propose an Asynchronous Decoupled Decoding (ADD) architecture based on traditional progressive learning to decode features from both channel and spatial perspectives, enabling efficient learning of discriminative features. In both quantitative and qualitative analyses, our approach surpasses existing state-of-the-art image classification methods, achieving an accuracy of 90.5%. This demonstrates that our work provides an effective tool for large-scale, systematic archaeological research.
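The abstract says the SPE module explicitly extracts size information to complement fine-grained appearance features, but it gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' module: it assumes a binary seed mask is already available, and the names `size_descriptor`, `fuse`, and `mm_per_pixel` are hypothetical. It derives a few hand-crafted size measurements and appends them to an appearance feature vector.

```python
from typing import List, Sequence


def size_descriptor(mask: Sequence[Sequence[int]], mm_per_pixel: float = 1.0) -> List[float]:
    """Return [length, breadth, area, aspect_ratio] for a binary seed mask.

    Assumption: `mask` is a 2D grid where nonzero cells belong to the seed,
    and `mm_per_pixel` is a known imaging scale (hypothetical parameter).
    """
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        # Empty mask: no seed pixels, so every measurement is zero.
        return [0.0, 0.0, 0.0, 0.0]
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    # Axis-aligned bounding box extents, converted to physical units.
    height = (max(rows) - min(rows) + 1) * mm_per_pixel
    width = (max(cols) - min(cols) + 1) * mm_per_pixel
    length, breadth = max(height, width), min(height, width)
    # Pixel count scaled to physical area.
    area = len(coords) * mm_per_pixel ** 2
    return [length, breadth, area, length / breadth]


def fuse(appearance: Sequence[float], size: Sequence[float]) -> List[float]:
    """Concatenate appearance features with the size descriptor (simple late fusion)."""
    return list(appearance) + list(size)


# Toy example: a 4 x 6 pixel rectangle inside an 8 x 8 image, at 0.5 mm per pixel.
mask = [[1 if 2 <= r < 6 and 1 <= c < 7 else 0 for c in range(8)] for r in range(8)]
desc = size_descriptor(mask, mm_per_pixel=0.5)
fused = fuse([0.1, 0.2, 0.3], desc)
print(desc)
```

Concatenation is the simplest possible fusion choice and is used here only for illustration; a learned embedding of the size cue, as the module name suggests, would replace the hand-crafted descriptor.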
Related papers
- Machine learning applications in archaeological practices: a review [0.0]
We reviewed 135 articles published between 1997 and 2022. Automatic structure detection and artefact classification were the most represented tasks. We observed, in some cases, poorly defined requirements and caveats of the machine learning methods used.
Date: 2025-01-07T14:50:05Z
- PyPotteryLens: An Open-Source Deep Learning Framework for Automated Digitisation of Archaeological Pottery Documentation [0.0]
PyPotteryLens is a framework that automates the digitisation and processing of archaeological pottery drawings from published sources. The framework achieves over 97% precision and recall in pottery detection and classification tasks. It reduces processing time by up to 5x to 20x compared to manual methods.
Date: 2024-12-16T09:01:32Z
- A Closer Look at Deep Learning Methods on Tabular Datasets [78.61845513154502]
We present an extensive study on TALENT, a collection of 300+ datasets spanning broad ranges of size. Our evaluation shows that ensembling benefits both tree-based and neural approaches.
Date: 2024-07-01T04:24:07Z
- Species196: A One-Million Semi-supervised Dataset for Fine-grained Species Recognition [30.327642724046903]
Species196 is a large-scale semi-supervised dataset of 196-category invasive species.
It collects over 19K images with expert-level accurate annotations Species196-L, and 1.2M unlabeled images of invasive species Species196-U.
Date: 2023-09-25T14:46:01Z
- A Step Towards Worldwide Biodiversity Assessment: The BIOSCAN-1M Insect Dataset [18.211840156134784]
This paper presents a curated million-image dataset, primarily to train computer-vision models capable of providing image-based taxonomic assessment.
The dataset also presents compelling characteristics, the study of which would be of interest to the broader machine learning community.
Date: 2023-07-19T20:54:08Z
- Multi-Granularity Archaeological Dating of Chinese Bronze Dings Based on a Knowledge-Guided Relation Graph [5.359415272318481]
Current archaeology depends on trained experts to carry out bronze dating.
We propose a learning-based approach to integrate advanced deep learning techniques and archaeological knowledge.
Date: 2023-03-27T14:54:50Z
- Semantic Segmentation of Vegetation in Remote Sensing Imagery Using Deep Learning [77.34726150561087]
We propose an approach for creating a multi-modal and large-temporal dataset comprised of publicly available Remote Sensing data.
We use Convolutional Neural Networks (CNN) models that are capable of separating different classes of vegetation.
Date: 2022-09-28T18:51:59Z
- Geo-Spatiotemporal Features and Shape-Based Prior Knowledge for Fine-grained Imbalanced Data Classification [63.916371837696396]
Fine-grained classification aims at distinguishing between items with similar global perception and patterns, but that differ by minute details.
Our primary challenges come from both small inter-class variations and large intra-class variations.
We propose to combine several innovations to improve fine-grained classification within the use-case of wildlife.
Date: 2021-03-21T02:01:38Z
- A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
Date: 2020-08-02T00:09:03Z
- Two-View Fine-grained Classification of Plant Species [66.75915278733197]
We propose a novel method based on a two-view leaf image representation and a hierarchical classification strategy for fine-grained recognition of plant species.
A deep metric based on Siamese convolutional neural networks is used to reduce the dependence on a large number of training samples and make the method scalable to new plant species.
Date: 2020-05-18T21:57:47Z