Related papers: FungiTastic: A multi-modal dataset and benchmark for image categorization

FungiTastic: A multi-modal dataset and benchmark for image categorization

URL: http://arxiv.org/abs/2408.13632v2
Date: Sun, 27 Oct 2024 20:34:38 GMT
Title: FungiTastic: A multi-modal dataset and benchmark for image categorization
Authors: Lukas Picek, Klara Janouskova, Milan Sulc, Jiri Matas,
Abstract summary: We introduce a new benchmark and a dataset, FungiTastic, based on fungal records continuously collected over a twenty-year span. The dataset is labeled and curated by experts and consists of about 350k multimodal observations of 5k fine-grained categories (species) FungiTastic is one of the few benchmarks that include a test set with DNA-sequenced ground truth of unprecedented label reliability.
Score: 21.01939456569417
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce a new, challenging benchmark and a dataset, FungiTastic, based on fungal records continuously collected over a twenty-year span. The dataset is labeled and curated by experts and consists of about 350k multimodal observations of 5k fine-grained categories (species). The fungi observations include photographs and additional data, e.g., meteorological and climatic data, satellite images, and body part segmentation masks. FungiTastic is one of the few benchmarks that include a test set with DNA-sequenced ground truth of unprecedented label reliability. The benchmark is designed to support (i) standard closed-set classification, (ii) open-set classification, (iii) multi-modal classification, (iv) few-shot learning, (v) domain shift, and many more. We provide baseline methods tailored for many use-cases, a multitude of ready-to-use pre-trained models on HuggingFace and a framework for model training. A comprehensive documentation describing the dataset features and the baselines are available at https://bohemianvra.github.io/FungiTastic/ and https://www.kaggle.com/datasets/picekl/fungitastic.

Related papers

Transfer Learning and Mixup for Fine-Grained Few-Shot Fungi Classification [0.0]
This paper presents our approach for the FungiCLEF 2025 competition.<n>It focuses on few-shot fine-grained visual categorization using the FungiTastic Few-Shot dataset.
arXiv Detail & Related papers (2025-07-11T01:21:21Z)
THUNDER: Tile-level Histopathology image UNDERstanding benchmark [32.185038017473396]
THUNDER is a tile-level benchmark for digital pathology foundation models.<n>In this paper, we provide a comprehensive comparison of 23 foundation models on 16 different datasets.
arXiv Detail & Related papers (2025-07-10T15:41:35Z)
FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting [44.33565276128137]
Time Series Forecasting (TSF) is key functionality in numerous fields, including in finance, weather services, and energy management. Foundation models exhibit promising inferencing capabilities in new or unseen data. We propose a new benchmark, FoundTS, to enable thorough and fair evaluation and comparison of such models.
arXiv Detail & Related papers (2024-10-15T17:23:49Z)
Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance. DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator. Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
UniCell: Universal Cell Nucleus Classification via Prompt Learning [76.11864242047074]
We propose a universal cell nucleus classification framework (UniCell) It employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains. In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and utilizes flexible prediction heads for adapting various datasets.
arXiv Detail & Related papers (2024-02-20T11:50:27Z)
Spawrious: A Benchmark for Fine Control of Spurious Correlation Biases [8.455991178281469]
We present benchmark-O2O, M2M-Easy, Medium, Hard, an image classification benchmark suite containing spurious correlations between classes and backgrounds. The resulting dataset is of high quality and contains approximately 152k images.
arXiv Detail & Related papers (2023-03-09T18:22:12Z)
MuG: A Multimodal Classification Benchmark on Game Data with Tabular, Textual, and Visual Fields [26.450463943664822]
We propose a multimodal classification benchmark MuG with eight datasets that allows researchers to evaluate and improve their models. We conduct multi-aspect data analysis to provide insights into the benchmark, including label balance ratios, percentages of missing features, distributions of data within each modality, and the correlations between labels and input modalities.
arXiv Detail & Related papers (2023-02-06T18:09:06Z)
VAESim: A probabilistic approach for self-supervised prototype discovery [0.23624125155742057]
We propose an architecture for image stratification based on a conditional variational autoencoder. We use a continuous latent space to represent the continuum of disorders and find clusters during training, which can then be used for image/patient stratification. We demonstrate that our method outperforms baselines in terms of kNN accuracy measured on a classification task against a standard VAE.
arXiv Detail & Related papers (2022-09-25T17:55:31Z)
Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline [80.13652104204691]
In this paper, we construct a large-scale benchmark with high diversity for visible-thermal UAV tracking (VTUAV) We provide a coarse-to-fine attribute annotation, where frame-level attributes are provided to exploit the potential of challenge-specific trackers. In addition, we design a new RGB-T baseline, named Hierarchical Multi-modal Fusion Tracker (HMFT), which fuses RGB-T data in various levels.
arXiv Detail & Related papers (2022-04-08T15:22:33Z)
Image Classification on Small Datasets via Masked Feature Mixing [22.105356244579745]
A proposed architecture called ChimeraMix learns a data augmentation by generating compositions of instances. The generative model encodes images in pairs, combines the features guided by a mask, and creates new samples. For evaluation, all methods are trained from scratch without any additional data.
arXiv Detail & Related papers (2022-02-23T16:51:22Z)
Novel Class Discovery in Semantic Segmentation [104.30729847367104]
We introduce a new setting of Novel Class Discovery in Semantic (NCDSS) It aims at segmenting unlabeled images containing new classes given prior knowledge from a labeled set of disjoint classes. In NCDSS, we need to distinguish the objects and background, and to handle the existence of multiple classes within an image. We propose the Entropy-based Uncertainty Modeling and Self-training (EUMS) framework to overcome noisy pseudo-labels.
arXiv Detail & Related papers (2021-12-03T13:31:59Z)
CIM: Class-Irrelevant Mapping for Few-Shot Classification [58.02773394658623]
Few-shot classification (FSC) is one of the most concerned hot issues in recent years. How to appraise the pre-trained FEM is the most crucial focus in the FSC community. We propose a simple, flexible method, dubbed as Class-Irrelevant Mapping (CIM)
arXiv Detail & Related papers (2021-09-07T03:26:24Z)
Free Lunch for Co-Saliency Detection: Context Adjustment [14.688461235328306]
We propose a "cost-free" group-cut-paste (GCP) procedure to leverage images from off-the-shelf saliency detection datasets and synthesize new samples. We collect a novel dataset called Context Adjustment Training. The two variants of our dataset, i.e., CAT and CAT+, consist of 16,750 and 33,500 images, respectively.
arXiv Detail & Related papers (2021-08-04T14:51:37Z)
Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed as Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets. This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets. In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples. We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models. We also observe span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z)
Simple multi-dataset detection [83.9604523643406]
We present a simple method for training a unified detector on multiple large-scale datasets. We show how to automatically integrate dataset-specific outputs into a common semantic taxonomy. Our approach does not require manual taxonomy reconciliation.
arXiv Detail & Related papers (2021-02-25T18:55:58Z)
Deep Mining External Imperfect Data for Chest X-ray Disease Screening [57.40329813850719]
We argue that incorporating an external CXR dataset leads to imperfect training data, which raises the challenges. We formulate the multi-label disease classification problem as weighted independent binary tasks according to the categories. Our framework simultaneously models and tackles the domain and label discrepancies, enabling superior knowledge mining ability.
arXiv Detail & Related papers (2020-06-06T06:48:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.