Benchmarking In-the-wild Multimodal Disease Recognition and A Versatile Baseline
- URL: http://arxiv.org/abs/2408.03120v1
- Date: Tue, 6 Aug 2024 11:49:13 GMT
- Title: Benchmarking In-the-wild Multimodal Disease Recognition and A Versatile Baseline
- Authors: Tianqi Wei, Zhi Chen, Zi Huang, Xin Yu
- Abstract summary: We propose an in-the-wild multimodal plant disease recognition dataset.
It not only contains the largest number of disease classes but also provides text-based descriptions for each disease.
Our proposed dataset can be regarded as an ideal testbed for evaluating disease recognition methods in the real world.
- Score: 42.49727243388804
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing plant disease classification models have achieved remarkable performance in recognizing in-laboratory diseased images. However, their performance often degrades significantly when classifying in-the-wild images. Furthermore, we observed that in-the-wild plant images may exhibit similar appearances across various diseases (i.e., small inter-class discrepancy) while the same diseases may look quite different (i.e., large intra-class variance). Motivated by this observation, we propose an in-the-wild multimodal plant disease recognition dataset that contains not only the largest number of disease classes but also text-based descriptions for each disease. In particular, the newly provided text descriptions supply rich information in the textual modality and facilitate in-the-wild disease classification under small inter-class discrepancy and large intra-class variance. Therefore, our proposed dataset can be regarded as an ideal testbed for evaluating disease recognition methods in the real world. In addition, we present a strong yet versatile baseline that models text descriptions and visual data through multiple prototypes for a given class. By fusing the contributions of multimodal prototypes in classification, our baseline can effectively address the small inter-class discrepancy and large intra-class variance issues. Remarkably, our baseline model can not only classify diseases but also recognize diseases in few-shot or training-free scenarios. Extensive benchmarking results demonstrate that our proposed in-the-wild multimodal dataset poses many new challenges for the plant disease recognition task and leaves substantial room for improvement in future work.
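The abstract describes classifying with several visual prototypes and a text prototype per class, then fusing their contributions. The sketch below is one minimal way such a fusion could look, not the authors' implementation: the function names, the max-over-prototypes rule, the cosine similarity, and the `alpha` weighting are all assumptions made for illustration, and the embeddings are assumed to come from some image and text encoder that shares one feature space.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def fused_scores(query, visual_protos, text_protos, alpha=0.5):
    """Score every class for one query embedding.

    visual_protos: one list of visual prototype vectors per class
                   (hypothetical layout; several prototypes per class
                   are meant to cover large intra-class variance).
    text_protos:   one text-description embedding per class.
    alpha:         assumed weight of the visual term (illustrative only).
    """
    scores = []
    for class_visual, class_text in zip(visual_protos, text_protos):
        # Best-matching visual prototype represents the class visually.
        visual_sim = max(cosine(query, p) for p in class_visual)
        text_sim = cosine(query, class_text)
        scores.append(alpha * visual_sim + (1 - alpha) * text_sim)
    return scores


def predict(query, visual_protos, text_protos, alpha=0.5):
    """Return the index of the highest-scoring class."""
    scores = fused_scores(query, visual_protos, text_protos, alpha)
    return max(range(len(scores)), key=scores.__getitem__)
```

Because nothing here is trained, the same scoring could in principle be used in a training-free setting by building the visual prototypes from a handful of labelled embeddings; how the paper actually constructs and fuses its prototypes is not specified in this listing.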
Related papers
- Cross- and Intra-image Prototypical Learning for Multi-label Disease Diagnosis and Interpretation [15.303610605543746]
We present a novel Cross- and Intra-image Prototypical Learning framework for accurate multi-label disease diagnosis and interpretation from medical images.
We propose a new two-level alignment-based regularisation strategy that effectively leverages consistent intra-image information to enhance interpretation robustness and predictive performance.
arXiv Detail & Related papers (2024-11-07T10:46:01Z)
- PMP-Swin: Multi-Scale Patch Message Passing Swin Transformer for Retinal Disease Classification [9.651435376561741]
We propose a new framework named Multi-Scale Patch Message Passing Swin Transformer for multi-class retinal disease classification.
Specifically, we design a Patch Message Passing (PMP) module based on the Message Passing mechanism to establish global interaction for pathological semantic features.
arXiv Detail & Related papers (2023-11-20T11:09:09Z)
- Multi-task Explainable Skin Lesion Classification [54.76511683427566]
We propose a few-shot-based approach for skin lesions that generalizes well with few labelled data.
The proposed approach fuses a segmentation network, which acts as an attention module, with a classification network.
arXiv Detail & Related papers (2023-10-11T05:49:47Z)
- Hierarchical Knowledge Guided Learning for Real-world Retinal Diseases Recognition [20.88407972858568]
Some recently published datasets in ophthalmology AI consist of more than 40 kinds of retinal diseases with complex abnormalities and variable morbidity.
From a modeling perspective, most deep learning models trained on these datasets may lack the ability to generalize to rare diseases.
This paper presents a novel method that enables the deep neural network to learn from a long-tailed fundus database for various retinal disease recognition.
arXiv Detail & Related papers (2021-11-17T05:44:39Z)
- Relational Subsets Knowledge Distillation for Long-tailed Retinal Diseases Recognition [65.77962788209103]
We propose class subset learning by dividing the long-tailed data into multiple class subsets according to prior knowledge.
It forces the model to focus on learning the subset-specific knowledge.
The proposed framework proved to be effective for the long-tailed retinal diseases recognition task.
arXiv Detail & Related papers (2021-04-22T13:39:33Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Multi-label Thoracic Disease Image Classification with Cross-Attention Networks [65.37531731899837]
We propose a novel scheme of Cross-Attention Networks (CAN) for automated thoracic disease classification from chest x-ray images.
We also design a new loss function that goes beyond cross-entropy loss to help the cross-attention process and is able to overcome the imbalance between classes and easy-dominated samples within each class.
arXiv Detail & Related papers (2020-07-21T14:37:00Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
- Synergic Adversarial Label Learning for Grading Retinal Diseases via Knowledge Distillation and Multi-task Learning [29.46896757506273]
Images annotated by well-qualified doctors are very expensive, and only a limited amount of data is available for various retinal diseases.
Some studies show that AMD and DR share common features such as hemorrhagic points and exudation, but most classification algorithms train those disease models independently.
We propose a method called synergic adversarial label learning (SALL), which leverages relevant retinal disease labels in both semantic and feature space as additional signals and trains the model in a collaborative manner.
arXiv Detail & Related papers (2020-03-24T01:32:04Z)