Producing Plankton Classifiers that are Robust to Dataset Shift
- URL: http://arxiv.org/abs/2401.14256v1
- Date: Thu, 25 Jan 2024 15:47:18 GMT
- Title: Producing Plankton Classifiers that are Robust to Dataset Shift
- Authors: Cheng Chen, Sreenath Kyathanahally, Marta Reyes, Stefanie Merkli, Ewa
Merz, Emanuele Francazi, Marvin Hoege, Francesco Pomati, Marco Baity-Jesi
- Abstract summary: We integrate the ZooLake dataset with manually-annotated images from 10 independent days of deployment to benchmark Out-Of-Dataset (OOD) performance.
We propose a preemptive assessment method to identify potential pitfalls when classifying new data, and pinpoint features in OOD images that adversely impact classification.
We find that ensembles of BEiT vision transformers, with targeted augmentations addressing OOD robustness, geometric ensembling, and rotation-based test-time augmentation, constitute the most robust model, which we call the BEsT model.
- Score: 1.716364772047407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern plankton high-throughput monitoring relies on deep learning
classifiers for species recognition in water ecosystems. Despite satisfactory
nominal performances, a significant challenge arises from Dataset Shift, which
causes performances to drop during deployment. In our study, we integrate the
ZooLake dataset with manually-annotated images from 10 independent days of
deployment, serving as test cells to benchmark Out-Of-Dataset (OOD)
performances. Our analysis reveals instances where classifiers, initially
performing well in In-Dataset conditions, encounter notable failures in
practical scenarios. For example, a MobileNet with a 92% nominal test accuracy
shows a 77% OOD accuracy. We systematically investigate conditions leading to
OOD performance drops and propose a preemptive assessment method to identify
potential pitfalls when classifying new data, and pinpoint features in OOD
images that adversely impact classification. We present a three-step pipeline:
(i) identifying OOD degradation compared to nominal test performance, (ii)
conducting a diagnostic analysis of degradation causes, and (iii) providing
solutions. We find that ensembles of BEiT vision transformers, with targeted
augmentations addressing OOD robustness, geometric ensembling, and
rotation-based test-time augmentation, constitute the most robust model, which
we call the BEsT model. It achieves an 83% OOD accuracy, with errors
concentrated on container classes. Moreover, it exhibits lower sensitivity to
dataset shift and reproduces the plankton abundances well. Our proposed
pipeline is
applicable to generic plankton classifiers, contingent on the availability of
suitable test cells. By identifying critical shortcomings and offering
practical procedures to fortify models against dataset shift, our study
contributes to the development of more reliable plankton classification
technologies.
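
The BEsT recipe combines an ensemble of fine-tuned BEiT classifiers via geometric ensembling with rotation-based test-time augmentation. The snippet below is a minimal illustrative sketch of how such an inference step could look; it is not the authors' released code, and the names `models` and `best_style_predict` are placeholders assumed here.

```python
# Minimal sketch (assumption, not the authors' code): geometric ensembling of
# several classifiers combined with rotation-based test-time augmentation.
# `models` is a placeholder list of trained PyTorch classifiers (e.g., BEiT
# backbones fine-tuned on ZooLake) that return class logits; images are
# assumed square so 90-degree rotations keep their shape.
import torch
import torch.nn.functional as F

@torch.no_grad()
def best_style_predict(models, images, n_rotations=4):
    """Return class probabilities for a batch of images of shape (B, C, H, W)."""
    for m in models:
        m.eval()

    view_probs = []
    for k in range(n_rotations):
        # Rotation-based test-time augmentation: rotate each image by k * 90 degrees.
        rotated = torch.rot90(images, k, dims=(-2, -1))

        # Geometric ensembling: average log-probabilities across ensemble members
        # (equivalent to the geometric mean of their softmax outputs), then
        # renormalize so this view's probabilities sum to one.
        log_probs = torch.stack(
            [F.log_softmax(m(rotated), dim=-1) for m in models]
        ).mean(dim=0)
        geo_mean = log_probs.exp()
        view_probs.append(geo_mean / geo_mean.sum(dim=-1, keepdim=True))

    # Average the per-view probabilities over all rotated views.
    return torch.stack(view_probs).mean(dim=0)
```

Predicted labels then follow from `argmax` over the returned probabilities; step (i) of the pipeline amounts to comparing the resulting accuracy on each OOD test cell against the nominal In-Dataset test accuracy.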
Related papers
- Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving [55.93813178692077]
We present RoboBEV, an extensive benchmark suite designed to evaluate the resilience of BEV algorithms.
We assess 33 state-of-the-art BEV-based perception models spanning tasks like detection, map segmentation, depth estimation, and occupancy prediction.
Our experimental results also underline the efficacy of strategies like pre-training and depth-free BEV transformations in enhancing robustness against out-of-distribution data.
arXiv Detail & Related papers (2024-05-27T17:59:39Z) - Which Augmentation Should I Use? An Empirical Investigation of Augmentations for Self-Supervised Phonocardiogram Representation Learning [5.438725298163702]
Contrastive Self-Supervised Learning (SSL) offers a potential solution to labeled data scarcity.
We propose uncovering the optimal augmentations for applying contrastive learning in 1D phonocardiogram (PCG) classification.
We demonstrate that, depending on its training distribution, the effectiveness of a fully-supervised model can degrade by up to 32%, while SSL models lose at most 10% or even improve in some cases.
arXiv Detail & Related papers (2023-12-01T11:06:00Z) - Exploring the Physical World Adversarial Robustness of Vehicle Detection [13.588120545886229]
Adversarial attacks can compromise the robustness of real-world detection models.
We propose an innovative instant-level data generation pipeline using the CARLA simulator.
Our findings highlight diverse model performances under adversarial conditions.
arXiv Detail & Related papers (2023-08-07T11:09:12Z) - Energy-based Out-of-Distribution Detection for Graph Neural Networks [76.0242218180483]
We propose a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe.
GNNSafe achieves up to 17.0% AUROC improvement over state-of-the-art methods and can serve as a simple yet strong baseline in this under-developed area.
arXiv Detail & Related papers (2023-02-06T16:38:43Z) - Temporal Output Discrepancy for Loss Estimation-based Active Learning [65.93767110342502]
We present a novel deep active learning approach that queries the oracle for annotation when an unlabeled sample is believed to incur a high loss.
Our approach achieves superior performances than the state-of-the-art active learning methods on image classification and semantic segmentation tasks.
arXiv Detail & Related papers (2022-12-20T19:29:37Z) - CausalAgents: A Robustness Benchmark for Motion Forecasting using Causal
Relationships [8.679073301435265]
We construct a new benchmark for evaluating and improving model robustness by applying perturbations to existing data.
We use these labels to perturb the data by deleting non-causal agents from the scene.
Under non-causal perturbations, we observe a 25-38% relative change in minADE compared to the original data.
arXiv Detail & Related papers (2022-07-07T21:28:23Z) - Efficient Test-Time Model Adaptation without Forgetting [60.36499845014649]
Test-time adaptation seeks to tackle potential distribution shifts between training and testing data.
We propose an active sample selection criterion to identify reliable and non-redundant samples.
We also introduce a Fisher regularizer to constrain important model parameters from drastic changes.
arXiv Detail & Related papers (2022-04-06T06:39:40Z) - Understanding and Testing Generalization of Deep Networks on
Out-of-Distribution Data [30.471871571256198]
Deep network models perform excellently on In-Distribution data, but can significantly fail on Out-Of-Distribution data.
This study is devoted to analyzing the problem of experimental ID testing and designing an OOD test paradigm.
arXiv Detail & Related papers (2021-11-17T15:29:07Z) - Towards Reducing Labeling Cost in Deep Object Detection [61.010693873330446]
We propose a unified framework for active learning that considers both the uncertainty and the robustness of the detector.
Our method is able to pseudo-label the very confident predictions, suppressing a potential distribution drift.
arXiv Detail & Related papers (2021-06-22T16:53:09Z) - Learn what you can't learn: Regularized Ensembles for Transductive
Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.