Calibrated prediction in and out-of-domain for state-of-the-art saliency
modeling
- URL: http://arxiv.org/abs/2105.12441v2
- Date: Thu, 27 May 2021 15:21:50 GMT
- Title: Calibrated prediction in and out-of-domain for state-of-the-art saliency
modeling
- Authors: Akis Linardos, Matthias Kümmerer, Ori Press, Matthias Bethge
- Abstract summary: We conduct a large-scale transfer learning study which tests different ImageNet backbones.
By replacing the VGG19 backbone of DeepGaze II with ResNet50 features we improve the performance on saliency prediction from 78% to 85%.
We show that combining multiple backbones in a principled manner achieves good confidence calibration on unseen datasets.
- Score: 17.739797071488212
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Since 2014, transfer learning has been the key driver of improvements in
spatial saliency prediction; however, progress has stagnated in the last 3-5
years. We conduct a large-scale transfer learning study that tests different
ImageNet backbones, always using the same readout architecture and learning
protocol adopted from DeepGaze II. By replacing the VGG19 backbone of DeepGaze
II with ResNet50 features, we improve performance on saliency prediction
from 78% to 85%. However, as we continue to test better ImageNet models as
backbones (such as EfficientNetB5) we observe no additional improvement on
saliency prediction. By analyzing the backbones further, we find that
generalization to other datasets differs substantially, with models being
consistently overconfident in their fixation predictions. We show that
combining multiple backbones in a principled manner achieves good confidence
calibration on unseen datasets. This yields a significant leap in benchmark
performance both in and out of domain, with a 15 percentage point improvement
over DeepGaze II to 93% on MIT1003, marking a new state of the art
on the MIT/Tuebingen Saliency Benchmark in all available metrics (AUC: 88.3%,
sAUC: 79.4%, CC: 82.4%).
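The combination idea above can be illustrated as a mixture of per-backbone fixation density maps. This is a minimal sketch under assumed details: the function name, equal weighting, and grid size are illustrative, not the paper's actual combination scheme.

```python
import numpy as np

def combine_saliency_maps(densities, weights=None):
    """Combine per-backbone fixation density maps into a mixture.

    `densities`: list of 2-D arrays, each a predicted fixation density
    (non-negative, summing to 1 over the image). Averaging densities is
    one simple way to temper the overconfidence of any single backbone.
    """
    densities = [d / d.sum() for d in densities]  # renormalize defensively
    if weights is None:
        weights = np.full(len(densities), 1.0 / len(densities))
    mixture = sum(w * d for w, d in zip(weights, densities))
    return mixture / mixture.sum()

# Two overconfident backbones that disagree on the fixation location:
a = np.zeros((4, 4)); a[0, 0] = 1.0
b = np.zeros((4, 4)); b[3, 3] = 1.0
m = combine_saliency_maps([a, b])
print(m.max())  # 0.5 -- the mixture is less confident than either input
```

The mixture spreads probability mass over every location that any backbone considers plausible, which is why an ensemble of overconfident predictors can still be well calibrated.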
Related papers
- Transfer Learning-Based CNN Models for Plant Species Identification Using Leaf Venation Patterns [0.0]
This study evaluates the efficacy of three deep learning architectures (ResNet50, MobileNetV2, and EfficientNetB0) for automated plant species classification based on leaf venation patterns.
arXiv Detail & Related papers (2025-09-03T21:23:09Z)
- Enhancing Crop Segmentation in Satellite Image Time Series with Transformer Networks [1.339000056057208]
This paper presents a revised version of the Transformer-based Swin UNETR model, specifically adapted for crop segmentation of Satellite Image Time Series (SITS).
The proposed model demonstrates significant advancements, achieving a validation accuracy of 96.14% and a test accuracy of 95.26% on the Munich dataset.
Experiments in this study indicate that the model will likely achieve comparable or superior accuracy to CNNs while requiring significantly less training time.
arXiv Detail & Related papers (2024-12-02T20:08:22Z)
- An Augmentation-based Model Re-adaptation Framework for Robust Image Segmentation [0.799543372823325]
We propose an Augmentation-based Model Re-adaptation Framework (AMRF) to enhance the generalisation of segmentation models.
By observing segmentation masks from conventional models (FCN and U-Net) and a pre-trained SAM model, we determine a minimal augmentation set that optimally balances training efficiency and model performance.
Our results demonstrate that the fine-tuned FCN surpasses its baseline by 3.29% and 3.02% in cropping accuracy, and 5.27% and 4.04% in classification accuracy on two temporally continuous datasets.
arXiv Detail & Related papers (2024-09-14T21:01:49Z)
- Improved Adaboost Algorithm for Web Advertisement Click Prediction Based on Long Short-Term Memory Networks [2.7959678888027906]
This paper explores an improved Adaboost algorithm based on Long Short-Term Memory Networks (LSTM).
By comparing it with several common machine learning algorithms, the paper analyses the advantages of the new model in ad click prediction.
It is shown that the improved algorithm proposed in this paper performs well in user ad click prediction with an accuracy of 92%.
arXiv Detail & Related papers (2024-08-08T03:27:02Z)
- Exploiting CNNs for Semantic Segmentation with Pascal VOC [0.0]
We present a comprehensive study on semantic segmentation with the Pascal VOC dataset.
We first use a Fully Convolutional Network (FCN) baseline, which achieves 71.31% pixel accuracy and 0.0527 mean IoU.
We analyze its performance and behavior, and subsequently address the issues in the baseline with three improvements.
arXiv Detail & Related papers (2023-04-26T00:40:27Z)
- Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement [68.44100784364987]
We propose a strategy to improve a dataset once such that the accuracy of any model architecture trained on the reinforced dataset is improved at no additional training cost for users.
We create a reinforced version of the ImageNet training dataset, called ImageNet+, as well as reinforced datasets CIFAR-100+, Flowers-102+, and Food-101+.
Models trained with ImageNet+ are more accurate, robust, and calibrated, and transfer well to downstream tasks.
arXiv Detail & Related papers (2023-03-15T23:10:17Z)
- ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders [104.05133094625137]
We propose a fully convolutional masked autoencoder framework and a new Global Response Normalization layer.
This co-design of self-supervised learning techniques and architectural improvement results in a new model family called ConvNeXt V2, which significantly improves the performance of pure ConvNets.
arXiv Detail & Related papers (2023-01-02T18:59:31Z)
- Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations [58.442103936918805]
We show that Attention Mask Consistency (AMC) produces superior visual grounding results compared to previous methods.
AMC is effective, easy to implement, and is general as it can be adopted by any vision-language model.
arXiv Detail & Related papers (2022-06-30T17:55:12Z)
- To be Critical: Self-Calibrated Weakly Supervised Learning for Salient Object Detection [95.21700830273221]
Weakly-supervised salient object detection (WSOD) aims to develop saliency models using image-level annotations.
We propose a self-calibrated training strategy by explicitly establishing a mutual calibration loop between pseudo labels and network predictions.
We prove that even a much smaller dataset with well-matched annotations can help models achieve better performance and generalizability.
arXiv Detail & Related papers (2021-09-04T02:45:22Z)
- With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations [87.72779294717267]
Using the nearest neighbor as a positive in contrastive losses improves performance significantly on ImageNet classification.
We demonstrate empirically that our method is less reliant on complex data augmentations.
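The nearest-neighbor-as-positive idea can be sketched in a few lines. This is an illustrative toy (the function name, tiny 2-D embeddings, and cosine-similarity choice are assumptions), not the paper's training pipeline: instead of contrasting a view only against its own augmented twin, the positive is the most similar embedding in a support set.

```python
import numpy as np

def nn_positive(query, support_set):
    """Return the support-set embedding most similar to `query`.

    Selecting a nearest neighbor as the positive adds semantic variety
    beyond what data augmentation of the same image can provide.
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)
    q, s = normalize(query), normalize(support_set)
    sims = s @ q  # cosine similarity of each support vector to the query
    return support_set[np.argmax(sims)]

support = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
pos = nn_positive(np.array([0.9, 0.1]), support)
print(pos)  # [1. 0.]
```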
arXiv Detail & Related papers (2021-04-29T17:56:08Z)
- Revisiting Batch Normalization for Improving Corruption Robustness [85.20742045853738]
We interpret corruption robustness as a domain shift and propose to rectify batch normalization statistics for improving model robustness.
We find that simply estimating and adapting the BN statistics on a few representation samples, without retraining the model, improves the corruption robustness by a large margin.
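The statistic-adaptation step can be illustrated in plain numpy. This is a minimal sketch under assumptions (the paper adapts BN layers inside full CNNs; function names, the blending `momentum`, and the synthetic shifted data here are illustrative): the stored source-domain mean and variance are replaced by statistics measured on a few target-domain samples, with no weight updates.

```python
import numpy as np

def adapt_bn_stats(source_mean, source_var, target_batch, momentum=1.0):
    """Re-estimate BatchNorm statistics from a few target-domain samples.

    `target_batch` has shape (samples, channels). With momentum=1.0 the
    stored source statistics are discarded entirely; smaller values blend
    source and target statistics.
    """
    batch_mean = target_batch.mean(axis=0)
    batch_var = target_batch.var(axis=0)
    new_mean = (1 - momentum) * source_mean + momentum * batch_mean
    new_var = (1 - momentum) * source_var + momentum * batch_var
    return new_mean, new_var

def batchnorm_infer(x, mean, var, eps=1e-5):
    return (x - mean) / np.sqrt(var + eps)

# Source stats were estimated on clean data; the target domain is shifted:
rng = np.random.default_rng(0)
target = rng.normal(loc=3.0, scale=2.0, size=(32, 8))  # shifted inputs
mean, var = adapt_bn_stats(np.zeros(8), np.ones(8), target)
z = batchnorm_infer(target, mean, var)
print(abs(z.mean()) < 0.1, abs(z.std() - 1.0) < 0.1)  # True True
```

Normalizing with the adapted statistics re-standardizes the shifted activations, which is the intuition behind the robustness gains reported above.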
arXiv Detail & Related papers (2020-10-07T19:56:47Z)
- Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network [6.938261599173859]
We show how to improve the accuracy and robustness of basic CNN models.
Our proposed assembled ResNet-50 shows improvements in top-1 accuracy from 76.3% to 82.78%, mCE from 76.0% to 48.9% and mFR from 57.7% to 32.3%.
Our approach achieved 1st place in the iFood Competition Fine-Grained Visual Recognition at CVPR 2019.
arXiv Detail & Related papers (2020-01-17T12:42:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.