RecSal : Deep Recursive Supervision for Visual Saliency Prediction
- URL: http://arxiv.org/abs/2008.13745v1
- Date: Mon, 31 Aug 2020 17:08:34 GMT
- Title: RecSal : Deep Recursive Supervision for Visual Saliency Prediction
- Authors: Sandeep Mishra, Oindrila Saha
- Abstract summary: Saliency prediction datasets can be used to create more information for each stimulus than just a final aggregate saliency map.
We show that our best method outperforms previous state-of-the-art methods with 50-80% fewer parameters.
- Score: 2.223733768286313
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: State-of-the-art saliency prediction methods focus on improving model architectures or loss functions while training to generate a single target saliency map. However, publicly available saliency prediction datasets can be utilized to create more information for each stimulus than just a final aggregate saliency map. When utilized in a biologically inspired fashion, this information can contribute to better prediction performance without requiring models with a huge number of parameters. In this light, we propose to extract and use the statistics of (a) region-specific saliency and (b) the temporal order of fixations to provide additional context to our network. We show that extra supervision using spatially or temporally sequenced fixations results in better saliency prediction performance. Further, we design novel architectures for utilizing this extra information and show that they achieve superior performance over a base model devoid of extra supervision. We show that our best method outperforms previous state-of-the-art methods with 50-80% fewer parameters. We also show that, unlike prior methods, our models perform consistently well across all evaluation metrics.
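The abstract does not spell out implementation details, but the idea of deriving several supervision targets from raw fixation data can be illustrated with a minimal sketch. The Python snippet below is an assumption-laden illustration, not the authors' code: the function name, fixation format, number of temporal bins, and Gaussian blur width are all hypothetical choices.

```python
# A minimal, hypothetical sketch: build temporally sequenced
# fixation-density maps as extra supervision targets, alongside the
# aggregate saliency map that standard methods train against.
import numpy as np
from scipy.ndimage import gaussian_filter

def temporal_saliency_targets(fixations, height, width, n_bins=3, sigma=15.0):
    """fixations: (N, 3) array of (x, y, t) for one stimulus."""
    fixations = np.asarray(fixations, dtype=float)
    t = fixations[:, 2]
    # Equally populated temporal bins; nudge the last edge so the
    # final fixation falls inside the last bin.
    edges = np.quantile(t, np.linspace(0.0, 1.0, n_bins + 1))
    edges[-1] += 1e-6

    def density(points):
        canvas = np.zeros((height, width), dtype=float)
        xs = np.clip(points[:, 0].astype(int), 0, width - 1)
        ys = np.clip(points[:, 1].astype(int), 0, height - 1)
        canvas[ys, xs] = 1.0
        blurred = gaussian_filter(canvas, sigma)
        return blurred / (blurred.max() + 1e-8)

    per_bin = [density(fixations[(t >= edges[i]) & (t < edges[i + 1])])
               for i in range(n_bins)]
    aggregate = density(fixations)
    return per_bin, aggregate
```

A network could then be supervised on each per-interval map (for example, one per decoder stage) in addition to the aggregate map, which is the kind of extra supervision the abstract refers to; the binning scheme, loss functions, and architectures actually used by the authors are not reproduced here.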
Related papers
- Beyond Accuracy: Ensuring Correct Predictions With Correct Rationales [10.397502254316645]
We propose a two-phase scheme to ensure double-correct predictions.
First, we curate a new dataset that offers structured rationales for visual recognition tasks.
Second, we propose a rationale-informed optimization method to guide the model in disentangling and localizing visual evidence.
arXiv Detail & Related papers (2024-10-31T18:33:39Z)
- A positive feedback method based on F-measure value for Salient Object Detection [1.9249287163937976]
This paper proposes a positive feedback method based on the F-measure value for salient object detection (SOD).
Our proposed method takes an image to be detected and feeds it into several existing models to obtain their respective prediction maps.
Experimental results on five publicly available datasets show that the proposed positive feedback method outperforms 12 recent methods on five evaluation metrics for saliency map prediction; a hedged fusion sketch follows this entry.
arXiv Detail & Related papers (2023-04-28T04:05:13Z)
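The summary above only hints at how the feedback operates. The Python sketch below shows one plausible reading, assumed for illustration rather than taken from the paper: several models' prediction maps are fused with weights that are repeatedly updated from each map's F-measure against the current fused estimate.

```python
# A hypothetical sketch of F-measure-driven positive feedback fusion.
# The reweighting rule, threshold, and iteration count are assumptions
# for illustration, not the procedure described in the paper.
import numpy as np

def f_measure(pred, ref, beta2=0.3, thresh=0.5):
    """F-measure (beta^2 = 0.3, as is common in SOD) against a binarized reference."""
    p = pred >= thresh
    r = ref >= thresh
    tp = np.logical_and(p, r).sum()
    precision = tp / (p.sum() + 1e-8)
    recall = tp / (r.sum() + 1e-8)
    return (1.0 + beta2) * precision * recall / (beta2 * precision + recall + 1e-8)

def positive_feedback_fusion(maps, n_iters=5):
    """maps: list of HxW prediction maps in [0, 1] from existing SOD models."""
    maps = [np.asarray(m, dtype=float) for m in maps]
    weights = np.full(len(maps), 1.0 / len(maps))
    fused = sum(w * m for w, m in zip(weights, maps))
    for _ in range(n_iters):
        # Score each model against the current fused map and feed the
        # scores back as new fusion weights (the positive feedback step).
        scores = np.array([f_measure(m, fused) for m in maps])
        weights = scores / (scores.sum() + 1e-8)
        fused = sum(w * m for w, m in zip(weights, maps))
    return fused
```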
- TempSAL -- Uncovering Temporal Information for Deep Saliency Prediction [64.63645677568384]
We introduce a novel saliency prediction model that learns to output saliency maps in sequential time intervals.
Our approach locally modulates the saliency predictions by combining the learned temporal maps.
Our code will be publicly available on GitHub.
arXiv Detail & Related papers (2023-01-05T22:10:16Z)
- LOPR: Latent Occupancy PRediction using Generative Models [49.15687400958916]
LiDAR-generated occupancy grid maps (L-OGMs) offer a robust bird's-eye-view scene representation.
We propose a framework that decouples occupancy prediction into representation learning and prediction within the learned latent space.
arXiv Detail & Related papers (2022-10-03T22:04:00Z)
- Improved Fine-tuning by Leveraging Pre-training Data: Theory and Practice [52.11183787786718]
Fine-tuning a pre-trained model on the target data is widely used in many deep learning applications.
Recent studies have empirically shown that training from scratch can achieve final performance no worse than this pre-training strategy.
We propose a novel selection strategy to select a subset from pre-training data to help improve the generalization on the target task.
arXiv Detail & Related papers (2021-11-24T06:18:32Z)
- Confidence Adaptive Anytime Pixel-Level Recognition [86.75784498879354]
Anytime inference requires a model to make a progression of predictions which might be halted at any time.
We propose the first unified and end-to-end approach for anytime pixel-level recognition.
arXiv Detail & Related papers (2021-04-01T20:01:57Z)
- Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration, and how some of the lower-scoring models on standard benchmarks perform on par with the best-performing models when trained on the same data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
- Better Fine-Tuning by Reducing Representational Collapse [77.44854918334232]
Existing approaches for fine-tuning pre-trained language models have been shown to be unstable.
We present a method rooted in trust region theory that replaces previously used adversarial objectives with parametric noise.
We show it is less prone to representational collapse; the pre-trained models maintain more generalizable representations every time they are fine-tuned.
arXiv Detail & Related papers (2020-08-06T02:13:16Z)
- SmaAt-UNet: Precipitation Nowcasting using a Small Attention-UNet Architecture [5.28539620288341]
We show that it is possible to produce an accurate precipitation nowcast using a data-driven neural network approach.
We evaluate our approaches on real-life datasets using precipitation maps from the region of the Netherlands and binary images of cloud coverage over France.
arXiv Detail & Related papers (2020-07-08T20:33:10Z)