Tidying Deep Saliency Prediction Architectures
- URL: http://arxiv.org/abs/2003.04942v1
- Date: Tue, 10 Mar 2020 19:34:49 GMT
- Title: Tidying Deep Saliency Prediction Architectures
- Authors: Navyasri Reddy, Samyak Jain, Pradeep Yarlagadda, Vineet Gandhi
- Abstract summary: In this paper, we identify four key components of saliency models, i.e., input features, multi-level integration, readout architecture, and loss functions.
We propose two novel end-to-end architectures, SimpleNet and MDNSal, which are neater, more minimal, and more interpretable, and which achieve state-of-the-art performance on public saliency benchmarks.
- Score: 6.613005108411055
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning computational models for visual attention (saliency estimation) is
an effort to inch machines/robots closer to human visual cognitive abilities.
Data-driven efforts have dominated the landscape since the introduction of deep
neural network architectures. In deep learning research, the choices in
architecture design are often empirical and frequently lead to more complex
models than necessary. The complexity, in turn, makes it harder to meet
application requirements. In this paper, we identify four key components of saliency
models, i.e., input features, multi-level integration, readout architecture,
and loss functions. We review the existing state-of-the-art models on these
four components and propose novel and simpler alternatives. As a result, we
propose two novel end-to-end architectures, SimpleNet and MDNSal, which are
neater, more minimal, and more interpretable, and which achieve
state-of-the-art performance on public saliency benchmarks. SimpleNet is an
optimized encoder-decoder architecture that brings notable performance gains
on the SALICON dataset (the largest saliency benchmark). MDNSal is a
parametric model that directly predicts the parameters of a Gaussian mixture
model (GMM) and aims to bring more interpretability to the prediction maps.
The proposed saliency models run inference at 25 fps, making them suitable
for real-time applications. Code and
pre-trained models are available at https://github.com/samyak0210/saliency.
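As a rough illustration of the two readout styles the abstract contrasts, here is a hedged PyTorch sketch that pairs a generic ImageNet backbone with (a) an encoder-decoder head that predicts a dense saliency map, in the spirit of SimpleNet, and (b) a parametric head that predicts 2-D GMM parameters, in the spirit of MDNSal. The backbone choice, layer sizes, and mixture component count are illustrative assumptions rather than the paper's exact configuration; the reference implementation lives in the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50


class SimpleNetSketch(nn.Module):
    """Encoder-decoder readout: predict a dense saliency map directly."""

    def __init__(self):
        super().__init__()
        backbone = resnet50(weights="IMAGENET1K_V1")  # illustrative backbone choice
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])  # drop avgpool/fc
        self.decoder = nn.Sequential(  # upsampling stages are illustrative, not the paper's exact design
            nn.Conv2d(2048, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(256, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 1, 1),
        )

    def forward(self, x):
        s = torch.sigmoid(self.decoder(self.encoder(x)))
        return s / s.sum(dim=(2, 3), keepdim=True)  # normalize the map to a distribution


class MDNSalSketch(nn.Module):
    """Parametric readout: predict 2-D GMM parameters instead of a pixel map."""

    def __init__(self, k=16):  # k mixture components (assumed value)
        super().__init__()
        backbone = resnet50(weights="IMAGENET1K_V1")
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])  # keep global pooling
        self.head = nn.Linear(2048, k * 5)  # per component: mean (2) + diag var (2) + weight (1)
        self.k = k

    def forward(self, x):
        p = self.head(self.encoder(x).flatten(1)).view(-1, self.k, 5)
        mu = torch.sigmoid(p[..., 0:2])        # means in normalized image coordinates
        var = F.softplus(p[..., 2:4]) + 1e-4   # strictly positive diagonal variances
        pi = torch.softmax(p[..., 4], dim=-1)  # mixture weights summing to 1
        return mu, var, pi
```

A map-predicting head of this kind is commonly trained with distribution losses such as KL divergence against the ground-truth saliency map, while a parametric head can be trained with the negative log-likelihood of ground-truth fixations under the predicted mixture; the paper's actual loss choices are among the four components it analyzes.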
Related papers
- Neural Architecture Codesign for Fast Physics Applications [0.8692847090818803]
We develop a pipeline to streamline neural architecture codesign for physics applications.
We employ neural architecture search and network compression in a two-stage approach to discover hardware-efficient models.
arXiv Detail & Related papers (2025-01-09T19:00:03Z)
- Adaptable Embeddings Network (AEN) [49.1574468325115]
We introduce Adaptable Embeddings Networks (AEN), a novel dual-encoder architecture using Kernel Density Estimation (KDE).
AEN allows for runtime adaptation of classification criteria without retraining and is non-autoregressive.
The architecture's ability to preprocess and cache condition embeddings makes it ideal for edge computing applications and real-time monitoring systems.
arXiv Detail & Related papers (2024-11-21T02:15:52Z)
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing shifts data analysis to the edge, but existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Computer Vision Model Compression Techniques for Embedded Systems: A Survey [75.38606213726906]
This paper covers the main model compression techniques applied to computer vision tasks.
We present the characteristics of compression subareas, compare different approaches, and discuss how to choose the best technique.
We also share code to assist researchers and new practitioners in overcoming initial implementation challenges.
arXiv Detail & Related papers (2024-08-15T16:41:55Z)
- NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals [58.83169560132308]
We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of very large neural networks.
NNsight is an open-source system that extends PyTorch to introduce deferred remote execution.
NDIF is a scalable inference service that executes NNsight requests, allowing users to share GPU resources and pretrained models.
arXiv Detail & Related papers (2024-07-18T17:59:01Z)
- NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z)
- FlowNAS: Neural Architecture Search for Optical Flow Estimation [65.44079917247369]
We propose a neural architecture search method named FlowNAS to automatically find a better encoder architecture for the flow estimation task.
Experimental results show that the discovered architecture with the weights inherited from the super-network achieves 4.67% F1-all error on KITTI.
arXiv Detail & Related papers (2022-07-04T09:05:25Z)
- Convolution Neural Network Hyperparameter Optimization Using Simplified Swarm Optimization [2.322689362836168]
Convolutional Neural Networks (CNNs) are widely used in computer vision.
Finding a network architecture with better performance is not easy.
arXiv Detail & Related papers (2021-03-06T00:23:27Z)
- A Compact Deep Architecture for Real-time Saliency Prediction [42.58396452892243]
Saliency models aim to imitate the attention mechanism in the human visual system.
Deep models have a large number of parameters, which makes them less suitable for real-time applications.
Here we propose a compact yet fast model for real-time saliency prediction.
arXiv Detail & Related papers (2020-08-30T17:47:16Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)
- Computation on Sparse Neural Networks: an Inspiration for Future Hardware [20.131626638342706]
We describe the current status of research on the computation of sparse neural networks.
We discuss how model accuracy is influenced by the number of weight parameters and the structure of the model.
We show that, for practically complicated problems, it is more beneficial to search for large, sparse models in the weight-dominated region.
arXiv Detail & Related papers (2020-04-24T19:13:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.