Utilising Visual Attention Cues for Vehicle Detection and Tracking
- URL: http://arxiv.org/abs/2008.00106v1
- Date: Fri, 31 Jul 2020 23:00:13 GMT
- Title: Utilising Visual Attention Cues for Vehicle Detection and Tracking
- Authors: Feiyan Hu, Venkatesh G M, Noel E. O'Connor, Alan F. Smeaton and
Suzanne Little
- Abstract summary: We explore possible ways to use visual attention (saliency) for object detection and tracking.
We propose a neural network that can simultaneously detect objects and generate objectness and subjectness maps to save computational power.
The experiments are conducted on KITTI and DETRAC datasets.
- Score: 13.2351348789193
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Advanced Driver-Assistance Systems (ADAS) have been attracting attention from
many researchers. Vision-based sensors come closest to emulating a human
driver's visual behavior while driving. In this paper, we explore possible ways
to use visual attention (saliency) for object detection and tracking. We
investigate: 1) How a visual attention map such as a \emph{subjectness}
attention or saliency map and an \emph{objectness} attention map can facilitate
region proposal generation in a 2-stage object detector; 2) How a visual
attention map can be used for tracking multiple objects. We propose a neural
network that can simultaneously detect objects and generate objectness and
subjectness maps to save computational power. We further exploit the visual
attention map during tracking using a sequential Monte Carlo probability
hypothesis density (PHD) filter. The experiments are conducted on KITTI and
DETRAC datasets. The use of visual attention and hierarchical features has
shown a considerable improvement of $\approx$8\% in object detection, which
effectively increased tracking performance by $\approx$4\% on the KITTI dataset.
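The tracking side of the abstract hinges on re-weighting Monte Carlo samples by the visual attention map. As a rough illustration only (not the paper's full PHD filter, which also models target birth and death), the sketch below shows a single bootstrap particle-filter update in which a hypothetical subjectness map serves as the measurement likelihood; the map, its shape, and the Gaussian peak are all invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D attention (subjectness) map over the image plane,
# e.g. as produced by the detector's attention head. A Gaussian bump
# at (x=40, y=30) stands in for a salient vehicle.
H, W = 60, 80
ys, xs = np.mgrid[0:H, 0:W]
attention = np.exp(-(((xs - 40) ** 2) / 200 + ((ys - 30) ** 2) / 100))

# Particles of a bootstrap sequential Monte Carlo filter: each particle
# is an (x, y) hypothesis about a target location, initially uniform.
n_particles = 500
particles = np.column_stack([
    rng.uniform(0, W, n_particles),
    rng.uniform(0, H, n_particles),
])
weights = np.full(n_particles, 1.0 / n_particles)

def update(particles, weights, attention):
    """Re-weight each particle by the attention value at its location."""
    xs_i = np.clip(particles[:, 0].astype(int), 0, attention.shape[1] - 1)
    ys_i = np.clip(particles[:, 1].astype(int), 0, attention.shape[0] - 1)
    likelihood = attention[ys_i, xs_i] + 1e-12  # avoid an all-zero weight vector
    weights = weights * likelihood
    return weights / weights.sum()

def resample(particles, weights, rng):
    """Multinomial resampling: duplicate high-weight particles."""
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

weights = update(particles, weights, attention)
particles, weights = resample(particles, weights, rng)
estimate = particles.mean(axis=0)  # posterior mean drifts toward the salient peak
```

Resampling concentrates particles on salient regions, which is the intuition behind exploiting the attention map during tracking.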
Related papers
- OAT: Object-Level Attention Transformer for Gaze Scanpath Prediction [0.2796197251957245]
This paper introduces the Object-level Attention Transformer (OAT)
OAT predicts human scanpaths as they search for a target object within a cluttered scene of distractors.
We evaluate OAT on the Amazon book cover dataset and a new dataset for visual search that we collected.
arXiv Detail & Related papers (2024-07-18T09:33:17Z)
- DisPlacing Objects: Improving Dynamic Vehicle Detection via Visual Place Recognition under Adverse Conditions [29.828201168816243]
We investigate whether a prior map can be leveraged to aid in the detection of dynamic objects in a scene without the need for a 3D map.
We contribute an algorithm which refines an initial set of candidate object detections and produces a refined subset of highly accurate detections using a prior map.
arXiv Detail & Related papers (2023-06-30T10:46:51Z)
- SalienDet: A Saliency-based Feature Enhancement Algorithm for Object Detection for Autonomous Driving [160.57870373052577]
We propose a saliency-based OD algorithm (SalienDet) to detect unknown objects.
Our SalienDet utilizes a saliency-based algorithm to enhance image features for object proposal generation.
We design a dataset relabeling approach to differentiate unknown objects from all objects in the training sample set to achieve Open-World Detection.
arXiv Detail & Related papers (2023-05-11T16:19:44Z)
- Track without Appearance: Learn Box and Tracklet Embedding with Local and Global Motion Patterns for Vehicle Tracking [45.524183249765244]
Vehicle tracking is an essential task in the multi-object tracking (MOT) field.
In this paper, we explore the significance of motion patterns for vehicle tracking without appearance information.
We propose a novel approach that tackles the long-term association problem by fully exploiting motion information alone.
arXiv Detail & Related papers (2021-08-13T02:27:09Z)
- Tracking by Joint Local and Global Search: A Target-aware Attention based Approach [63.50045332644818]
We propose a novel target-aware attention mechanism (termed TANet) to conduct joint local and global search for robust tracking.
Specifically, we extract features from the target object patch and consecutive video frames, then feed them into a decoder network to generate target-aware global attention maps.
In the tracking procedure, we integrate the target-aware attention with multiple trackers by exploring candidate search regions for robust tracking.
arXiv Detail & Related papers (2021-06-09T06:54:15Z)
- Detecting Invisible People [58.49425715635312]
We re-purpose tracking benchmarks and propose new metrics for the task of detecting invisible objects.
We demonstrate that current detection and tracking systems perform dramatically worse on this task.
We then build dynamic models that explicitly reason in 3D, making use of observations produced by state-of-the-art monocular depth estimation networks.
arXiv Detail & Related papers (2020-12-15T16:54:45Z)
- DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of temporal and spatial information.
We show that the proposed method outperforms state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
- MVLidarNet: Real-Time Multi-Class Scene Understanding for Autonomous Driving Using Multiple Views [60.538802124885414]
We present Multi-View LidarNet (MVLidarNet), a two-stage deep neural network for multi-class object detection and drivable space segmentation.
MVLidarNet is able to detect and classify objects while simultaneously determining the drivable space using a single LiDAR scan as input.
We show results on both KITTI and a much larger internal dataset, thus demonstrating the method's ability to scale by an order of magnitude.
arXiv Detail & Related papers (2020-06-09T21:28:17Z)
- Robust Visual Object Tracking with Two-Stream Residual Convolutional Networks [62.836429958476735]
We propose a Two-Stream Residual Convolutional Network (TS-RCN) for visual tracking.
Our TS-RCN can be integrated with existing deep learning based visual trackers.
To further improve the tracking performance, we adopt a "wider" residual network ResNeXt as its feature extraction backbone.
arXiv Detail & Related papers (2020-05-13T19:05:42Z)
- SpotNet: Self-Attention Multi-Task Network for Object Detection [11.444576186559487]
We produce foreground/background segmentation labels in a semi-supervised way, using background subtraction or optical flow.
We use those segmentation maps inside the network as a self-attention mechanism to weight the feature map used to produce the bounding boxes.
We show that by using this method, we obtain a significant mAP improvement on two traffic surveillance datasets.
arXiv Detail & Related papers (2020-02-13T14:43:24Z)
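SpotNet's weighting step, as summarized above, amounts to an element-wise product between the backbone feature map and a foreground attention map. A minimal sketch with invented shapes, where a hand-made binary mask stands in for the segmentation map derived from background subtraction or optical flow:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical backbone feature map: (channels, height, width).
features = rng.standard_normal((16, 32, 32))

# Foreground/background attention map in [0, 1]; in SpotNet this would
# come from semi-supervised segmentation labels, here it is hand-made.
attention = np.zeros((32, 32))
attention[8:24, 8:24] = 1.0  # pretend the moving vehicle sits here

# Self-attention weighting: multiply every feature channel by the
# attention map so background responses are suppressed before the
# bounding-box head sees the features.
weighted = features * attention[None, :, :]
```

Broadcasting the single-channel map across all feature channels keeps the operation cheap while focusing the detector on moving foreground regions.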
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.