Fourier Disentangled Space-Time Attention for Aerial Video Recognition
- URL: http://arxiv.org/abs/2203.10694v1
- Date: Mon, 21 Mar 2022 01:24:53 GMT
- Title: Fourier Disentangled Space-Time Attention for Aerial Video Recognition
- Authors: Divya Kothandaraman, Tianrui Guan, Xijun Wang, Sean Hu, Ming Lin,
Dinesh Manocha
- Abstract summary: We present an algorithm, Fourier Activity Recognition (FAR), for UAV video activity recognition.
Our formulation uses a novel Fourier object disentanglement method to innately separate out the human agent from the background.
We have evaluated our approach on multiple UAV datasets including UAV Human RGB, UAV Human Night, Drone Action, and NEC Drone.
- Score: 54.80846279175762
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an algorithm, Fourier Activity Recognition (FAR), for UAV video
activity recognition. Our formulation uses a novel Fourier object
disentanglement method to innately separate out the human agent (which is
typically small) from the background. Our disentanglement technique operates in
the frequency domain to characterize the extent of temporal change of spatial
pixels, and exploits convolution-multiplication properties of the Fourier transform
to map this representation to the corresponding object-background entangled
features obtained from the network. To encapsulate contextual information and
long-range space-time dependencies, we present a novel Fourier Attention
algorithm, which emulates the benefits of self-attention by modeling the
weighted outer product in the frequency domain. Our Fourier attention
formulation uses far fewer computations than self-attention. We have evaluated
our approach on multiple UAV datasets including UAV Human RGB, UAV Human Night,
Drone Action, and NEC Drone. We demonstrate a relative improvement of 8.02% to
38.69% in top-1 accuracy over prior works, while running up to 3 times faster.
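The abstract describes characterizing "the extent of temporal change of spatial pixels" in the frequency domain. The sketch below illustrates that general idea only and is not the authors' implementation: it takes an FFT along the time axis of a grayscale clip, discards the zero-frequency (static) component, and sums the remaining energy per pixel. The function name `temporal_change_map`, the synthetic clip, and the normalization are all assumptions made for illustration; FAR itself maps such a representation onto network features, which is not shown here.

```python
import numpy as np


def temporal_change_map(video: np.ndarray) -> np.ndarray:
    """Per-pixel temporal-change score in [0, 1] for a (T, H, W) clip.

    Hypothetical helper: static pixels only contribute to the zero-frequency
    (DC) bin of a temporal FFT, so the energy left after dropping that bin
    indicates how much a pixel changes over time.
    """
    spectrum = np.fft.rfft(video, axis=0)   # FFT along the time axis
    energy = np.abs(spectrum[1:]) ** 2      # drop the DC bin (static content)
    change = energy.sum(axis=0)             # total non-DC energy per pixel
    peak = change.max()
    return change / peak if peak > 0 else change


# Synthetic clip (assumed data): a fixed random background with a small
# 4x4 bright block sliding to the right along rows 10..13.
T, H, W = 16, 32, 32
rng = np.random.default_rng(0)
background = rng.random((H, W))
video = np.repeat(background[None], T, axis=0)
for t in range(T):
    video[t, 10:14, t:t + 4] += 1.0

mask = temporal_change_map(video)
```

In this toy clip the map is large along the block's path and close to zero on the static background, which is the kind of separation between a small moving agent and its surroundings that the abstract refers to.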
Related papers
- Neural Fourier Modelling: A Highly Compact Approach to Time-Series Analysis [9.969451740838418]
We introduce Neural Fourier Modelling (NFM), a compact yet powerful solution for time-series analysis.
NFM is grounded in two key properties of the Fourier transform (FT): (i) the ability to model finite-length time series as functions in the Fourier domain, and (ii) the capacity for data manipulation within the Fourier domain.
NFM achieves state-of-the-art performance on a wide range of tasks, including challenging time-series scenarios with previously unseen sampling rates at test time.
arXiv Detail & Related papers (2024-10-07T02:39:55Z)
- Triple-domain Feature Learning with Frequency-aware Memory Enhancement for Moving Infrared Small Target Detection [12.641645684148136]
Infrared small target detection presents significant challenges due to small target sizes and low contrast against backgrounds.
We propose a new Triple-domain Strategy (Tridos) with frequency-aware memory enhancement on the spatio-temporal domain for infrared small target detection.
Inspired by the human visual system, our memory enhancement is designed to capture the spatial relations of infrared targets among video frames.
arXiv Detail & Related papers (2024-06-11T05:21:30Z) - Frequency-Aware Deepfake Detection: Improving Generalizability through
Frequency Space Learning [81.98675881423131]
This research addresses the challenge of developing a universal deepfake detector that can effectively identify unseen deepfake images.
Existing frequency-based paradigms have relied on frequency-level artifacts introduced during the up-sampling in GAN pipelines to detect forgeries.
We introduce a novel frequency-aware approach called FreqNet, centered around frequency domain learning, specifically designed to enhance the generalizability of deepfake detectors.
arXiv Detail & Related papers (2024-03-12T01:28:00Z)
- Neural Fourier Filter Bank [18.52741992605852]
We present a novel method to provide efficient and highly detailed reconstructions.
Inspired by wavelets, we learn a neural field that decomposes the signal both spatially and frequency-wise.
arXiv Detail & Related papers (2022-12-04T03:45:08Z)
- Transform Once: Efficient Operator Learning in Frequency Domain [69.74509540521397]
We study deep neural networks designed to harness the structure in the frequency domain for efficient learning of long-range correlations in space or time.
This work introduces a blueprint for frequency domain learning through a single transform: transform once (T1).
arXiv Detail & Related papers (2022-11-26T01:56:05Z)
- Deep Fourier Up-Sampling [100.59885545206744]
Up-sampling in the Fourier domain is more challenging as it does not follow such a local property.
We propose a theoretically sound Deep Fourier Up-Sampling (FourierUp) to solve these issues.
arXiv Detail & Related papers (2022-10-11T06:17:31Z)
- Differentiable Frequency-based Disentanglement for Aerial Video Action Recognition [56.91538445510214]
We present a learning algorithm for human activity recognition in videos.
Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras.
We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
arXiv Detail & Related papers (2022-09-15T22:16:52Z)
- Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains [69.62456877209304]
We show that passing input points through a simple Fourier feature mapping enables a multilayer perceptron to learn high-frequency functions.
Results shed light on advances in computer vision and graphics that achieve state-of-the-art results.
arXiv Detail & Related papers (2020-06-18T17:59:11Z)
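The last entry above refers to passing input points through a simple Fourier feature mapping before a multilayer perceptron. As a minimal sketch of such a mapping, assuming the commonly used random form gamma(v) = [cos(2*pi*B v), sin(2*pi*B v)] with a Gaussian frequency matrix B: the function name `fourier_features`, the frequency scale of 10.0, and the sizes below are illustrative assumptions, not values taken from the entry.

```python
import numpy as np


def fourier_features(v: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Map points v of shape (N, d) to features of shape (N, 2 * m).

    B has shape (m, d) and holds m frequency vectors; each point is
    projected onto every frequency and encoded with a cosine and a sine.
    """
    proj = 2.0 * np.pi * v @ B.T                    # (N, m) projections
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)


rng = np.random.default_rng(0)
B = rng.normal(scale=10.0, size=(64, 2))    # 64 random frequencies, 2-D inputs
points = rng.random((5, 2))                 # five sample coordinates in [0, 1)^2
features = fourier_features(points, B)      # would be fed to an MLP
```

The scale of B controls which frequencies the downstream network can represent easily, so in practice it would be treated as a tunable hyperparameter.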
This list is automatically generated from the titles and abstracts of the papers on this site.