Related papers: Fourier-based Action Recognition for Wildlife Behavior Quantification with Event Cameras

URL: http://arxiv.org/abs/2410.06698v1
Date: Wed, 9 Oct 2024 09:06:37 GMT
Title: Fourier-based Action Recognition for Wildlife Behavior Quantification with Event Cameras
Authors: Friedhelm Hamann, Suman Ghosh, Ignacio Juarez Martinez, Tom Hart, Alex Kacelnik, Guillermo Gallego
Abstract summary: We propose approaches to action recognition based on the Fourier Transform. In particular, we apply our approaches to a recent dataset of breeding penguins annotated for "ecstatic display". We find that our approaches are both simple and effective, producing slightly lower results than a deep neural network (DNN).
Score: 9.107129038623242
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Event cameras are novel bio-inspired vision sensors that measure pixel-wise brightness changes asynchronously instead of images at a given frame rate. They offer promising advantages, namely a high dynamic range, low latency, and minimal motion blur. Modern computer vision algorithms often rely on artificial neural network approaches, which require image-like representations of the data and cannot fully exploit the characteristics of event data. We propose approaches to action recognition based on the Fourier Transform. The approaches are intended to recognize oscillating motion patterns commonly present in nature. In particular, we apply our approaches to a recent dataset of breeding penguins annotated for "ecstatic display", a behavior where the observed penguins flap their wings at a certain frequency. We find that our approaches are both simple and effective, producing slightly lower results than a deep neural network (DNN) while relying just on a tiny fraction of the parameters compared to the DNN (five orders of magnitude fewer parameters). They work well despite the uncontrolled, diverse data present in the dataset. We hope this work opens a new perspective on event-based processing and action recognition.
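As a rough illustration of the idea in the abstract, detecting an oscillating motion from the frequency content of an event stream, the sketch below bins event timestamps into an event-rate signal and measures how much of its spectral energy falls in a chosen frequency band. It is not the authors' method: the function name `band_energy_ratio`, the 4 Hz flapping frequency, the 3-5 Hz band, the 10 ms bin width, and the synthetic event streams are all assumptions made for this example.

```python
import numpy as np


def band_energy_ratio(timestamps, duration, bin_width, f_lo, f_hi):
    """Fraction of non-DC spectral energy of the event rate inside [f_lo, f_hi] Hz.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n_bins = int(round(duration / bin_width))
    # Event-rate signal: number of events falling into each time bin.
    counts, _ = np.histogram(timestamps, bins=n_bins, range=(0.0, duration))
    signal = counts - counts.mean()  # remove the constant (DC) component
    # One-sided power spectrum and the frequency of each spectral bin.
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(n_bins, d=bin_width)
    total = spectrum[1:].sum()
    if total == 0:
        return 0.0
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return float(spectrum[band].sum() / total)


# Synthetic data (assumed, not from the dataset): events whose rate is
# modulated at 4 Hz, and a control stream with a constant rate.
rng = np.random.default_rng(0)
duration = 4.0  # seconds
candidates = rng.uniform(0.0, duration, 20000)
accept = rng.uniform(size=candidates.size) < 0.5 * (1 + np.sin(2 * np.pi * 4.0 * candidates))
oscillating = candidates[accept]
uniform = rng.uniform(0.0, duration, 10000)

ratio_osc = band_energy_ratio(oscillating, duration, 0.01, 3.0, 5.0)
ratio_uni = band_energy_ratio(uniform, duration, 0.01, 3.0, 5.0)
print(f"oscillating: {ratio_osc:.2f}, uniform: {ratio_uni:.2f}")
```

A threshold on such a ratio would give a classifier with a single tunable parameter, which is in the spirit of the low parameter count the abstract emphasizes, though the paper's actual features and decision rule may differ.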

Related papers

SpikMamba: When SNN meets Mamba in Event-based Human Action Recognition [13.426390494116776]
Human action recognition (HAR) plays a key role in various applications such as video analysis, surveillance, autonomous driving, robotics, and healthcare. Most HAR algorithms are developed from RGB images, which capture detailed visual information. Event cameras offer a promising solution by capturing scene brightness changes sparsely at the pixel level, without capturing full images.
arXiv Detail & Related papers (2024-10-22T07:00:43Z)
Learning Robust Multi-Scale Representation for Neural Radiance Fields from Unposed Images [65.41966114373373]
We present an improved solution to the neural image-based rendering problem in computer vision. The proposed approach could synthesize a realistic image of the scene from a novel viewpoint at test time.
arXiv Detail & Related papers (2023-11-08T08:18:23Z)
Neuromorphic Imaging and Classification with Graph Learning [11.882239213276392]
Bio-inspired neuromorphic cameras asynchronously record pixel brightness changes and generate sparse event streams. Due to the multidimensional address-event structure, most existing vision algorithms cannot properly handle asynchronous event streams. We propose a new graph representation of the event data and couple it with a Graph Transformer to perform accurate neuromorphic classification.
arXiv Detail & Related papers (2023-09-27T12:58:18Z)
EventTransAct: A video transformer-based framework for Event-camera based action recognition [52.537021302246664]
Event cameras offer new opportunities compared to standard action recognition in RGB videos. In this study, we employ a computationally efficient model, namely the video transformer network (VTN), which initially acquires spatial embeddings per event-frame. In order to better adopt the VTN for the sparse and fine-grained nature of event data, we design Event-Contrastive Loss ($\mathcal{L}_{EC}$) and event-specific augmentations.
arXiv Detail & Related papers (2023-08-25T23:51:07Z)
Inverting the Imaging Process by Learning an Implicit Camera Model [73.81635386829846]
This paper proposes a novel implicit camera model which represents the physical imaging process of a camera as a deep neural network. We demonstrate the power of this new implicit camera model on two inverse imaging tasks.
arXiv Detail & Related papers (2023-04-25T11:55:03Z)
NEWTON: Neural View-Centric Mapping for On-the-Fly Large-Scale SLAM [51.21564182169607]
Newton is a view-centric mapping method that dynamically constructs neural fields based on run-time observation. Our method enables camera pose updates using loop closures and scene boundary updates by representing the scene with multiple neural fields. The experimental results demonstrate the superior performance of our method over existing world-centric neural field-based SLAM systems.
arXiv Detail & Related papers (2023-03-23T20:22:01Z)
Highly Efficient 3D Human Pose Tracking from Events with Spiking Spatiotemporal Transformer [23.15179173446486]
We introduce the first sparse Spiking Neural Networks (SNNs) framework for 3D human pose tracking based solely on events. Our approach eliminates the need to convert sparse data to dense formats or incorporate additional images, thereby fully exploiting the innate sparsity of input events. Empirical experiments demonstrate the superiority of our approach over existing state-of-the-art (SOTA) ANN-based methods, requiring only 19.1% FLOPs and 3.6% cost energy.
arXiv Detail & Related papers (2023-03-16T22:56:12Z)
Optical flow estimation from event-based cameras and spiking neural networks [0.4899818550820575]
Event-based sensors are an excellent fit for Spiking Neural Networks (SNNs). We propose a U-Net-like SNN which, after supervised training, is able to make dense optical flow estimations. Thanks to separable convolutions, we have been able to develop a light model that can nonetheless yield reasonably accurate optical flow estimates.
arXiv Detail & Related papers (2023-02-13T16:17:54Z)
Neural Human Performer: Learning Generalizable Radiance Fields for Human Performance Rendering [34.80975358673563]
We propose a novel approach that learns generalizable neural radiance fields based on a parametric human body model for robust performance capture. Experiments on the ZJU-MoCap and AIST datasets show that our method significantly outperforms recent generalizable NeRF methods on unseen identities and poses.
arXiv Detail & Related papers (2021-09-15T17:32:46Z)
Learning Monocular Dense Depth from Events [53.078665310545745]
Event cameras produce brightness changes in the form of a stream of asynchronous events instead of intensity frames. Recent learning-based approaches have been applied to event-based data, such as monocular depth prediction. We propose a recurrent architecture to solve this task and show significant improvement over standard feed-forward methods.
arXiv Detail & Related papers (2020-10-16T12:36:23Z)
Self-Supervised Linear Motion Deblurring [112.75317069916579]
Deep convolutional neural networks are state-of-the-art for image deblurring. We present a differentiable reblur model for self-supervised motion deblurring. Our experiments demonstrate that self-supervised single image deblurring is really feasible.
arXiv Detail & Related papers (2020-02-10T20:15:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.