Frequency-aware Event Cloud Network
- URL: http://arxiv.org/abs/2412.20803v1
- Date: Mon, 30 Dec 2024 08:53:57 GMT
- Title: Frequency-aware Event Cloud Network
- Authors: Hongwei Ren, Fei Ma, Xiaopeng Lin, Yuetong Fang, Hongxiang Huang, Yulong Huang, Yue Zhou, Haotian Fu, Ziyi Yang, Fei Richard Yu, Bojun Cheng,
- Abstract summary: We propose a frequency-aware network named FECNet that leverages Event Cloud representations.
FECNet fully utilizes 2S-1T-1P Event Cloud by innovating the event-based Group and Sampling module.
We conducted extensive experiments on event-based object classification, action recognition, and human pose estimation tasks.
- Score: 22.41905416371072
- License:
- Abstract: Event cameras are biologically inspired sensors that emit events asynchronously with remarkable temporal resolution, garnering significant attention from both industry and academia. Mainstream methods favor frame and voxel representations, which reach a satisfactory performance while introducing time-consuming transformation, bulky models, and sacrificing fine-grained temporal information. Alternatively, Point Cloud representation demonstrates promise in addressing the mentioned weaknesses, but it ignores the polarity information, and its models have limited proficiency in abstracting long-term events' features. In this paper, we propose a frequency-aware network named FECNet that leverages Event Cloud representations. FECNet fully utilizes 2S-1T-1P Event Cloud by innovating the event-based Group and Sampling module. To accommodate the long sequence events from Event Cloud, FECNet embraces feature extraction in the frequency domain via the Fourier transform. This approach substantially extinguishes the explosion of Multiply Accumulate Operations (MACs) while effectively abstracting spatial-temporal features. We conducted extensive experiments on event-based object classification, action recognition, and human pose estimation tasks, and the results substantiate the effectiveness and efficiency of FECNet.
Related papers
- FE-UNet: Frequency Domain Enhanced U-Net with Segment Anything Capability for Versatile Image Segmentation [50.9040167152168]
We experimentally quantify the contrast sensitivity function of CNNs and compare it with that of the human visual system.
We propose the Wavelet-Guided Spectral Pooling Module (WSPM) to enhance and balance image features across the frequency domain.
To further emulate the human visual system, we introduce the Frequency Domain Enhanced Receptive Field Block (FE-RFB)
We develop FE-UNet, a model that utilizes SAM2 as its backbone and incorporates Hiera-Large as a pre-trained block.
arXiv Detail & Related papers (2025-02-06T07:24:34Z) - Event-based Motion Deblurring via Multi-Temporal Granularity Fusion [5.58706910566768]
Event camera, a bio-inspired sensor offering continuous visual information could enhance the deblurring performance.
Existing event-based image deblurring methods usually utilize voxel-based event representations.
We introduce point cloud-based event representation into the image deblurring task and propose a Multi-Temporal Granularity Network (MTGNet)
It combines the spatially dense but temporally coarse-grained voxel-based event representation and the temporally fine-grained but spatially sparse point cloud-based event.
arXiv Detail & Related papers (2024-12-16T15:20:54Z) - EventCrab: Harnessing Frame and Point Synergy for Event-based Action Recognition and Beyond [61.10181853363728]
Event-based Action Recognition (EAR) possesses the advantages of high-temporal and privacy preservation compared with traditional action recognition.
We present EventCrab, a framework that adeptly integrates the "lighter" frame-specific networks for dense event frames with the "heavier" point-specific networks for sparse event points.
Experiments on four datasets demonstrate the significant performance of our proposed EventCrab.
arXiv Detail & Related papers (2024-11-27T13:28:57Z) - Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba [11.400397931501338]
Event cameras efficiently detect changes in ambient light with low latency and high dynamic range while consuming minimal power.
Most current approach to processing event data often involves converting it into frame-based representations.
Point Cloud is a popular representation for 3D processing and is better suited to match the sparse and asynchronous nature of the event camera.
We propose EventMamba, an efficient and effective Point Cloud framework that achieves competitive results even compared to the state-of-the-art (SOTA) frame-based method.
arXiv Detail & Related papers (2024-05-09T21:47:46Z) - MambaPupil: Bidirectional Selective Recurrent model for Event-based Eye tracking [50.26836546224782]
Event-based eye tracking has shown great promise with the high temporal resolution and low redundancy.
The diversity and abruptness of eye movement patterns, including blinking, fixating, saccades, and smooth pursuit, pose significant challenges for eye localization.
This paper proposes a bidirectional long-term sequence modeling and time-varying state selection mechanism to fully utilize contextual temporal information.
arXiv Detail & Related papers (2024-04-18T11:09:25Z) - Fast Window-Based Event Denoising with Spatiotemporal Correlation
Enhancement [85.66867277156089]
We propose window-based event denoising, which simultaneously deals with a stack of events.
In spatial domain, we choose maximum a posteriori (MAP) to discriminate real-world event and noise.
Our algorithm can remove event noise effectively and efficiently and improve the performance of downstream tasks.
arXiv Detail & Related papers (2024-02-14T15:56:42Z) - Representation Learning on Event Stream via an Elastic Net-incorporated
Tensor Network [1.9515859963221267]
We present a novel representation method which can capture global correlations of all events in the event stream simultaneously.
Our method can achieve effective results in applications like filtering noise compared with the state-of-the-art methods.
arXiv Detail & Related papers (2024-01-16T02:51:47Z) - SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparsetemporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z) - Event Voxel Set Transformer for Spatiotemporal Representation Learning on Event Streams [19.957857885844838]
Event cameras are neuromorphic vision sensors that record a scene as sparse and asynchronous event streams.
We propose an attentionaware model named Event Voxel Set Transformer (EVSTr) for efficient representation learning on event streams.
Experiments show that EVSTr achieves state-of-the-art performance while maintaining low model complexity.
arXiv Detail & Related papers (2023-03-07T12:48:02Z) - Ev-TTA: Test-Time Adaptation for Event-Based Object Recognition [7.814941658661939]
Ev-TTA is a simple, effective test-time adaptation for event-based object recognition.
Our formulation can be successfully applied regardless of input representations and extended into regression tasks.
arXiv Detail & Related papers (2022-03-23T07:43:44Z) - A Prospective Study on Sequence-Driven Temporal Sampling and Ego-Motion
Compensation for Action Recognition in the EPIC-Kitchens Dataset [68.8204255655161]
Action recognition is one of the top-challenging research fields in computer vision.
ego-motion recorded sequences have become of important relevance.
The proposed method aims to cope with it by estimating this ego-motion or camera motion.
arXiv Detail & Related papers (2020-08-26T14:44:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.