Rethinking Video with a Universal Event-Based Representation
- URL: http://arxiv.org/abs/2408.06248v1
- Date: Mon, 12 Aug 2024 16:00:17 GMT
- Title: Rethinking Video with a Universal Event-Based Representation
- Authors: Andrew Freeman
- Abstract summary: I introduce the Address, Decimation, Δt Event Representation (ADΔER), a novel intermediate video representation and system framework.
I demonstrate that ADΔER achieves state-of-the-art application speed and compression performance for scenes with high temporal redundancy.
I discuss the implications for event-based video on large-scale video surveillance and resource-constrained sensing.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditionally, video is structured as a sequence of discrete image frames. Recently, however, a novel video sensing paradigm has emerged which eschews video frames entirely. These "event" sensors aim to mimic the human vision system with asynchronous sensing, where each pixel has an independent, sparse data stream. While these cameras enable high-speed and high-dynamic-range sensing, researchers often revert to a framed representation of the event data for existing applications, or build bespoke applications for a particular camera's event data type. At the same time, classical video systems have significant computational redundancy at the application layer, since pixel samples are repeated across frames in the uncompressed domain. To address the shortcomings of existing systems, I introduce the Address, Decimation, Δt Event Representation (ADΔER, pronounced "adder"), a novel intermediate video representation and system framework. The framework transcodes a variety of framed and event camera sources into a single event-based representation, which supports source-modeled lossy compression and backward compatibility with traditional frame-based applications. I demonstrate that ADΔER achieves state-of-the-art application speed and compression performance for scenes with high temporal redundancy. Crucially, I describe how ADΔER unlocks an entirely new control mechanism for computer vision: application speed can correlate with both the scene content and the level of lossy compression. Finally, I discuss the implications for event-based video on large-scale video surveillance and resource-constrained sensing.
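The name spells out what each sample carries: a pixel address, a decimation component, and a Δt. Below is a minimal Python sketch of such an event record, assuming (as the name suggests, though the abstract does not spell it out) that a decimation value d stands for 2^d accumulated intensity units; the field names are illustrative, not the codec's actual layout.

```python
from dataclasses import dataclass

@dataclass
class AdderEvent:
    """One ADΔER-style intensity sample (illustrative field names)."""
    x: int        # pixel address (column)
    y: int        # pixel address (row)
    d: int        # decimation: assume 2**d intensity units were accumulated
    delta_t: int  # clock ticks elapsed while accumulating

    def intensity(self) -> float:
        """Average intensity (units per tick) over this event's time span."""
        return (1 << self.d) / self.delta_t

# A static pixel can be described by one long-Δt event instead of one sample
# per frame, which is where the temporal-redundancy savings come from.
ev = AdderEvent(x=10, y=4, d=12, delta_t=2550)
print(f"average intensity ≈ {ev.intensity():.3f} units/tick")
```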
Related papers
- An Open Software Suite for Event-Based Video [0.8158530638728501]
Event-based video is a new paradigm that forgoes image frames altogether.
Until now, researchers have lacked a cohesive software framework for exploring the representation, compression, and applications of event-based video.
I present the ADΔER software suite to fill this gap.
arXiv Detail & Related papers (2024-01-30T16:32:37Z)
- EventAid: Benchmarking Event-aided Image/Video Enhancement Algorithms with Real-captured Hybrid Dataset [55.12137324648253]
Event cameras are an emerging imaging technology that offers advantages over conventional frame-based imaging sensors in dynamic range and sensing speed.
This paper focuses on five event-aided image and video enhancement tasks.
arXiv Detail & Related papers (2023-12-13T15:42:04Z)
- Accelerated Event-Based Feature Detection and Compression for Surveillance Video Systems [1.5390526524075634]
We propose a novel system which conveys temporal redundancy within a sparse decompressed representation.
We leverage a video representation framework called ADDER to transcode framed videos to sparse, asynchronous intensity samples.
Our work paves the way for upcoming neuromorphic sensors and is amenable to future applications with spiking neural networks.
arXiv Detail & Related papers (2023-12-13T15:30:29Z)
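The transcoding step described in this entry can be pictured with a toy threshold transcoder: a sample is emitted only when a pixel's value drifts, so temporally redundant (static) regions collapse into a few long-duration samples. This is a generic sketch, not the ADDER codec's actual algorithm or parameters.

```python
import numpy as np

def transcode_frames(frames, ticks_per_frame=255, threshold=2):
    """Toy transcoder: framed video -> sparse asynchronous intensity samples.

    A (x, y, intensity, delta_t) sample is emitted only when a pixel drifts
    more than `threshold` from its last emitted value. Runs still open at the
    end are dropped for brevity.
    """
    last = frames[0].astype(np.int16)            # value during the current run
    start = np.zeros(frames[0].shape, np.int64)  # tick when each run began
    samples = []
    for i, frame in enumerate(frames[1:], start=1):
        t = i * ticks_per_frame
        changed = np.abs(frame.astype(np.int16) - last) > threshold
        for y, x in zip(*np.nonzero(changed)):   # nonzero yields (row, col)
            samples.append((x, y, int(last[y, x]), int(t - start[y, x])))
            last[y, x] = frame[y, x]
            start[y, x] = t
    return samples

video = [np.full((4, 4), 128, dtype=np.uint8) for _ in range(30)]
video[15][2, 2] = 200   # one pixel changes once in an otherwise static scene
print(len(transcode_frames(video)), "samples vs", 30 * 16, "framed pixel reads")
```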
- EventTransAct: A video transformer-based framework for Event-camera based action recognition [52.537021302246664]
Event cameras offer new opportunities for action recognition compared to standard RGB videos.
In this study, we employ a computationally efficient model, namely the video transformer network (VTN), which initially acquires spatial embeddings per event-frame.
To better adapt the VTN to the sparse and fine-grained nature of event data, we design an Event-Contrastive Loss ($\mathcal{L}_{EC}$) and event-specific augmentations.
arXiv Detail & Related papers (2023-08-25T23:51:07Z)
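The Event-Contrastive Loss is only named, not defined, in this summary. As a rough stand-in for the idea of pulling embeddings of the same clip together while pushing others apart, here is a standard InfoNCE-style loss; this is an assumption for illustration, not the paper's exact $\mathcal{L}_{EC}$.

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.07):
    """InfoNCE-style contrastive loss on L2-normalized embeddings."""
    norm = lambda v: v / np.linalg.norm(v)
    a = norm(anchor)
    sims = [a @ norm(positive)] + [a @ norm(n) for n in negatives]
    logits = np.array(sims) / tau
    logits -= logits.max()                 # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

rng = np.random.default_rng(1)
emb = lambda: rng.normal(size=128)         # stand-ins for clip embeddings
print("loss:", float(info_nce(emb(), emb(), [emb() for _ in range(8)])))
```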
- VNVC: A Versatile Neural Video Coding Framework for Efficient Human-Machine Vision [59.632286735304156]
It is more efficient to enhance/analyze the coded representations directly without decoding them into pixels.
We propose a versatile neural video coding (VNVC) framework, which targets learning compact representations to support both reconstruction and direct enhancement/analysis.
arXiv Detail & Related papers (2023-06-19T03:04:57Z)
- An Asynchronous Intensity Representation for Framed and Event Video Sources [2.9097303137825046]
We introduce an intensity representation for both framed and non-framed data sources.
We show that our representation can increase intensity precision and greatly reduce the number of samples per pixel.
We argue that our method provides the computational efficiency and temporal granularity necessary to build real-time intensity-based applications for event cameras.
arXiv Detail & Related papers (2023-01-20T19:46:23Z)
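One way to see the precision claim in this entry: integrating over a long Δt divides an accumulated count by a long time span, instead of quantizing every frame independently. A toy numeric illustration follows; the numbers are invented for illustration, not the paper's results.

```python
# A pixel with a true rate of 100.4 intensity units per frame, observed for
# 10 frames. Per-frame sampling quantizes each frame independently.
true_rate = 100.4
frames = 10

per_frame = [round(true_rate) for _ in range(frames)]
framed_estimate = sum(per_frame) / frames   # 100.0; rounding lost the 0.4

# One asynchronous sample spanning all 10 frames accumulates first, divides once.
accumulated = round(true_rate * frames)
async_estimate = accumulated / frames       # 100.4 recovered

print(framed_estimate, async_estimate)
print("samples per pixel:", frames, "vs", 1)
```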
- Event-Based Frame Interpolation with Ad-hoc Deblurring [68.97825675372354]
We propose a general method for event-based frame interpolation that performs ad-hoc deblurring on input videos.
Our network consistently outperforms state-of-the-art methods on frame interpolation, single-image deblurring, and the joint task of interpolation and deblurring.
Our code and dataset will be made publicly available.
arXiv Detail & Related papers (2023-01-12T18:19:00Z)
- VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution [75.79379734567604]
We show that Video Implicit Neural Representation (VideoINR) can be decoded to videos of arbitrary spatial resolution and frame rate.
We show that VideoINR achieves competitive performance with state-of-the-art space-time video super-resolution (STVSR) methods on common up-sampling scales.
arXiv Detail & Related papers (2022-06-09T17:45:49Z)
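The core idea of an implicit neural representation is that a network maps a continuous space-time coordinate to a color, so any spatial resolution or frame rate can be sampled at decode time. A minimal sketch with random weights follows; VideoINR's actual architecture and training are not reproduced here.

```python
import numpy as np

# A tiny MLP standing in for a learned implicit video: (x, y, t) -> RGB.
rng = np.random.default_rng(2)
W1, b1 = rng.normal(size=(3, 64)), np.zeros(64)
W2, b2 = rng.normal(size=(64, 3)), np.zeros(3)

def decode(x, y, t):
    """Query the 'video' at continuous coordinates in [0, 1]^3."""
    h = np.tanh(np.array([x, y, t]) @ W1 + b1)
    return 1 / (1 + np.exp(-(h @ W2 + b2)))  # sigmoid into [0, 1] RGB

# Arbitrary frame rate: sample t wherever we like between two "frames".
for t in (0.0, 0.25, 0.5):
    print(t, decode(0.3, 0.7, t).round(3))
```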
- How Asynchronous Events Encode Video [18.666472443354092]
Event-based cameras have sensors that emit events when their inputs change, thus encoding information in the timing of events.
This creates new challenges in establishing reconstruction guarantees and algorithms, but also provides advantages over frame-based video.
We consider the case of time encoding bandlimited video and demonstrate a dependence between spatial sensor density and overall spatial and temporal resolution.
arXiv Detail & Related papers (2022-06-09T08:36:21Z)
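A common concrete model of such time encoding is an integrate-and-fire sampler, where all the information lives in the spike timing. The toy 1-D sketch below conveys the mechanism; the paper's analysis of bandlimited video and sensor density is far more general.

```python
import numpy as np

def time_encode(signal, dt=1e-3, threshold=0.05):
    """Integrate-and-fire time encoding of a sampled 1-D signal (toy).

    The running integral of the input is tracked, and a spike time is
    emitted each time it crosses the next multiple of `threshold`.
    """
    times, acc = [], 0.0
    for i, v in enumerate(signal):
        acc += v * dt
        if acc >= threshold:
            times.append(i * dt)
            acc -= threshold
    return times

t = np.arange(0, 1, 1e-3)
x = 0.5 + 0.4 * np.sin(2 * np.pi * 3 * t)   # a bandlimited test input
spikes = time_encode(x)
print(len(spikes), "spikes; first few:", [round(s, 3) for s in spikes[:5]])
```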
- End-to-End Compressed Video Representation Learning for Generic Event Boundary Detection [31.31508043234419]
We propose a new end-to-end compressed video representation learning method for generic event boundary detection.
We first use ConvNets to extract features of the I-frames in the GOPs.
After that, a lightweight spatial-channel compressed encoder computes the feature representations of the P-frames.
A temporal contrastive module is proposed to determine the event boundaries of video sequences.
arXiv Detail & Related papers (2022-03-29T08:27:48Z)
- Video Imprint [107.1365846180187]
A new unified video analytics framework (ER3) is proposed for complex event retrieval, recognition and recounting.
The proposed video imprint representation exploits temporal correlations among image features across video frames.
The video imprint is fed into a reasoning network and a feature aggregation module, for event recognition/recounting and event retrieval tasks, respectively.
arXiv Detail & Related papers (2021-06-07T00:32:47Z)