adder-viz: Real-Time Visualization Software for Transcoding Event Video
- URL: http://arxiv.org/abs/2508.14996v2
- Date: Mon, 25 Aug 2025 19:58:20 GMT
- Title: adder-viz: Real-Time Visualization Software for Transcoding Event Video
- Authors: Andrew C. Freeman, Luke Reinkensmeyer
- Abstract summary: Event video eschews video frames in favor of asynchronous, per-pixel intensity samples.
We previously proposed the unified ADDER representation to address these concerns.
This paper introduces numerous improvements to the adder-viz software for visualizing real-time event transcode processes and applications in-the-loop.
- Score: 0.21485350418225238
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent years have brought about a surge in neuromorphic ``event'' video research, primarily targeting computer vision applications. Event video eschews video frames in favor of asynchronous, per-pixel intensity samples. While much work has focused on a handful of representations for specific event cameras, these representations have shown limitations in flexibility, speed, and compressibility. We previously proposed the unified ADDER representation to address these concerns. This paper introduces numerous improvements to the adder-viz software for visualizing real-time event transcode processes and applications in-the-loop. The MIT-licensed software is available from a centralized repository at https://github.com/ac-freeman/adder-codec-rs.
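For orientation, the sketch below shows the general shape of an ADDER-style event: each pixel independently integrates intensity and, once roughly 2^D units have accumulated, emits a sample carrying its address, the decimation level D, and the elapsed time Δt, so average intensity is approximately 2^D / Δt. The struct layout and helper are illustrative assumptions for this note, not the actual adder-codec-rs types.

```rust
/// Illustrative sketch of an ADDER-style event (hypothetical fields,
/// not the real adder-codec-rs types). A pixel fires once it has
/// integrated 2^d intensity units, recording how long that took.
#[derive(Debug, Clone, Copy)]
struct AdderEvent {
    x: u16,       // pixel column (address)
    y: u16,       // pixel row (address)
    d: u8,        // decimation level: the pixel fired after 2^d intensity units
    delta_t: u32, // ticks elapsed while accumulating those 2^d units
}

impl AdderEvent {
    /// Average intensity over the event's time span: 2^d units per delta_t ticks.
    fn intensity(&self) -> f64 {
        (1u64 << self.d) as f64 / self.delta_t as f64
    }
}

fn main() {
    // A bright pixel accumulates 2^7 = 128 units quickly...
    let bright = AdderEvent { x: 10, y: 4, d: 7, delta_t: 16 };
    // ...while a dim pixel needs far longer for the same decimation level.
    let dim = AdderEvent { x: 11, y: 4, d: 7, delta_t: 1024 };
    println!("bright ~ {:.3} units/tick", bright.intensity()); // 8.000
    println!("dim    ~ {:.3} units/tick", dim.intensity());    // 0.125
}
```

This framing also hints at where the flexibility and compressibility claims come from: a pixel whose intensity stays stable can keep integrating at higher decimation levels and emit very few events.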
Related papers
- CoPE-VideoLM: Codec Primitives For Efficient Video Language Models [56.76440182038839]
Video Language Models (VideoLMs) empower AI systems to understand temporal dynamics in videos.
Current methods use sampling, which can miss both macro-level events and micro-level details due to sparse temporal coverage.
We propose to leverage video primitives, which encode video redundancy and sparsity without requiring expensive full-image encoding for most frames.
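A minimal sketch of that idea, with hypothetical types: most frames are represented by the primitives the codec already computed (motion vectors plus a sparse residual), and only occasional I-frames pay for full-image encoding.

```rust
/// Hypothetical sketch of the codec-primitive idea (not CoPE-VideoLM's API):
/// only I-frames carry full images; P-frames reuse the motion vectors and
/// residuals the codec already computed, which are far cheaper to tokenize.
enum FrameRepr {
    /// Full image, encoded by an expensive vision backbone.
    IFrame { pixels: Vec<u8> },
    /// Codec primitives: per-block motion plus a sparse residual.
    PFrame { motion_vectors: Vec<(i8, i8)>, residual: Vec<u8> },
}

/// Toy token-cost model; the divisors are arbitrary assumptions chosen
/// only to show the imbalance between the two paths.
fn token_cost(frame: &FrameRepr) -> usize {
    match frame {
        FrameRepr::IFrame { pixels } => pixels.len() / 64,
        FrameRepr::PFrame { motion_vectors, residual } => {
            motion_vectors.len() / 16 + residual.len() / 256
        }
    }
}

fn main() {
    // One GOP: a full I-frame followed by a cheap, primitive-coded P-frame.
    let gop = vec![
        FrameRepr::IFrame { pixels: vec![0; 224 * 224 * 3] },
        FrameRepr::PFrame { motion_vectors: vec![(1, 0); 196], residual: vec![0; 2048] },
    ];
    let total: usize = gop.iter().map(token_cost).sum();
    println!("approximate token budget for this GOP: {total}");
}
```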
arXiv Detail & Related papers (2026-02-13T18:57:31Z)
- Scalable Event-Based Video Streaming for Machines with MoQ [0.8158530638728501]
A new class of neuromorphic ``event'' sensors records video with asynchronous pixel samples rather than image frames.
We propose a new low-latency event streaming format based on the latest additions to the Media Over QUIC protocol draft.
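As a rough illustration of how such a stream could be organized, the sketch below batches asynchronous events into fixed time windows, mirroring how a Media Over QUIC track is divided into groups that a subscriber can join at a boundary. The windowing policy is an assumption for this note, not the format the paper defines.

```rust
// Illustrative sketch: assigning asynchronous events to MoQ-style groups
// by time window, so a late subscriber can join at any group boundary.
// The window size and event layout are assumptions, not the paper's format.
#[derive(Debug, Clone, Copy)]
struct Event {
    x: u16, // pixel column
    y: u16, // pixel row
    t: u64, // event timestamp, in ticks
}

/// Map an event timestamp to a group number (one group per time window).
fn group_id(event_t: u64, ticks_per_group: u64) -> u64 {
    event_t / ticks_per_group
}

fn main() {
    let ticks_per_group = 1_000; // hypothetical window: 1000 ticks per group
    let events = [
        Event { x: 3, y: 9, t: 120 },
        Event { x: 3, y: 9, t: 980 },
        Event { x: 7, y: 2, t: 1_450 },
    ];
    for e in events {
        println!("event at t={} -> group {}", e.t, group_id(e.t, ticks_per_group));
    }
}
```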
arXiv Detail & Related papers (2025-08-20T18:44:10Z)
- STORM: Token-Efficient Long Video Understanding for Multimodal LLMs [101.70681093383365]
STORM is a novel architecture incorporating a dedicated temporal encoder between the image encoder and the Video-LLM.
We show that STORM achieves state-of-the-art results across various long video understanding benchmarks.
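The token-reduction idea can be pictured as temporal pooling over per-frame tokens before they reach the language model; the fixed averaging below is only a stand-in for STORM's learned temporal encoder.

```rust
// Minimal stand-in for temporal token reduction: average each token
// position's embedding across a window of consecutive frames, shrinking
// the token count the downstream LLM attends over. STORM learns this
// step; the fixed mean pooling here is purely illustrative.
fn temporal_pool(
    frames: &[Vec<Vec<f32>>], // [frame][token][dim]
    window: usize,
) -> Vec<Vec<Vec<f32>>> {
    frames
        .chunks(window)
        .map(|chunk| {
            let tokens = chunk[0].len();
            let dim = chunk[0][0].len();
            let mut pooled = vec![vec![0.0f32; dim]; tokens];
            for frame in chunk {
                for (t, token) in frame.iter().enumerate() {
                    for (d, v) in token.iter().enumerate() {
                        pooled[t][d] += *v / chunk.len() as f32;
                    }
                }
            }
            pooled
        })
        .collect()
}

fn main() {
    // 8 frames x 4 tokens x 2 dims, pooled every 4 frames -> 2 token sets.
    let frames = vec![vec![vec![1.0f32; 2]; 4]; 8];
    let pooled = temporal_pool(&frames, 4);
    println!("{} frames of tokens reduced to {}", frames.len(), pooled.len());
}
```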
arXiv Detail & Related papers (2025-03-06T06:17:38Z)
- Mind the Time: Temporally-Controlled Multi-Event Video Generation [65.05423863685866]
We present MinT, a multi-event video generator with temporal control.
Our key insight is to bind each event to a specific period in the generated video, which allows the model to focus on one event at a time.
For the first time in the literature, our model offers control over the timing of events in generated videos.
arXiv Detail & Related papers (2024-12-06T18:52:20Z)
- Rethinking Video with a Universal Event-Based Representation [0.0]
I introduce Address, Decimation, ΔER (ADΔER), a novel intermediate video representation and system framework.
I demonstrate that ADΔER achieves state-of-the-art application speed and compression performance for scenes with high temporal redundancy.
I discuss the implications for event-based video on large-scale video surveillance and resource-constrained sensing.
arXiv Detail & Related papers (2024-08-12T16:00:17Z)
- Event-aware Video Corpus Moment Retrieval [79.48249428428802]
Video Corpus Moment Retrieval (VCMR) is a practical video retrieval task focused on identifying a specific moment within a vast corpus of untrimmed videos.
Existing methods for VCMR typically rely on frame-aware video retrieval, calculating similarities between the query and video frames to rank videos.
We propose EventFormer, a model that explicitly utilizes events within videos as fundamental units for video retrieval.
arXiv Detail & Related papers (2024-02-21T06:55:20Z)
- Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization [52.63845811751936]
Video pre-training is challenging due to the modeling of its spatiotemporal dynamics.
In this paper, we address such limitations in video pre-training with an efficient video decomposition.
Our framework is both capable of comprehending and generating image and video content, as demonstrated by its performance across 13 multimodal benchmarks.
arXiv Detail & Related papers (2024-02-05T16:30:49Z)
- An Open Software Suite for Event-Based Video [0.8158530638728501]
Event-based video is a new paradigm that forgoes image frames altogether.
Until now, researchers have lacked a cohesive software framework for exploring the representation, compression, and applications of event-based video.
I present the ADΔER software suite to fill this gap.
arXiv Detail & Related papers (2024-01-30T16:32:37Z)
- Accelerated Event-Based Feature Detection and Compression for Surveillance Video Systems [1.5390526524075634]
We propose a novel system which conveys temporal redundancy within a sparse decompressed representation.
We leverage a video representation framework called ADDER to transcode framed videos to sparse, asynchronous intensity samples.
Our work paves the way for upcoming neuromorphic sensors and is amenable to future applications with spiking neural networks.
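The framed-to-ADDER transcode step can be sketched as per-pixel integration: each pixel accumulates intensity frame by frame and emits an event whenever its accumulator crosses a 2^D threshold. The fixed threshold below is a simplification for illustration; the actual transcoder adapts the decimation level per pixel.

```rust
// Simplified sketch of framed-to-event transcoding (not the adder-codec-rs
// implementation): each pixel integrates intensity across frames and emits
// an event whenever its accumulator crosses the fixed 2^d threshold.
struct PixelState {
    accum: u32,   // intensity accumulated since the last event
    start_t: u32, // tick at which the current accumulation began
}

fn transcode(frames: &[Vec<u32>], ticks_per_frame: u32, d: u8) {
    let threshold = 1u32 << d;
    let mut pixels: Vec<PixelState> = frames[0]
        .iter()
        .map(|_| PixelState { accum: 0, start_t: 0 })
        .collect();

    for (f, frame) in frames.iter().enumerate() {
        let now = (f as u32 + 1) * ticks_per_frame;
        for (i, &intensity) in frame.iter().enumerate() {
            let p = &mut pixels[i];
            p.accum += intensity;
            while p.accum >= threshold {
                // Emit an event: 2^d intensity units integrated over (now - start_t) ticks.
                println!("pixel {i}: D={d}, delta_t={}", now - p.start_t);
                p.accum -= threshold;
                p.start_t = now;
            }
        }
    }
}

fn main() {
    // Three frames of a 2-pixel "video": pixel 0 is bright, pixel 1 is dim.
    let frames = vec![vec![200, 10], vec![200, 10], vec![200, 10]];
    transcode(&frames, 255, 8); // hypothetical: 255 ticks per input frame
}
```

Note how the dim pixel simply emits nothing over this span, which is the sparsity the summary refers to.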
arXiv Detail & Related papers (2023-12-13T15:30:29Z)
- VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution [75.79379734567604]
We show that Video Implicit Neural Representation (VideoINR) can be decoded to videos of arbitrary spatial resolution and frame rate.
We show that VideoINR achieves competitive performances with state-of-the-art STVSR methods on common up-sampling scales.
arXiv Detail & Related papers (2022-06-09T17:45:49Z)
- End-to-End Compressed Video Representation Learning for Generic Event Boundary Detection [31.31508043234419]
We propose a new end-to-end compressed video representation learning method for event boundary detection.
We first use ConvNets to extract features of the I-frames in the GOPs.
After that, a light-weight spatial-channel compressed encoder is designed to compute the feature representations of the P-frames.
A temporal contrastive module is proposed to determine the event boundaries of video sequences.
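To make the role of that module concrete, here is a crude stand-in: score adjacent frame features by cosine similarity and flag a boundary wherever similarity dips. The thresholded rule below replaces the paper's learned contrastive module purely for illustration.

```rust
// Crude stand-in for contrastive boundary detection (not the paper's
// learned module): flag an event boundary wherever adjacent frame
// features become dissimilar under cosine similarity.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

/// Return the frame indices at which a boundary is declared.
fn boundaries(features: &[Vec<f32>], threshold: f32) -> Vec<usize> {
    features
        .windows(2)
        .enumerate()
        .filter(|(_, pair)| cosine(&pair[0], &pair[1]) < threshold)
        .map(|(i, _)| i + 1) // the boundary sits just before frame i + 1
        .collect()
}

fn main() {
    // Four frame features with a sharp change between frames 1 and 2.
    let feats = vec![
        vec![1.0, 0.0],
        vec![0.9, 0.1],
        vec![0.0, 1.0],
        vec![0.1, 0.9],
    ];
    println!("event boundaries at frames: {:?}", boundaries(&feats, 0.5));
}
```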
arXiv Detail & Related papers (2022-03-29T08:27:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.