Scalable Event-Based Video Streaming for Machines with MoQ
- URL: http://arxiv.org/abs/2508.15003v1
- Date: Wed, 20 Aug 2025 18:44:10 GMT
- Title: Scalable Event-Based Video Streaming for Machines with MoQ
- Authors: Andrew C. Freeman
- Abstract summary: A new class of neuromorphic "event" sensors records video with asynchronous pixel samples rather than image frames. We propose a new low-latency event streaming format based on the latest additions to the Media Over QUIC protocol draft.
- Score: 0.8158530638728501
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lossy compression and rate-adaptive streaming are a mainstay of traditional video streams. However, a new class of neuromorphic "event" sensors records video with asynchronous pixel samples rather than image frames. These sensors are designed for computer vision applications rather than human video consumption. Until now, researchers have focused their efforts primarily on application development, ignoring the crucial problem of data transmission. We survey the landscape of event-based video systems, discuss the technical issues with our recent scalable event streaming work, and propose a new low-latency event streaming format based on the latest additions to the Media Over QUIC protocol draft.
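The abstract's contrast between frame-based and event-based capture can be made concrete with a minimal sketch. The `(x, y, timestamp, polarity)` tuple below follows the common DVS-style event convention; the field names and the count-frame helper are illustrative assumptions, not the paper's proposed wire format.

```python
# Sketch of an asynchronous neuromorphic "event" sample (DVS-style
# convention, assumed for illustration): each pixel fires independently
# when its brightness changes, instead of being sampled in frames.
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    x: int          # pixel column
    y: int          # pixel row
    t_us: int       # per-pixel timestamp in microseconds (asynchronous)
    polarity: int   # +1 brightness increase, -1 decrease

def to_count_frame(events, width, height):
    """Accumulate signed event counts into a 2D grid for visualization."""
    frame = [[0] * width for _ in range(height)]
    for e in events:
        frame[e.y][e.x] += e.polarity
    return frame

events = [Event(0, 0, 10, +1), Event(1, 0, 12, -1), Event(0, 0, 15, +1)]
print(to_count_frame(events, 2, 1))  # [[2, -1]]
```

Note that, unlike a frame, the event list has no fixed rate: a static scene produces no data at all, which is what makes rate-adaptive transport for these streams a distinct problem.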
Related papers
- UniE2F: A Unified Diffusion Framework for Event-to-Frame Reconstruction with Video Foundation Models [67.24086328473437]
Event cameras excel at recording relative intensity changes rather than absolute intensity. The resulting data streams suffer from a significant loss of spatial information and static texture details. We address this limitation by leveraging a pre-trained video diffusion model to reconstruct high-fidelity video frames from sparse event data.
arXiv Detail & Related papers (2026-02-22T14:06:49Z) - A Preprocessing Framework for Video Machine Vision under Compression [26.253209831074184]
We propose a video preprocessing framework tailored for machine vision tasks to address this challenge. The proposed method incorporates a neural preprocessor that retains crucial information for subsequent tasks, boosting rate-accuracy performance.
arXiv Detail & Related papers (2025-12-17T11:26:19Z) - adder-viz: Real-Time Visualization Software for Transcoding Event Video [0.21485350418225238]
Event video eschews video frames in favor of asynchronous, per-pixel intensity samples. We previously proposed the unified ADDER representation to address these concerns. This paper introduces numerous improvements to the adder-viz software for visualizing real-time event transcode processes and applications in-the-loop.
arXiv Detail & Related papers (2025-08-20T18:33:07Z) - Embedding Compression Distortion in Video Coding for Machines [67.97469042910855]
Currently, video transmission serves not only the Human Visual System (HVS) for viewing but also machine perception for analysis. We propose a Compression Distortion Representation Embedding (CDRE) framework, which extracts a machine-perception-related distortion representation and embeds it into downstream models. Our framework can effectively boost the rate-task performance of existing codecs with minimal overhead in execution time and parameter count.
arXiv Detail & Related papers (2025-03-27T13:01:53Z) - EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation [58.41979933166173]
EvAnimate is the first method leveraging event streams as robust and precise motion cues for conditional human image animation. High-quality and temporally coherent animations are achieved through a dual-branch architecture. Experimental results show EvAnimate achieves high temporal fidelity and robust performance in scenarios where traditional video-derived cues fall short.
arXiv Detail & Related papers (2025-03-24T11:05:41Z) - StreamChat: Chatting with Streaming Video [85.02875830683637]
StreamChat is a novel approach that enhances the interaction capabilities of Large Multimodal Models with streaming video content. We introduce a flexible and efficient cross-attention-based architecture to process dynamic streaming inputs. We construct a new dense instruction dataset to facilitate the training of streaming interaction models.
arXiv Detail & Related papers (2024-12-11T18:59:54Z) - Low-Latency Scalable Streaming for Event-Based Vision [0.5242869847419834]
We propose a scalable streaming method for event-based data based on Media Over QUIC. We show that a state-of-the-art object detection application is resilient to dramatic data loss. We observe an average reduction in detection mAP as low as 0.36.
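The claim above is that a receiver can shed most of the event data while keeping detection largely intact, which implies the stream is organized into droppable priority layers. The sketch below illustrates one possible layering rule, by spatial subsampling; the rule and layer count are assumptions for illustration, not the paper's actual MoQ track layout.

```python
# Sketch of scalable event streaming: events are assigned to priority
# layers so a congested receiver can unsubscribe from high layers.
# The spatial-subsampling rule here is an illustrative assumption.
def layer_of(x: int, y: int, num_layers: int = 4) -> int:
    """Lower layer = coarser spatial grid = higher delivery priority."""
    for layer in range(num_layers - 1):
        stride = 2 ** (num_layers - 1 - layer)  # 8, 4, 2 for 4 layers
        if x % stride == 0 and y % stride == 0:
            return layer
    return num_layers - 1  # densest, most droppable layer

# A receiver keeping only layers 0-1 retains a sparse grid of pixels.
events = [(x, y) for x in range(8) for y in range(8)]
kept = [e for e in events if layer_of(*e) <= 1]
print(len(kept), "of", len(events))  # 4 of 64
```

Under this scheme, dropping a layer degrades spatial density gracefully rather than corrupting the stream, which is consistent with the graceful mAP degradation reported above.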
arXiv Detail & Related papers (2024-12-10T19:48:57Z) - Rethinking Video with a Universal Event-Based Representation [0.0]
I introduce Address, Decimation, ΔER (ADΔER), a novel intermediate video representation and system framework.
I demonstrate that ADΔER achieves state-of-the-art application speed and compression performance for scenes with high temporal redundancy.
I discuss the implications for event-based video on large-scale video surveillance and resource-constrained sensing.
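The intermediate representation described above can be sketched as an intensity-integration event: a pixel reports that it accumulated a power-of-two quantity of intensity over some time window. The field names and the recovery formula below follow my reading of the ADΔER work and should be treated as assumptions, not the normative format.

```python
# Hedged sketch of an ADDER-style intensity sample: an event asserts that
# the pixel at `address` integrated 2^d intensity units over `dt` ticks,
# so average intensity over the window is recovered as 2^d / dt.
# Assumed field names; not the actual ADDER encoding.
from dataclasses import dataclass

@dataclass(frozen=True)
class AdderEvent:
    address: int  # flattened pixel index
    d: int        # decimation: log2 of integrated intensity units
    dt: int       # ticks elapsed while integrating 2^d units

def intensity(ev: AdderEvent) -> float:
    """Recover average intensity over the event's integration window."""
    return (1 << ev.d) / ev.dt

ev = AdderEvent(address=42, d=7, dt=16)
print(intensity(ev))  # 128 / 16 = 8.0
```

A static, bright pixel can then be summarized by a single long-window event, which is where the compression gains for temporally redundant scenes would come from.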
arXiv Detail & Related papers (2024-08-12T16:00:17Z) - E2HQV: High-Quality Video Generation from Event Camera via Theory-Inspired Model-Aided Deep Learning [53.63364311738552]
Bio-inspired event cameras or dynamic vision sensors are capable of capturing per-pixel brightness changes (called event-streams) in high temporal resolution and high dynamic range.
It calls for events-to-video (E2V) solutions which take event-streams as input and generate high quality video frames for intuitive visualization.
We propose E2HQV, a novel E2V paradigm designed to produce high-quality video frames from events.
arXiv Detail & Related papers (2024-01-16T05:10:50Z) - Accelerated Event-Based Feature Detection and Compression for Surveillance Video Systems [1.5390526524075634]
We propose a novel system which conveys temporal redundancy within a sparse decompressed representation.
We leverage a video representation framework called ADDER to transcode framed videos to sparse, asynchronous intensity samples.
Our work paves the way for upcoming neuromorphic sensors and is amenable to future applications with spiking neural networks.
arXiv Detail & Related papers (2023-12-13T15:30:29Z) - Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement [74.1052624663082]
We develop a deep learning architecture capable of restoring detail to compressed videos.
We condition our model on quantization data, which is readily available in the bitstream.
We show that this improves restoration accuracy compared to prior compression-correction methods.
arXiv Detail & Related papers (2022-01-31T18:56:04Z) - EventHands: Real-Time Neural 3D Hand Reconstruction from an Event Stream [80.15360180192175]
3D hand pose estimation from monocular videos is a long-standing and challenging problem.
We address it for the first time using a single event camera, i.e., an asynchronous vision sensor reacting to brightness changes.
Our approach has characteristics previously not demonstrated with a single RGB or depth camera.
arXiv Detail & Related papers (2020-12-11T16:45:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.