Related papers: MotionDeltaCNN: Sparse CNN Inference of Frame Differences in Moving Camera Videos

URL: http://arxiv.org/abs/2210.09887v5
Date: Mon, 14 Aug 2023 20:24:24 GMT
Title: MotionDeltaCNN: Sparse CNN Inference of Frame Differences in Moving Camera Videos
Authors: Mathias Parger, Chengcheng Tang, Thomas Neff, Christopher D. Twigg, Cem Keskin, Robert Wang, Markus Steinberger
Abstract summary: MotionDeltaCNN is a sparse CNN inference framework that supports moving cameras. We introduce spherical buffers and padded convolutions to enable seamless fusion of newly unveiled regions and previously processed regions. Our evaluation shows that we outperform DeltaCNN by up to 90% for moving camera videos.
Score: 16.865802182250857
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Convolutional neural network inference on video input is computationally expensive and requires high memory bandwidth. Recently, DeltaCNN managed to reduce the cost by only processing pixels with significant updates over the previous frame. However, DeltaCNN relies on static camera input. Moving cameras add new challenges in how to fuse newly unveiled image regions with already processed regions efficiently to minimize the update rate - without increasing memory overhead and without knowing the camera extrinsics of future frames. In this work, we propose MotionDeltaCNN, a sparse CNN inference framework that supports moving cameras. We introduce spherical buffers and padded convolutions to enable seamless fusion of newly unveiled regions and previously processed regions -- without increasing memory footprint. Our evaluation shows that we outperform DeltaCNN by up to 90% for moving camera videos.
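The two ideas named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The first function keeps only pixels whose change over the previous frame exceeds a threshold, which is the frame-difference sparsity attributed to DeltaCNN. The second class is one plausible reading of a "spherical buffer": a fixed-size buffer addressed modulo its size, so that regions newly unveiled by camera motion overwrite regions that left the view, without growing memory. The names `sparse_delta`, `WrapAroundBuffer`, and `threshold` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Frame = List[List[float]]


def sparse_delta(prev: Frame, curr: Frame, threshold: float) -> Tuple[Frame, List[List[bool]], float]:
    """Return (delta, mask, update_rate) for two equally sized frames.

    delta holds curr - prev where the absolute change exceeds `threshold`
    and 0.0 elsewhere; mask marks the updated pixels; update_rate is the
    fraction of pixels that were updated.
    """
    delta: Frame = []
    mask: List[List[bool]] = []
    updated = 0
    total = 0
    for prev_row, curr_row in zip(prev, curr):
        delta_row = []
        mask_row = []
        for p, c in zip(prev_row, curr_row):
            diff = c - p
            significant = abs(diff) > threshold
            delta_row.append(diff if significant else 0.0)
            mask_row.append(significant)
            updated += int(significant)
            total += 1
        delta.append(delta_row)
        mask.append(mask_row)
    return delta, mask, updated / total


class WrapAroundBuffer:
    """Fixed-size 2D buffer addressed with wrapped (modulo) coordinates.

    Assumption for illustration: world coordinates map to storage by modulo,
    so writing a newly visible pixel reuses the slot of a pixel that has
    moved out of view. Memory stays constant as the camera moves.
    """

    def __init__(self, height: int, width: int, fill: float = 0.0) -> None:
        self.height = height
        self.width = width
        self.data = [[fill] * width for _ in range(height)]

    def write(self, y: int, x: int, value: float) -> None:
        self.data[y % self.height][x % self.width] = value

    def read(self, y: int, x: int) -> float:
        return self.data[y % self.height][x % self.width]


if __name__ == "__main__":
    previous = [[0.0, 0.0], [1.0, 1.0]]
    current = [[0.0, 0.5], [1.01, 1.0]]
    delta, mask, rate = sparse_delta(previous, current, threshold=0.1)
    print(delta, mask, rate)

    # The view initially covers world columns 0..3 of a single row.
    buffer = WrapAroundBuffer(height=1, width=4)
    for x, value in enumerate([10.0, 11.0, 12.0, 13.0]):
        buffer.write(0, x, value)
    # The camera pans right by one pixel: world column 4 enters the view
    # and reuses the storage slot of world column 0, which left the view.
    buffer.write(0, 4, 14.0)
    print(buffer.data)
```

In this sketch only the small change below the threshold is suppressed, and panning touches a single storage slot while the overlapping columns remain valid in place.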

Related papers

OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation [70.17681136234202]
We reexamine the design distinctions and test the limits of what a sparse CNN can achieve. We propose two key components, i.e., adaptive receptive fields (spatially) and adaptive relation, to bridge the gap. This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module.
arXiv Detail & Related papers (2024-03-21T14:06:38Z)
PadChannel: Improving CNN Performance through Explicit Padding Encoding [40.39759037668144]
In convolutional neural networks (CNNs), padding plays a pivotal role in preserving spatial dimensions throughout the layers. Traditional padding techniques do not explicitly distinguish between the actual image content and the padded regions. We propose PadChannel, a novel padding method that encodes padding statuses as an additional input channel.
arXiv Detail & Related papers (2023-11-13T07:44:56Z)
EvConv: Fast CNN Inference on Event Camera Inputs For High-Speed Robot Perception [1.3869227429939426]
Event cameras capture visual information with a high temporal resolution and a wide dynamic range. Convolutional neural network inference on event camera streams cannot currently run in real time at the high speeds at which event cameras operate. This paper presents EvConv, a new approach to enable fast inference on CNNs for inputs from event cameras.
arXiv Detail & Related papers (2023-03-08T15:47:13Z)
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions [95.94629864981091]
This work presents a new large-scale CNN-based foundation model, termed InternImage, which can obtain the gain from increasing parameters and training data like ViTs. The proposed InternImage reduces the strict inductive bias of traditional CNNs and makes it possible to learn stronger and more robust patterns with large-scale parameters from massive data like ViTs.
arXiv Detail & Related papers (2022-11-10T18:59:04Z)
DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos [16.644938608211202]
Convolutional neural network inference on video data requires powerful hardware for real-time processing. We present a sparse convolutional neural network framework that enables sparse frame-by-frame updates. We are the first to significantly outperform the dense reference, cuDNN, in practical settings, achieving speedups of up to 7x with only marginal differences in accuracy.
arXiv Detail & Related papers (2022-03-08T10:54:00Z)
Continual 3D Convolutional Neural Networks for Real-time Processing of Videos [93.73198973454944]
We introduce Continual 3D Convolutional Neural Networks (Co3D CNNs). Co3D CNNs process videos frame by frame rather than clip by clip. We show that Co3D CNNs initialised on the weights from preexisting state-of-the-art video recognition models reduce floating point operations for frame-wise computations by 10.0-12.4x while improving accuracy on Kinetics-400 by 2.3-3.8%.
arXiv Detail & Related papers (2021-05-31T18:30:52Z)
MoViNets: Mobile Video Networks for Efficient Video Recognition [52.49314494202433]
3D convolutional neural networks (CNNs) are accurate at video recognition but require large computation and memory budgets. We propose a three-step approach to improve computational efficiency while substantially reducing the peak memory usage of 3D CNNs.
arXiv Detail & Related papers (2021-03-21T23:06:38Z)
Reducing the Sim-to-Real Gap for Event Cameras [64.89183456212069]
Event cameras are paradigm-shifting novel sensors that report asynchronous, per-pixel brightness changes called 'events' with unparalleled low latency. Recent work has demonstrated impressive results using Convolutional Neural Networks (CNNs) for video reconstruction and optic flow with events. We present strategies for improving training data for event-based CNNs that result in a 20-40% boost in performance of existing video reconstruction networks.
arXiv Detail & Related papers (2020-03-20T02:44:29Z)
Event-Based Angular Velocity Regression with Spiking Networks [51.145071093099396]
Spiking Neural Networks (SNNs) process information conveyed as temporal spikes rather than numeric values. We propose, for the first time, a temporal regression problem of numerical values given events from an event camera. We show that we can successfully train an SNN to perform angular velocity regression.
arXiv Detail & Related papers (2020-03-05T17:37:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.