Efficient data-driven encoding of scene motion using Eccentricity
- URL: http://arxiv.org/abs/2103.02743v1
- Date: Wed, 3 Mar 2021 23:11:21 GMT
- Title: Efficient data-driven encoding of scene motion using Eccentricity
- Authors: Bruno Costa, Enrique Corona, Mostafa Parchami, Gint Puskorius, Dimitar Filev
- Abstract summary: This paper presents a novel approach to representing dynamic visual scenes with static maps generated from video/image streams.
The maps are 2D matrices, calculated recursively in a pixel-wise manner, based on the concept of Eccentricity data analysis.
Potential applications include video-based activity recognition, intent recognition, object tracking, and video description.
- Score: 0.993963191737888
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper presents a novel approach to representing dynamic visual
scenes with static maps generated from video/image streams. Such a
representation allows easy visual assessment of motion in dynamic
environments. These maps are 2D matrices calculated recursively, in a
pixel-wise manner, based on the recently introduced concept of Eccentricity
data analysis. Eccentricity works as a metric of the discrepancy between a
particular pixel of an image and its normality model, calculated in terms of
the mean and variance of past readings of the same spatial region of the
image. While Eccentricity maps carry temporal information about the scene,
the actual images need to be neither stored nor processed in batches. Rather,
all the calculations are done recursively, based on a small amount of
statistical information kept in memory, resulting in a very computationally
efficient (processor- and memory-wise) method. Potential applications include
video-based activity recognition, intent recognition, object tracking, and
video description.
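The recursive, pixel-wise computation described above is straightforward to prototype. Below is a minimal sketch, not the authors' implementation: it assumes a TEDA-style eccentricity in which a constant forgetting factor alpha stands in for the 1/k sample-count term, and the class name EccentricityMap and the parameters alpha and eps are illustrative choices.

```python
import numpy as np

class EccentricityMap:
    """Per-pixel Eccentricity maps from a stream of frames.

    Only two statistics per pixel (running mean and variance) are kept in
    memory, so frames are never stored or processed in batches.
    """

    def __init__(self, alpha=0.05, eps=1e-8):
        self.alpha = alpha  # forgetting factor: weight of the newest frame (assumed parameter)
        self.eps = eps      # guards against division by a zero variance
        self.mean = None    # per-pixel running mean (the normality model)
        self.var = None     # per-pixel running variance

    def update(self, frame):
        """Consume one frame and return its Eccentricity map."""
        frame = frame.astype(np.float64)
        if self.mean is None:
            # Initialize the normality model from the first frame.
            self.mean = frame.copy()
            self.var = np.zeros_like(frame)
            return np.zeros_like(frame)

        # Recursive exponential updates: no past frames are retained.
        self.mean = (1.0 - self.alpha) * self.mean + self.alpha * frame
        diff_sq = (frame - self.mean) ** 2
        self.var = (1.0 - self.alpha) * self.var + self.alpha * diff_sq

        # Eccentricity as the discrepancy between the pixel and its normality
        # model; TEDA-style form with alpha replacing 1/k (an assumption).
        return self.alpha + self.alpha * diff_sq / (self.var + self.eps)
```

A hypothetical usage pattern, with synthetic frames standing in for a real video stream:

```python
rng = np.random.default_rng(0)
ecc = EccentricityMap(alpha=0.05)
for _ in range(100):
    frame = rng.normal(128.0, 5.0, size=(120, 160))  # stand-in for a grayscale frame
    motion_map = ecc.update(frame)  # high values flag pixels deviating from their model
```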
Related papers
- Deep scene-scale material estimation from multi-view indoor captures [9.232860902853048]
We present a learning-based approach that automatically produces digital assets ready for physically-based rendering.
Our method generates approximate material maps in a fraction of the time required by the closest previous solutions.
arXiv Detail & Related papers (2022-11-15T10:58:28Z)
- Neural Groundplans: Persistent Neural Scene Representations from a Single Image [90.04272671464238]
We present a method to map 2D image observations of a scene to a persistent 3D scene representation.
We propose conditional neural groundplans as persistent and memory-efficient scene representations.
arXiv Detail & Related papers (2022-07-22T17:41:24Z)
- ImPosIng: Implicit Pose Encoding for Efficient Camera Pose Estimation [2.6808541153140077]
Implicit Pose Encoding (ImPosing) embeds images and camera poses into a common latent representation with two separate neural networks.
By evaluating candidates through the latent space in a hierarchical manner, the camera position and orientation are not directly regressed but refined.
arXiv Detail & Related papers (2022-05-05T13:33:25Z)
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating the model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
- A Novel Upsampling and Context Convolution for Image Semantic Segmentation [0.966840768820136]
Recent methods for semantic segmentation often employ an encoder-decoder structure using deep convolutional neural networks.
We propose a dense upsampling convolution method based on guided filtering to effectively preserve the spatial information of the image in the network.
We report a new record of 82.86% and 81.62% pixel accuracy on the ADE20K and Pascal-Context benchmark datasets, respectively.
arXiv Detail & Related papers (2021-03-20T06:16:42Z)
- Event-based Motion Segmentation with Spatio-Temporal Graph Cuts [51.17064599766138]
We have developed a method to identify independently moving objects in scenes acquired with an event-based camera.
The method performs on par with or better than the state of the art without having to predetermine the number of expected moving objects.
arXiv Detail & Related papers (2020-12-16T04:06:02Z)
- Cross-Descriptor Visual Localization and Mapping [81.16435356103133]
Visual localization and mapping is the key technology underlying the majority of Mixed Reality and robotics systems.
We present three novel scenarios for localization and mapping which require the continuous update of feature representations.
Our data-driven approach is agnostic to the feature descriptor type, has low computational requirements, and scales linearly with the number of description algorithms.
arXiv Detail & Related papers (2020-12-02T18:19:51Z)
- Self-supervised Video Representation Learning by Uncovering Spatio-temporal Statistics [74.6968179473212]
This paper proposes a novel pretext task to address the self-supervised learning problem.
We compute a series of spatio-temporal statistical summaries, such as the spatial location and dominant direction of the largest motion.
A neural network is built and trained to yield the statistical summaries given the video frames as inputs.
arXiv Detail & Related papers (2020-08-31T08:31:56Z)
- Geometrically Mappable Image Features [85.81073893916414]
Vision-based localization of an agent in a map is an important problem in robotics and computer vision.
We propose a method that learns image features targeted for image-retrieval-based localization.
arXiv Detail & Related papers (2020-03-21T15:36:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.