Related papers: PixRO: Pixel-Distributed Rotational Odometry with Gaussian Belief Propagation

URL: http://arxiv.org/abs/2406.09726v1
Date: Fri, 14 Jun 2024 05:28:45 GMT
Title: PixRO: Pixel-Distributed Rotational Odometry with Gaussian Belief Propagation
Authors: Ignacio Alzugaray, Riku Murai, Andrew Davison
Abstract summary: In this paper, we address the task of frame-to-frame rotational estimation. Instead of reasoning about relative motion between frames using the full images, we distribute the estimation at pixel level. In this paradigm, each pixel produces an estimate of the global motion by relying only on local information and local message-passing with neighbouring pixels.
Score: 8.049531918823758
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Visual sensors are not only becoming better at capturing high-quality images but also they have steadily increased their capabilities in processing data on their own on-chip. Yet the majority of VO pipelines rely on the transmission and processing of full images in a centralized unit (e.g. CPU or GPU), which often contain much redundant and low-quality information for the task. In this paper, we address the task of frame-to-frame rotational estimation but, instead of reasoning about relative motion between frames using the full images, distribute the estimation at pixel-level. In this paradigm, each pixel produces an estimate of the global motion by only relying on local information and local message-passing with neighbouring pixels. The resulting per-pixel estimates can then be communicated to downstream tasks, yielding higher-level, informative cues instead of the original raw pixel-readings. We evaluate the proposed approach on real public datasets, where we offer detailed insights about this novel technique and open-source our implementation for the future benefit of the community.
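The per-pixel message-passing idea in the abstract can be illustrated with a toy sketch. The code below is not the PixRO implementation: it assumes each pixel holds a single scalar variable (a stand-in for the rotation), a noisy local measurement `z`, and pairwise smoothness factors linking it to its 4-connected neighbours. The function name `gbp_grid_consensus` and the parameters `meas_prec` and `smooth_prec` are hypothetical. Each pixel only reads messages from its direct neighbours, which is the property the abstract emphasises.

```python
def gbp_grid_consensus(z, meas_prec=1.0, smooth_prec=1.0, iters=200):
    """Toy scalar Gaussian belief propagation on a 4-connected pixel grid.

    z: 2D list of per-pixel scalar measurements (hypothetical local estimates).
    Returns a 2D list with the belief mean held by each pixel.
    """
    h, w = len(z), len(z[0])

    def neighbours(r, c):
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                yield (rr, cc)

    # msgs[(src, dst)] = (eta, lam): information-form message arriving at dst
    # from src through the smoothness factor that links the two pixels.
    msgs = {((r, c), n): (0.0, 0.0)
            for r in range(h) for c in range(w) for n in neighbours(r, c)}

    def local_belief(pixel, exclude=None):
        # Combine the pixel's own measurement with incoming messages,
        # optionally leaving out the message that came from `exclude`.
        r, c = pixel
        eta, lam = meas_prec * z[r][c], meas_prec
        for n in neighbours(r, c):
            if n != exclude:
                e, l = msgs[(n, pixel)]
                eta += e
                lam += l
        return eta, lam

    for _ in range(iters):
        new_msgs = {}
        for (src, dst) in msgs:
            eta, lam = local_belief(src, exclude=dst)
            # Marginalise src out of the factor exp(-0.5 * smooth_prec * (x_src - x_dst)^2).
            scale = smooth_prec / (smooth_prec + lam)
            new_msgs[(src, dst)] = (scale * eta, scale * lam)
        msgs = new_msgs  # synchronous update: all pixels exchange messages at once

    means = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            eta, lam = local_belief((r, c))
            means[r][c] = eta / lam
    return means


if __name__ == "__main__":
    noisy = [[0.9, 1.1, 1.0], [1.2, 0.8, 1.0], [1.0, 1.1, 0.9]]
    for row in gbp_grid_consensus(noisy, smooth_prec=10.0):
        print([round(v, 3) for v in row])
```

Raising `smooth_prec` relative to `meas_prec` pulls the per-pixel estimates closer to a common value, which loosely mirrors every pixel arriving at an estimate of one global quantity.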

Related papers

PixelWorld: Towards Perceiving Everything as Pixels [50.13953243722129]
We propose to unify all modalities (text, tables, code, diagrams, images, etc.) as pixel inputs, i.e. "Perceive Everything as Pixels" (PEAP). We introduce PixelWorld, a novel evaluation suite that unifies all the mentioned modalities into pixel space to gauge the existing models' performance.
arXiv Detail & Related papers (2025-01-31T17:39:21Z)
Parameter-Inverted Image Pyramid Networks [49.35689698870247]
We propose a novel network architecture known as Parameter-Inverted Image Pyramid Networks (PIIP). Our core idea is to use models with different parameter sizes to process different resolution levels of the image pyramid. PIIP achieves superior performance in tasks such as object detection, segmentation, and image classification.
arXiv Detail & Related papers (2024-06-06T17:59:10Z)
Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization. This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts. Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
Superpixel Transformers for Efficient Semantic Segmentation [32.537400525407186]
We propose a solution by leveraging the idea of superpixels, an over-segmentation of the image, and applying them with a modern transformer framework. Our method achieves state-of-the-art performance in semantic segmentation due to the rich superpixel features generated by the global self-attention mechanism.
arXiv Detail & Related papers (2023-09-28T23:09:30Z)
Learn how to Prune Pixels for Multi-view Neural Image-based Synthesis [10.571582038258443]
We present LeHoPP, a method for input pixel pruning. We examine the importance of each input pixel concerning the rendered view, and we avoid the use of irrelevant pixels. Even without retraining the image-based rendering network, our approach shows a good trade-off between synthesis quality and pixel rate.
arXiv Detail & Related papers (2023-05-05T14:29:24Z)
Probabilistic Deep Metric Learning for Hyperspectral Image Classification [91.5747859691553]
This paper proposes a probabilistic deep metric learning framework for hyperspectral image classification. It aims to predict the category of each pixel for an image captured by hyperspectral sensors. Our framework can be readily applied to existing hyperspectral image classification methods.
arXiv Detail & Related papers (2022-11-15T17:57:12Z)
Enhancing Multi-Scale Implicit Learning in Image Super-Resolution with Integrated Positional Encoding [4.781615891172263]
We consider each pixel as the aggregation of signals from a local area in an image super-resolution context. We propose integrated positional encoding (IPE), extending traditional positional encoding by aggregating frequency information over the pixel area. We show the effectiveness of IPE-LIIF by quantitative and qualitative evaluations, and further demonstrate the generalization ability of IPE to larger image scales.
arXiv Detail & Related papers (2021-12-10T06:09:55Z)
A Novel Upsampling and Context Convolution for Image Semantic Segmentation [0.966840768820136]
Recent methods for semantic segmentation often employ an encoder-decoder structure using deep convolutional neural networks. We propose a dense upsampling convolution method based on guided filtering to effectively preserve the spatial information of the image in the network. We report a new record of 82.86% and 81.62% of pixel accuracy on ADE20K and Pascal-Context benchmark datasets, respectively.
arXiv Detail & Related papers (2021-03-20T06:16:42Z)
AINet: Association Implantation for Superpixel Segmentation [82.21559299694555]
We propose a novel Association Implantation (AI) module to enable the network to explicitly capture the relations between the pixel and its surrounding grids. Our method not only achieves state-of-the-art performance but also maintains satisfactory inference efficiency.
arXiv Detail & Related papers (2021-01-26T10:40:13Z)
Personal Privacy Protection via Irrelevant Faces Tracking and Pixelation in Video Live Streaming [61.145467627057194]
We develop a new method called Face Pixelation in Video Live Streaming to generate automatic personal privacy filtering. For fast and accurate pixelation of irrelevant people's faces, FPVLS is organized in a frame-to-video structure of two core stages. On the video live streaming dataset we collected, FPVLS obtains satisfying accuracy, real-time efficiency, and contains the over-pixelation problems.
arXiv Detail & Related papers (2021-01-04T16:18:26Z)
A U-Net Based Discriminator for Generative Adversarial Networks [86.67102929147592]
We propose an alternative U-Net based discriminator architecture for generative adversarial networks (GANs). The proposed architecture provides detailed per-pixel feedback to the generator while maintaining the global coherence of synthesized images. The novel discriminator improves over the state of the art in terms of the standard distribution and image quality metrics.
arXiv Detail & Related papers (2020-02-28T11:16:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.