PixRO: Pixel-Distributed Rotational Odometry with Gaussian Belief Propagation
- URL: http://arxiv.org/abs/2406.09726v2
- Date: Sun, 24 Aug 2025 10:29:45 GMT
- Title: PixRO: Pixel-Distributed Rotational Odometry with Gaussian Belief Propagation
- Authors: Ignacio Alzugaray, Riku Murai, Andrew Davison
- Abstract summary: We propose a novel photometric rotation estimation algorithm to be distributed at pixel level. Each pixel estimates the global motion of the camera by exchanging information with other pixels to achieve global consensus.
- Score: 12.942063363292888
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Images are the standard input for most computer vision algorithms. However, their processing often reduces to parallelizable operations applied locally and independently to individual pixels. Yet, many of these low-level raw pixel readings only provide redundant or noisy information for specific high-level tasks, leading to inefficiencies in both energy consumption during their transmission off-sensor and computational resources in their subsequent processing. As novel sensors featuring advanced in-pixel processing capabilities emerge, we envision a paradigm shift toward performing increasingly complex visual processing directly in-pixel, reducing computational overhead downstream. We advocate for synthesizing high-level cues at the pixel level, enabling their off-sensor transmission to directly support downstream tasks more effectively than raw pixel readings. This paper conceptualizes a novel photometric rotation estimation algorithm to be distributed at pixel level, where each pixel estimates the global motion of the camera by exchanging information with other pixels to achieve global consensus. We employ a probabilistic formulation and leverage Gaussian Belief Propagation (GBP) for decentralized inference using message passing. The proposed technique is evaluated on real-world public datasets and we offer an in-depth analysis of the practicality of applying GBP to distributed rotation estimation at pixel level.
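To make the consensus idea in the abstract concrete, the following is a minimal sketch of Gaussian Belief Propagation on a pixel grid. It is not the authors' algorithm: it assumes a toy scalar, linear-Gaussian problem in which every pixel holds a noisy reading `z` of one shared quantity (standing in for the global camera motion) and is tied to its 4-neighbours by a smoothness factor. The names `grid_edges`, `gbp_consensus`, `direct_solve`, `lam_meas` and `lam_link` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def grid_edges(h, w):
    """4-neighbour links between pixels of an h x w grid (row-major indices)."""
    edges = []
    for r in range(h):
        for c in range(w):
            i = r * w + c
            if c + 1 < w:
                edges.append((i, i + 1))
            if r + 1 < h:
                edges.append((i, i + w))
    return edges


def _belief(i, msgs, nbrs, z, lam_meas, skip=None):
    """Information-form belief (eta, lam) at pixel i, optionally ignoring one neighbour."""
    eta = lam_meas[i] * z[i]
    lam = lam_meas[i]
    for k in nbrs[i]:
        if k != skip:
            eta += msgs[(k, i)][0]
            lam += msgs[(k, i)][1]
    return eta, lam


def gbp_consensus(z, lam_meas, edges, lam_link, iters=100):
    """Synchronous scalar GBP; returns the per-pixel posterior means.

    Assumed toy model: a unary factor 0.5*lam_meas[i]*(x_i - z[i])**2 per pixel
    and a pairwise factor 0.5*lam_link*(x_i - x_j)**2 per link.
    """
    n = len(z)
    nbrs = {i: [] for i in range(n)}
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    # One (eta, lam) message per directed link, initialised to "no information".
    msgs = {(i, j): (0.0, 0.0) for i in range(n) for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in msgs:
            # Marginalise x_i out of the pairwise factor, using i's belief without j's message.
            eta, lam = _belief(i, msgs, nbrs, z, lam_meas, skip=j)
            new[(i, j)] = (lam_link * eta / (lam_link + lam),
                           lam_link * lam / (lam_link + lam))
        msgs = new
    means = np.empty(n)
    for i in range(n):
        eta, lam = _belief(i, msgs, nbrs, z, lam_meas)
        means[i] = eta / lam
    return means


def direct_solve(z, lam_meas, edges, lam_link):
    """Centralised reference: solve the joint linear system A x = b in one shot."""
    A = np.diag(np.asarray(lam_meas, dtype=float))
    for a, b in edges:
        A[a, a] += lam_link
        A[b, b] += lam_link
        A[a, b] -= lam_link
        A[b, a] -= lam_link
    return np.linalg.solve(A, np.asarray(lam_meas, dtype=float) * np.asarray(z, dtype=float))
```

In this toy setting each pixel only ever reads messages from its direct neighbours, yet the means it ends up with match the centralised solution of the same model. Raising `lam_link` relative to `lam_meas` pulls the per-pixel estimates closer to a single common value.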
Related papers
- Exploring Kernel Transformations for Implicit Neural Representations [57.2225355625268]
Implicit neural representations (INRs) leverage neural networks to represent signals by mapping coordinates to their corresponding attributes. This work pioneers the exploration of the effect of kernel transformation of input/output while keeping the model itself unchanged. A byproduct of our findings is a simple yet effective method that combines scale and shift to significantly boost INR with negligible overhead.
arXiv Detail & Related papers (2025-04-07T04:43:50Z) - PixelWorld: Towards Perceiving Everything as Pixels [50.13953243722129]
We propose to unify all modalities (text, tables, code, diagrams, images, etc.) as pixel inputs, i.e. "Perceive Everything as Pixels" (PEAP).
We introduce PixelWorld, a novel evaluation suite that unifies all the mentioned modalities into pixel space to gauge the existing models' performance.
arXiv Detail & Related papers (2025-01-31T17:39:21Z) - Parameter-Inverted Image Pyramid Networks [49.35689698870247]
We propose a novel network architecture known as Parameter-Inverted Image Pyramid Networks (PIIP).
Our core idea is to use models with different parameter sizes to process different resolution levels of the image pyramid.
PIIP achieves superior performance in tasks such as object detection, segmentation, and image classification.
arXiv Detail & Related papers (2024-06-06T17:59:10Z) - Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z) - Superpixel Transformers for Efficient Semantic Segmentation [32.537400525407186]
We propose a solution by leveraging the idea of superpixels, an over-segmentation of the image, and applying them with a modern transformer framework.
Our method achieves state-of-the-art performance in semantic segmentation due to the rich superpixel features generated by the global self-attention mechanism.
arXiv Detail & Related papers (2023-09-28T23:09:30Z) - Learn how to Prune Pixels for Multi-view Neural Image-based Synthesis [10.571582038258443]
We present LeHoPP, a method for input pixel pruning.
We examine the importance of each input pixel concerning the rendered view, and we avoid the use of irrelevant pixels.
Even without retraining the image-based rendering network, our approach shows a good trade-off between synthesis quality and pixel rate.
arXiv Detail & Related papers (2023-05-05T14:29:24Z) - Probabilistic Deep Metric Learning for Hyperspectral Image
Classification [91.5747859691553]
This paper proposes a probabilistic deep metric learning framework for hyperspectral image classification.
It aims to predict the category of each pixel for an image captured by hyperspectral sensors.
Our framework can be readily applied to existing hyperspectral image classification methods.
arXiv Detail & Related papers (2022-11-15T17:57:12Z) - Enhancing Multi-Scale Implicit Learning in Image Super-Resolution with Integrated Positional Encoding [4.781615891172263]
We consider each pixel as the aggregation of signals from a local area in an image super-resolution context.
We propose integrated positional encoding (IPE), extending traditional positional encoding by aggregating frequency information over the pixel area.
We show the effectiveness of IPE-LIIF by quantitative and qualitative evaluations, and further demonstrate the generalization ability of IPE to larger image scales.
arXiv Detail & Related papers (2021-12-10T06:09:55Z) - A photosensor employing data-driven binning for ultrafast image recognition [0.0]
Pixel binning is a technique widely used in optical image acquisition and spectroscopy.
Here, we push the concept of binning to its limit by combining a large fraction of the sensor elements into a single superpixel.
For a given pattern recognition task, its optimal shape is determined from training data using a machine learning algorithm.
arXiv Detail & Related papers (2021-11-20T15:38:39Z) - A Novel Upsampling and Context Convolution for Image Semantic Segmentation [0.966840768820136]
Recent methods for semantic segmentation often employ an encoder-decoder structure using deep convolutional neural networks.
We propose a dense upsampling convolution method based on guided filtering to effectively preserve the spatial information of the image in the network.
We report a new record of 82.86% and 81.62% of pixel accuracy on ADE20K and Pascal-Context benchmark datasets, respectively.
arXiv Detail & Related papers (2021-03-20T06:16:42Z) - AINet: Association Implantation for Superpixel Segmentation [82.21559299694555]
We propose a novel Association Implantation (AI) module to enable the network to explicitly capture the relations between the pixel and its surrounding grids.
Our method could not only achieve state-of-the-art performance but maintain satisfactory inference efficiency.
arXiv Detail & Related papers (2021-01-26T10:40:13Z) - Personal Privacy Protection via Irrelevant Faces Tracking and Pixelation in Video Live Streaming [61.145467627057194]
We develop a new method called Face Pixelation in Video Live Streaming to generate automatic personal privacy filtering.
For fast and accurate pixelation of irrelevant people's faces, FPVLS is organized in a frame-to-video structure of two core stages.
On the video live streaming dataset we collected, FPVLS achieves satisfactory accuracy and real-time efficiency, and curbs the over-pixelation problem.
arXiv Detail & Related papers (2021-01-04T16:18:26Z) - Fully Embedding Fast Convolutional Networks on Pixel Processor Arrays [16.531637803429277]
We present a novel method of CNN inference for pixel processor array (PPA) vision sensors.
Our approach can perform convolutional layers, max pooling, ReLU, and a final fully connected layer entirely upon the PPA sensor.
This is the first work demonstrating CNN inference conducted entirely upon the processor array of a PPA vision sensor device, requiring no external processing.
arXiv Detail & Related papers (2020-04-27T01:00:35Z) - A U-Net Based Discriminator for Generative Adversarial Networks [86.67102929147592]
We propose an alternative U-Net based discriminator architecture for generative adversarial networks (GANs).
The proposed architecture provides detailed per-pixel feedback to the generator while maintaining the global coherence of synthesized images.
The novel discriminator improves over the state of the art in terms of the standard distribution and image quality metrics.
arXiv Detail & Related papers (2020-02-28T11:16:54Z) - Robust superpixels using color and contour features along linear path [5.746869663956391]
We propose a framework that provides accurate and regular Superpixels with Contour Adherence using Linear Path (SCALP). A contour prior is also used to prevent the crossing of image boundaries when associating a pixel to a superpixel. SCALP is extensively evaluated on a standard segmentation dataset, and the obtained results outperform those of state-of-the-art methods.
arXiv Detail & Related papers (2019-03-17T23:00:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.